Submitted by 47153 t3_10a3fwn in deeplearning
currentscurrents t1_j44pu0u wrote
Is it though? These days it seems like even a lot of research papers are just "we stuck together a bunch of PyTorch components like Lego blocks" or "we fed a transformer model a bunch of data".
Math is important if you want to invent new kinds of neural networks, but for end users it doesn't seem very important.
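A minimal sketch of that "Lego blocks" workflow, assuming PyTorch; the layer sizes and the MNIST-style input shape are made up for illustration. A working classifier gets assembled entirely from off-the-shelf modules, with no new math involved:

```python
import torch
import torch.nn as nn

# Hypothetical example: a classifier built purely from stock
# PyTorch "Lego blocks" -- no custom layers, no derivations.
model = nn.Sequential(
    nn.Flatten(),            # 1x28x28 image -> 784-dim vector
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(128, 10),      # logits for 10 classes
)

x = torch.randn(4, 1, 28, 28)  # dummy batch of 4 grayscale images
logits = model(x)
print(logits.shape)            # torch.Size([4, 10])
```

From here, training is equally off-the-shelf: a stock optimizer and `nn.CrossEntropyLoss`, with no point where the user needs to touch the underlying calculus.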
EmployeeOne817 t1_j45v4j9 wrote
Hah, sticking things together will only get you so far. True innovation and improvement of existing solutions come from a fundamental understanding of these theoretical concepts.
UpperCut95 t1_j46b1rt wrote
Totally UNDERRATED.
The whole research industry is chasing x% performance gains while the training/compute/energy costs increase by 10x%.
Aiming for efficiency and interpretability would be a better direction.
But meh.
derpderp3200 t1_j45pioz wrote
I imagine it's important when you're theorycrafting about whether a novel architecture will be able to propagate gradients in a way that facilitates learning, but yeah, for the most part it seems to be about intuition and copying successful approaches more than anything.
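That kind of gradient-propagation question can also be checked empirically rather than derived. A hedged sketch, assuming PyTorch: a deep stack of plain Linear+Tanh layers stands in for the "novel" architecture, and comparing per-layer gradient norms after one backward pass shows whether the early layers still receive a usable learning signal:

```python
import torch
import torch.nn as nn

# Hypothetical gradient-flow check: a 20-block Linear+Tanh stack
# (no skip connections) as a stand-in for a candidate architecture.
torch.manual_seed(0)
layers = []
for _ in range(20):
    layers += [nn.Linear(64, 64), nn.Tanh()]
model = nn.Sequential(*layers)

x = torch.randn(8, 64)
loss = model(x).pow(2).mean()  # dummy loss, just to get gradients
loss.backward()

first = model[0].weight.grad.norm().item()   # earliest Linear layer
last = model[-2].weight.grad.norm().item()   # final Linear layer
print(f"first-layer grad norm: {first:.2e}, last-layer: {last:.2e}")
```

If the first-layer norm is orders of magnitude below the last-layer norm, gradients are vanishing through the stack; repeating the check with residual connections added is a quick way to compare candidate designs without any formal analysis.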