Submitted by 47153 t3_10a3fwn in deeplearning
currentscurrents t1_j44pu0u wrote
Is it though? These days it seems like even a lot of research papers are just "we stuck together a bunch of PyTorch components like Lego blocks" or "we fed a transformer model a bunch of data".
Math is important if you want to invent new kinds of neural networks, but for end users it doesn't seem very important.
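A minimal sketch of that "Lego blocks" workflow, assuming PyTorch; the layer sizes and the MNIST-style input shape are made up for illustration. A working classifier gets assembled entirely from off-the-shelf modules, with no new math involved:

```python
import torch
import torch.nn as nn

# Hypothetical example: a classifier built purely from stock
# PyTorch "Lego blocks" -- no custom layers, no derivations.
model = nn.Sequential(
    nn.Flatten(),            # 1x28x28 image -> 784-dim vector
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(128, 10),      # logits for 10 classes
)

x = torch.randn(4, 1, 28, 28)  # dummy batch of 4 grayscale images
logits = model(x)
print(logits.shape)            # torch.Size([4, 10])
```

From here, training is equally off-the-shelf: a stock optimizer and `nn.CrossEntropyLoss`, with no point where the user needs to touch the underlying calculus.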
EmployeeOne817 t1_j45v4j9 wrote
Hah, sticking things together will only get you so far. True innovation and improvement of existing solutions come from a fundamental understanding of these theoretical concepts.
UpperCut95 t1_j46b1rt wrote
Totally UNDERRATED.
The whole research industry is chasing x% performance gains while the training/compute/energy costs increase by 10x%.
Aiming for efficiency and interpretability would be a better direction.
But meh.
derpderp3200 t1_j45pioz wrote
I imagine it's important when you're theorycrafting about whether a novel architecture will be able to propagate gradients in a way that facilitates learning, but yeah, for the most part it seems to be about intuition and copying successful approaches more than anything.
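That kind of gradient-propagation question can also be checked empirically rather than derived. A hedged sketch, assuming PyTorch: a deep stack of plain Linear+Tanh layers stands in for the "novel" architecture, and comparing per-layer gradient norms after one backward pass shows whether the early layers still receive a usable learning signal:

```python
import torch
import torch.nn as nn

# Hypothetical gradient-flow check: a 20-block Linear+Tanh stack
# (no skip connections) as a stand-in for a candidate architecture.
torch.manual_seed(0)
layers = []
for _ in range(20):
    layers += [nn.Linear(64, 64), nn.Tanh()]
model = nn.Sequential(*layers)

x = torch.randn(8, 64)
loss = model(x).pow(2).mean()  # dummy loss, just to get gradients
loss.backward()

first = model[0].weight.grad.norm().item()   # earliest Linear layer
last = model[-2].weight.grad.norm().item()   # final Linear layer
print(f"first-layer grad norm: {first:.2e}, last-layer: {last:.2e}")
```

If the first-layer norm is orders of magnitude below the last-layer norm, gradients are vanishing through the stack; repeating the check with residual connections added is a quick way to compare candidate designs without any formal analysis.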