
currentscurrents t1_j44pu0u wrote

Is it though? These days it seems like even a lot of research papers are just "we stuck together a bunch of PyTorch components like Lego blocks" or "we fed a transformer model a bunch of data".

Math is important if you want to invent new kinds of neural networks, but for end users it doesn't seem very important.
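
To be concrete about what I mean by "Lego blocks": a toy sketch like the one below, where every piece is a stock PyTorch module and all the sizes are made-up placeholders, is basically what a lot of applied work looks like.

```python
import torch
import torch.nn as nn

class LegoClassifier(nn.Module):
    """Stock PyTorch blocks snapped together; all sizes are arbitrary placeholders."""

    def __init__(self, vocab_size=30_000, d_model=256, num_classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=4)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, token_ids):
        x = self.embed(token_ids)   # (batch, seq) -> (batch, seq, d_model)
        x = self.encoder(x)         # contextualized token representations
        x = x.mean(dim=1)           # mean-pool over the sequence
        return self.head(x)         # class logits

model = LegoClassifier()
logits = model(torch.randint(0, 30_000, (2, 16)))  # dummy batch of token ids
```

None of that needs any math beyond knowing what shapes go in and out of each block.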

7

EmployeeOne817 t1_j45v4j9 wrote

Hah, sticking things together will only get you so far. True innovation and improvement of existing solutions come from a fundamental understanding of these theoretical concepts.

6

UpperCut95 t1_j46b1rt wrote

Totally UNDERRATED.

The whole research industry is chasing the x% performance gain while the training/compute/energy costs increase by 10x%.

Aiming for efficiency and interpretability would be a better way forward.

But meh.

1

derpderp3200 t1_j45pioz wrote

I imagine it's important when you're theorycrafting about whether a novel architecture will be able to propagate gradients in a way that facilitates learning, but yeah, for the most part it seems to be more about intuition and copying successful approaches than anything else.
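
And even that gradient question often gets answered empirically rather than on paper. A rough sketch of the kind of check I mean (the deep Tanh stack is just a stand-in for "some novel architecture", not anyone's actual model): push a dummy batch through, backprop, and look at per-layer gradient norms.

```python
import torch
import torch.nn as nn

# Rough empirical check: near-zero grad norms in early layers suggest vanishing
# gradients; huge norms suggest instability. The deep Tanh stack is a placeholder.
model = nn.Sequential(*[nn.Sequential(nn.Linear(64, 64), nn.Tanh()) for _ in range(20)])

x = torch.randn(32, 64)
loss = model(x).pow(2).mean()
loss.backward()

for name, param in model.named_parameters():
    if param.grad is not None:
        print(f"{name:30s} grad norm = {param.grad.norm().item():.3e}")
```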

0