SodomizedPanda t1_j9jyhem wrote on February 22, 2023 at 2:46 PM
Reply to comment by activatedgeek in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
And somehow, the best answer is at the bottom of the thread.
A small addition: recent research suggests that the implicit bias in DNNs that helps generalization lies not only in the structure of the network but also in the learning algorithm (Adam, SGD, ...). https://francisbach.com/rethinking-sgd-noise/ https://francisbach.com/implicit-bias-sgd/
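To make "the learning algorithm has its own bias" concrete, here is a toy sketch. It is not taken from the linked posts, and it uses plain full-batch gradient descent on a single linear equation with three unknowns rather than SGD or Adam on a real network. The problem has infinitely many exact solutions, yet gradient descent started at zero lands on the minimum-norm one, while a different starting point (chosen arbitrarily here) gives a different, larger-norm solution. The values of `a`, `y`, `lr`, and `steps` are all made up for illustration.

```python
def gradient_descent(a, y, w0, lr=0.1, steps=200):
    """Minimize 0.5 * (a . w - y)^2 with plain gradient descent from w0."""
    w = list(w0)
    for _ in range(steps):
        residual = sum(ai * wi for ai, wi in zip(a, w)) - y
        # Gradient of the loss w.r.t. w is residual * a.
        w = [wi - lr * residual * ai for ai, wi in zip(a, w)]
    return w


def norm(w):
    """Euclidean norm of a vector given as a list."""
    return sum(wi * wi for wi in w) ** 0.5


def predict(a, w):
    """Linear model output a . w."""
    return sum(ai * wi for ai, wi in zip(a, w))


# One equation, three unknowns: every w with a . w = y fits perfectly.
a, y = [1.0, 2.0, 2.0], 3.0

# Closed-form minimum-norm solution: a * y / ||a||^2.
a_sq = sum(ai * ai for ai in a)
w_min_norm = [ai * y / a_sq for ai in a]

# Same loss, same optimizer, two different starting points.
w_from_zero = gradient_descent(a, y, [0.0, 0.0, 0.0])
w_from_other = gradient_descent(a, y, [1.0, 0.0, 0.0])

print("min-norm solution:", w_min_norm)
print("GD from zero:     ", w_from_zero, "norm =", norm(w_from_zero))
print("GD from [1, 0, 0]:", w_from_other, "norm =", norm(w_from_other))
```

Both runs fit the data exactly, so the loss alone cannot tell them apart. Which solution you end up with is decided by the optimizer and its initialization, which is the simplest version of the point above.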