Submitted by GraciousReformer t3_118pof6 in MachineLearning
SodomizedPanda t1_j9jyhem wrote
Reply to comment by activatedgeek in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
And somehow, the best answer is at the bottom of the thread...
A small addition: recent research suggests that the implicit bias in DNNs that helps generalization lies not only in the structure of the network but also in the learning algorithm (Adam, SGD, ...). https://francisbach.com/rethinking-sgd-noise/ https://francisbach.com/implicit-bias-sgd/
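A classic, easy-to-run instance of this kind of algorithmic implicit bias (my own illustration, not taken from the linked posts): on an underdetermined least-squares problem with infinitely many interpolating solutions, plain gradient descent initialized at zero converges to the minimum-norm one. Nothing in the loss prefers that solution; the preference comes from the algorithm.

```python
import numpy as np

# Underdetermined least squares: 2 equations, 3 unknowns,
# so infinitely many weight vectors fit the data exactly.
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
y = np.array([1.0, 2.0])

# Plain gradient descent on 0.5 * ||Xw - y||^2, initialized at zero.
w = np.zeros(3)
lr = 0.1
for _ in range(5000):
    w -= lr * X.T @ (X @ w - y)

# Starting from zero, the iterates never leave the row space of X,
# so GD converges to the minimum-norm interpolant
# X^T (X X^T)^{-1} y -- an implicit bias of the optimizer,
# not a property of the loss itself.
min_norm = X.T @ np.linalg.solve(X @ X.T, y)
print(w)         # ~ [0, 1, 1]
print(min_norm)  # ~ [0, 1, 1]
```

Different optimizers (SGD with noise, Adam) impose different such preferences, which is exactly the point the linked posts elaborate on.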
red75prime t1_j9k0i84 wrote
Does in-context learning suggest that inductive biases could also be extracted from training data?
activatedgeek t1_j9k4z4o wrote
Very much indeed. See https://arxiv.org/abs/2205.05055
activatedgeek t1_j9k58ev wrote
Not only the dataset: the Transformer architecture itself also seems amenable to in-context learning. See https://arxiv.org/abs/2209.11895
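The induction-head mechanism studied in that paper can be caricatured in a few lines: to predict the next token, find the most recent earlier occurrence of the current token and copy whatever followed it. A toy sketch of that match-and-copy behavior (purely illustrative; real induction heads implement this with attention, not a loop):

```python
# Toy caricature of an "induction head": look up the most recent
# earlier occurrence of the current token and copy its successor.
def induction_predict(tokens):
    current = tokens[-1]
    # Scan backwards over earlier positions for a matching token.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]
    return None  # no earlier occurrence: nothing to copy

# After seeing "A B C A", the mechanism predicts "B" next,
# completing a pattern present only in the context.
print(induction_predict(["A", "B", "C", "A"]))  # -> B
```

That "learn the rule from the prompt itself" behavior is one candidate mechanism for how in-context learning emerges in Transformers.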