Submitted by GraciousReformer t3_118pof6 in MachineLearning
SodomizedPanda t1_j9jyhem wrote
Reply to comment by activatedgeek in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
And somehow, the best answer is at the bottom of the thread...
A small addition: recent research suggests that the implicit bias in DNNs that helps generalization lies not only in the structure of the network but also in the learning algorithm (Adam, SGD, ...). https://francisbach.com/rethinking-sgd-noise/ https://francisbach.com/implicit-bias-sgd/
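A classic, easy-to-run instance of this kind of algorithmic implicit bias (my own illustration, not taken from the linked posts): on an underdetermined least-squares problem with infinitely many interpolating solutions, plain gradient descent initialized at zero converges to the minimum-norm one. Nothing in the loss prefers that solution; the preference comes from the algorithm.

```python
import numpy as np

# Underdetermined least squares: 2 equations, 3 unknowns,
# so infinitely many weight vectors fit the data exactly.
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
y = np.array([1.0, 2.0])

# Plain gradient descent on 0.5 * ||Xw - y||^2, initialized at zero.
w = np.zeros(3)
lr = 0.1
for _ in range(5000):
    w -= lr * X.T @ (X @ w - y)

# Starting from zero, the iterates never leave the row space of X,
# so GD converges to the minimum-norm interpolant
# X^T (X X^T)^{-1} y -- an implicit bias of the optimizer,
# not a property of the loss itself.
min_norm = X.T @ np.linalg.solve(X @ X.T, y)
print(w)         # ~ [0, 1, 1]
print(min_norm)  # ~ [0, 1, 1]
```

Different optimizers (SGD with noise, Adam) impose different such preferences, which is exactly the point the linked posts elaborate on.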
red75prime t1_j9k0i84 wrote
Does in-context learning suggest that inductive biases could also be extracted from training data?
activatedgeek t1_j9k4z4o wrote
Very much indeed. See https://arxiv.org/abs/2205.05055
activatedgeek t1_j9k58ev wrote
Not only the dataset: the Transformer architecture itself also seems amenable to in-context learning. See https://arxiv.org/abs/2209.11895
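The induction-head mechanism studied in that paper can be caricatured in a few lines: to predict the next token, find the most recent earlier occurrence of the current token and copy whatever followed it. A toy sketch of that match-and-copy behavior (purely illustrative; real induction heads implement this with attention, not a loop):

```python
# Toy caricature of an "induction head": look up the most recent
# earlier occurrence of the current token and copy its successor.
def induction_predict(tokens):
    current = tokens[-1]
    # Scan backwards over earlier positions for a matching token.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]
    return None  # no earlier occurrence: nothing to copy

# After seeing "A B C A", the mechanism predicts "B" next,
# completing a pattern present only in the context.
print(induction_predict(["A", "B", "C", "A"]))  # -> B
```

That "learn the rule from the prompt itself" behavior is one candidate mechanism for how in-context learning emerges in Transformers.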