UnderstandingDry1256
UnderstandingDry1256 t1_j5c0y0o wrote
Reply to [D] Simple Questions Thread by AutoModerator
What training strategies are used for GPT models? Are transformer blocks or layers trained independently? Are they trained on some subset of the data and then fine-tuned?
I would appreciate any references or details :)
UnderstandingDry1256 t1_j8ev9bx wrote
Reply to [R] [N] Toolformer: Language Models Can Teach Themselves to Use Tools - paper by Meta AI Research by radi-cho
An obvious idea is to connect GPT to a browser API and let it go and learn 😄