
starstruckmon t1_irmlck4 wrote

While this is the current consensus (they went too broad), it's still a guess. These are all black boxes, so we can't say for certain.

Basically, a "jack of all trades , master of none" type issue. As we all know from the Chinchilla paper, current models are already severely undertrained data-wise. They went even further by having even less data per language , even if the total dataset was comparable to GPT3.

6

Akimbo333 OP t1_irmo2a2 wrote

Oh damn! Hey, you should look into Sparrow, it's pretty good at being a jack of all trades lol!

https://youtu.be/dt9rv-Pf0b0

3

starstruckmon t1_irmt5ng wrote

It's not open to anyone. He's putting on a show by recreating examples from their paper.

It's basically Chinchilla (smaller than GPT3, with 70B parameters to GPT3's 175B, but it performs better since it was trained on an adequate amount of data) fine-tuned to be more aligned, similar to how OpenAI turned GPT3 into the current InstructGPT variant.
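The parameter/data trade-off in the same back-of-envelope style (the 70B/1.4T and 175B/300B figures are the published Chinchilla and GPT3 numbers; the comparison itself is just arithmetic):

```python
# Tokens seen per parameter: Chinchilla (70B params, 1.4T tokens)
# vs. GPT3 (175B params, 300B tokens), using the published figures.
models = {"Chinchilla": (70e9, 1.4e12), "GPT3": (175e9, 300e9)}
for name, (params, tokens) in models.items():
    print(f"{name}: {tokens / params:.1f} tokens per parameter")
# Chinchilla: 20.0 tokens per parameter
# GPT3: 1.7 tokens per parameter
```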

It's not really a jack of all trades in that sense, since it was trained on a dataset similar to GPT3's, mostly English text.

Most of the new models we'll be seeing (like the topic of this post) will definitely be following this path.

3