Submitted by Akimbo333 t3_xz8v3f in singularity
starstruckmon t1_irmabdw wrote
We'd have already had an open-source GPT competitor in BLOOM (same parameter count as GPT-3, and fully open source / open model) if they hadn't decided to virtue signal. They trained it on too many diverse languages and sources and the model came out an idiot (it significantly underperforms GPT-3 on almost all metrics).
AsthmaBeyondBorders t1_irnnljm wrote
Virtue signaling, or maybe they just thought it would work out. Or maybe the objective was to be decent in many languages rather than as good as GPT-3, and the result wasn't really a surprise to anyone, because they wanted to research how the same model behaves across languages with different grammar rules. Or maybe it was never meant to be a final product and they needed to test it before deciding what to do with their next models. Or maybe they thought a copy of another AI would be less relevant than an AI that has something different to offer. I think virtue signaling is at the bottom of the list of possibilities here, dude.
starstruckmon t1_irnsty3 wrote
Depends on your definition of it. There's definitely a bit of this:
>Or maybe they thought a copy of another AI would be less relevant than an AI that has something different to offer.
but another reason was that it's easier to get funding (in the form of compute, in this case) from public institutions when there's a virtue-signaling angle.
Akimbo333 OP t1_irmjuer wrote
Oh wow lol! Though out of curiosity, how did the different languages mess it up?
starstruckmon t1_irmlck4 wrote
While this is the current consensus (they went too broad), it's still a guess. These models are all black boxes, so we can't say anything for certain.

Basically, it's a "jack of all trades, master of none" type of issue. As we know from the Chinchilla paper, current models are already severely undertrained data-wise. They went even further by having even less data per language, even though the total dataset was comparable in size to GPT-3's.
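To put rough numbers on it (back-of-the-envelope figures only, not exact dataset stats): the Chinchilla rule of thumb is roughly 20 training tokens per parameter, and splitting a GPT-3-sized token budget across dozens of languages leaves comparatively little per language.

```python
# Back-of-the-envelope illustration (rough, approximate figures):
# Chinchilla's rule of thumb is ~20 training tokens per parameter,
# and splitting a GPT-3-sized token budget across ~46 languages
# dilutes the data each individual language sees.

def chinchilla_optimal_tokens(params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate compute-optimal token count per the Chinchilla scaling law."""
    return params * tokens_per_param

gpt3_params  = 175e9   # GPT-3 parameter count
bloom_params = 176e9   # BLOOM parameter count
bloom_tokens = 350e9   # approximate total BLOOM training tokens
bloom_langs  = 46      # natural languages in BLOOM's training corpus (plus code)

print(f"Chinchilla-optimal for ~176B params: {chinchilla_optimal_tokens(bloom_params) / 1e12:.1f}T tokens")
print(f"BLOOM's actual token budget:         {bloom_tokens / 1e9:.0f}B tokens (roughly 10x short)")
print(f"Naive per-language average:          {bloom_tokens / bloom_langs / 1e9:.1f}B tokens per language")
```

The real split isn't uniform (English is by far the biggest slice), but the point stands: most languages end up with a small fraction of the data GPT-3 got for English alone.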
Akimbo333 OP t1_irmo2a2 wrote
Oh damn! Hey, you should look into Sparrow, it's pretty good at being a jack of all trades lol!
starstruckmon t1_irmt5ng wrote
Sparrow isn't open to anyone; he's just putting on a show by recreating examples from their paper.
It's basically a variant of Chinchilla (smaller than GPT-3, with roughly 40% of the parameters, but it performs better since it was trained adequately data-wise) fine-tuned to be more aligned, like how they modded GPT-3 into the current InstructGPT variant (rough sketch of that recipe below).
It's not really a jack of all trades in that sense, since it was trained on a dataset similar to GPT-3's, mostly English text.

Most of the new models we'll be seeing (like the topic of this post) will definitely be following this path.
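For intuition, here's a minimal sketch of the supervised fine-tuning step that kicks off that InstructGPT-style alignment recipe (the full recipe also adds a reward model and RL). It uses gpt2 as a stand-in, since neither Chinchilla nor Sparrow weights are public, and the demonstration data below is made up for illustration.

```python
# Minimal sketch of the supervised fine-tuning ("instruction tuning") stage
# of an InstructGPT-style alignment recipe. gpt2 is a stand-in model;
# the demonstration pair is hypothetical.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical demonstration data: (prompt, human-written response) pairs.
demonstrations = [
    ("Explain the Chinchilla scaling laws in one sentence.",
     "For a fixed compute budget, model size and training tokens should scale "
     "together, roughly 20 tokens per parameter."),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for prompt, response in demonstrations:
    batch = tokenizer(prompt + "\n" + response, return_tensors="pt", truncation=True)
    # Standard causal language-modeling loss over the concatenated prompt + response.
    outputs = model(**batch, labels=batch["input_ids"])
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The actual InstructGPT (and Sparrow) recipe then trains a reward model on human preference data and optimizes the fine-tuned model against it with RL, but it starts from this kind of supervised pass over human demonstrations.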