
starstruckmon t1_irmabdw wrote

We'd have already had an open-source GPT competitor in BLOOM (same number of parameters as GPT-3, and open source / open model) if they hadn't decided to virtue signal. They trained it on too many diverse languages and sources, and the model came out an idiot (it significantly underperforms GPT-3 on almost all metrics).

7

AsthmaBeyondBorders t1_irnnljm wrote

Virtue signaling, or maybe they just thought it would work out. Or maybe the objective was to be decent in many languages rather than as good as GPT-3, and this wasn't really a surprise to anyone, because they wanted to research how the same model behaves in different languages with different grammar rules. Or maybe it was never meant to be a final product and they needed to test it before deciding what to do with their next models. Or maybe they thought coming up with a copy of another AI would be less relevant than coming up with an AI that has something different to offer. I think virtue signaling is at the bottom of the list of possibilities here, dude.

6

starstruckmon t1_irnsty3 wrote

Depends on your definition of it. There's definitely a bit of this

>Or maybe they thought coming up with a copy of another AI would be less relevant than coming up with an AI that has something different to offer.

but another reason was that it's easier to get funding (in the form of compute, in this case) from public institutions when there's a virtue-signaling angle.

2

Akimbo333 OP t1_irmjuer wrote

Oh wow, lol! Though out of curiosity, how did the different languages mess it up?

2

starstruckmon t1_irmlck4 wrote

While this is the current consensus (they went too broad), it's still a guess. These models are all black boxes, so we can't say for certain.

Basically, a "jack of all trades , master of none" type issue. As we all know from the Chinchilla paper, current models are already severely undertrained data-wise. They went even further by having even less data per language , even if the total dataset was comparable to GPT3.

6

Akimbo333 OP t1_irmo2a2 wrote

Oh damn! Oh hey, you should look into Sparrow, it's pretty good at being a jack of all trades lol!

https://youtu.be/dt9rv-Pf0b0

3

starstruckmon t1_irmt5ng wrote

It's not open to anyone. He's putting on a show by recreating examples from their paper.

It's basically a variation of Chinchilla (smaller than GPT-3, roughly 70B parameters versus 175B, but it performs better since it was trained adequately data-wise), fine-tuned to be more aligned, like how they modded GPT-3 into the current InstructGPT variation.
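
For anyone curious what that "fine-tuned to be more aligned" step looks like mechanically, here's a minimal toy sketch of the supervised part (fine-tuning on human-written demonstrations). The model and data are stand-ins, since neither Chinchilla nor Sparrow is publicly available:

```python
# Minimal sketch of InstructGPT-style supervised fine-tuning on a toy model.
import torch
import torch.nn as nn

vocab_size, d_model = 1000, 64

# Stand-in for a pretrained decoder-only LM (causal masking omitted for brevity).
model = nn.Sequential(
    nn.Embedding(vocab_size, d_model),
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
    nn.Linear(d_model, vocab_size),
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
loss_fn = nn.CrossEntropyLoss(ignore_index=-100)

# One (prompt, demonstration) pair, already tokenised. The loss is masked so
# only the human-written response tokens contribute (labels = -100 on the prompt).
prompt = torch.randint(0, vocab_size, (1, 8))
response = torch.randint(0, vocab_size, (1, 8))
inputs = torch.cat([prompt, response], dim=1)
labels = torch.cat([torch.full_like(prompt, -100), response], dim=1)

logits = model(inputs)                                   # (1, seq, vocab)
# Next-token prediction: shift logits left, labels right.
loss = loss_fn(logits[:, :-1].reshape(-1, vocab_size), labels[:, 1:].reshape(-1))
loss.backward()
optimizer.step()
```

The actual recipe also adds a reinforcement-learning step against a learned preference / rule-violation model on top of this, but the supervised stage above is the core of the "mod".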

It's not really a jack of all trades in that sense, since it was trained on a dataset similar to GPT-3's, mostly English text.

Most of the new models we'll be seeing (like the topic of this post) will definitely be following this path.

3