Nmanga90 t1_iqt8fju wrote

Holy shit, a 540B LLM. That's about 3 times the size of GPT-3. Why are the authors anonymous? There are only a few orgs this could realistically be.
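
(For scale: GPT-3 is 175B parameters, so 540B / 175B ≈ 3.1, i.e. roughly 3x.)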

41

manOnPavementWaving t1_iqtkra0 wrote

Actually, we know what the LM is: it's PaLM, developed by Google under Jeff Dean.

Anonymous peer review is a fucking joke

26

2Punx2Furious t1_iqv2rvn wrote

I mean, in this case it's obvious, but usually it's not that easy to guess who the authors are.

3

manOnPavementWaving t1_iqv3mmi wrote

It's in the authors' best interest to show off who they are; misaligning that tends to just result in subtly cheating the system.

Peer review in AI has been getting less and less important though; trial by Twitter tends to perform much better.

4

Tavrin t1_iqtombh wrote

It's anonymous for double peer reviewing (to try to prevent reviewer bias), but like someone said, it's probably PaLM since the model is the same size, so the authors are likely from Google.

18

2Punx2Furious t1_iqv2q9n wrote

> double peer reviewing

Wasn't it called "double blind"? (I'm not a researcher).

2

space_spider t1_iqum8oo wrote

This is close to Nvidia's Megatron-Turing NLG parameter count (530B): https://developer.nvidia.com/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/

It's also the same size as PaLM: https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html?m=1

This approach (chain of thought) has been discussed for at least a few months, so I think this could be a legit paper from Nvidia or Google.
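
For anyone who hasn't seen it, chain-of-thought prompting just means the few-shot exemplars spell out the intermediate reasoning before the final answer, instead of giving the answer alone. Here's a minimal sketch of the prompt construction; the `complete` function is a hypothetical stand-in for whatever model API you'd actually call, not a real library:

```python
# Minimal sketch of chain-of-thought prompting.
# The key idea: the few-shot exemplar includes the intermediate reasoning,
# so the model is nudged to reason step by step before answering.

COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 "
    "tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend a worked exemplar so the model imitates step-by-step reasoning."""
    return COT_EXEMPLAR + f"Q: {question}\nA:"

def complete(prompt: str) -> str:
    # Hypothetical stand-in for an LLM text-completion call (not a real API).
    raise NotImplementedError

if __name__ == "__main__":
    prompt = build_cot_prompt(
        "The cafeteria had 23 apples. They used 20 and bought 6 more. "
        "How many apples do they have?"
    )
    print(prompt)  # inspect the constructed prompt
    # answer = complete(prompt)  # would ideally end with "The answer is 9."
```

The exemplar above is the tennis-ball one from the chain-of-thought paper; the whole trick is in the prompt text, not in any change to the model itself.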

7