Submitted by Dr_Singularity t3_xu0oos in singularity
Nmanga90 t1_iqt8fju wrote
Holy shit, a 540B LLM. That's like 3 times the size of GPT-3. Why are the authors anonymous? There are only a few orgs this could realistically be.
CommentBot01 t1_iqtizzd wrote
Maybe the author is an LLM :)
GoodToKnowYouAll t1_iqx7eyd wrote
😳
manOnPavementWaving t1_iqtkra0 wrote
Actually, we know what the LM is: it's PaLM, developed by Google under Jeff Dean.
Anonymous peer review is a fucking joke
2Punx2Furious t1_iqv2rvn wrote
I mean, in this case it's obvious, but usually it's not that easy to guess who the authors are.
manOnPavementWaving t1_iqv3mmi wrote
It's in the authors' best interests to show off who they are; misaligning that tends to just result in subtly cheating the system.
Peer review in AI has become less and less important anyway; trial by Twitter tends to perform much better.
Tavrin t1_iqtombh wrote
It's anonymous for double peer reviewing (to try to prevent reviewer bias), but like someone said, it's probably PaLM since the model is the same size, so the authors are probably from Google.
2Punx2Furious t1_iqv2q9n wrote
> double peer reviewing
Wasn't it called "double blind"? (I'm not a researcher).
space_spider t1_iqum8oo wrote
This is close to Nvidia's Megatron-Turing NLG parameter count (530B): https://developer.nvidia.com/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/
It's also the same size as PaLM (540B): https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html?m=1
This approach (chain of thought) has been discussed for a few months at least, so I think this could be a legit paper from Nvidia or Google. A minimal sketch of what chain-of-thought prompting looks like is below.
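For anyone unfamiliar with the technique, here is a minimal Python sketch of chain-of-thought prompting: a few-shot exemplar that spells out its intermediate reasoning, contrasted with one that gives only the final answer. The worked example and the `query_model` stub are illustrative assumptions, not code from the paper.

```python
# Minimal sketch of chain-of-thought prompting (illustrative only).
# The exemplar question and `query_model` stub are placeholders, not
# anything taken from the paper discussed in this thread.

# Standard few-shot prompt: the exemplar shows only the final answer.
standard_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: 11\n\n"
    "Q: A cafeteria had 23 apples. They used 20 and bought 6 more. "
    "How many apples do they have?\n"
    "A:"
)

# Chain-of-thought prompt: the exemplar also spells out the intermediate
# reasoning steps, which the model is nudged to imitate before answering.
cot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: Roger starts with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
    "Q: A cafeteria had 23 apples. They used 20 and bought 6 more. "
    "How many apples do they have?\n"
    "A:"
)

def query_model(prompt: str) -> str:
    """Placeholder for a call to whatever LLM you have access to."""
    raise NotImplementedError("plug in your model or API of choice here")

# With the chain-of-thought prompt, the completion would ideally read like:
# "They started with 23 apples. 23 - 20 = 3. 3 + 6 = 9. The answer is 9."
```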