Submitted by Dr_Singularity t3_xu0oos in singularity
space_spider t1_iqum8oo wrote
Reply to comment by Nmanga90 in Large Language Models Can Self-improve by Dr_Singularity
This is close to the parameter count of Nvidia's Megatron-Turing NLG (530B): https://developer.nvidia.com/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/
It's also the same as PaLM's 540B: https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html?m=1
This approach (chain-of-thought prompting) has been discussed for at least a few months, so I think this could be a legit paper from Nvidia or Google. A rough sketch of what such a prompt looks like is below.
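For anyone unfamiliar, here is a minimal sketch of chain-of-thought prompting: you prepend a worked example that spells out its reasoning, so the model imitates that step-by-step style on a new question. This is just an illustration, not code from the paper; the `generate` function is a placeholder for whatever LLM completion API you happen to use.

```python
# Minimal chain-of-thought prompting sketch (illustrative, not from the paper).

COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans with 3 balls each. "
    "How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend a worked example so the model produces step-by-step reasoning."""
    return COT_EXEMPLAR + f"Q: {question}\nA:"

def generate(prompt: str) -> str:
    """Placeholder: call your LLM of choice here and return its completion."""
    raise NotImplementedError

if __name__ == "__main__":
    prompt = build_cot_prompt(
        "A cafeteria had 23 apples. They used 20 and bought 6 more. How many are left?"
    )
    print(prompt)  # the model's completion would contain reasoning steps, then the answer
```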