Firstly, since OpenAI has already released such a good chatbot, there is no point in Google or Meta enforcing a patent on theirs: a patent requires you to publish your work so that other parties can verify it doesn't overlap with existing patents. Secondly, it's too late for Google to file a patent now, since the technique is already widely used :D
minhrongcon2000 t1_jdr6xtv wrote
Reply to [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Right now, yes! Most recently published papers (e.g. Chinchilla, GPT) show a scaling law relating the amount of training data to the number of parameters in a model. If you want low-effort training with little preprocessing, bigger models are mostly better. If you have enough data, though, the number of parameters needed can be reduced. That said, the required parameter count seems to shrink very slowly as the data size grows, so yeah, we still somehow need larger models (of course, this also depends on the scenario where you apply the LLM; for example, you don't really need that big of a model for an e-commerce app).
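To make the data-vs-parameters trade-off concrete, here is a toy sketch of the Chinchilla rule of thumb that a compute-optimal model wants roughly 20 training tokens per parameter (the helper name and the exact ratio as a default argument are my own framing, not from the comment above):

```python
def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Rough compute-optimal training-token count for a model with
    n_params parameters, using the ~20-tokens-per-parameter heuristic
    from the Chinchilla paper (Hoffmann et al., 2022)."""
    return n_params * tokens_per_param

# A 70B-parameter model (Chinchilla's own size) would want ~1.4T tokens.
print(f"{chinchilla_optimal_tokens(70e9):.1e}")  # -> 1.4e+12
```

The point of the sketch: the token budget grows linearly with model size under this heuristic, so shrinking the model only a little still demands a lot of data, which matches the observation that the required parameter count falls slowly as data grows.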