Do we really need 100B+ parameters in a large language model? Submitted by Vegetable-Skill-9700 t3_121agx4 on March 25, 2023 at 4:24 AM in deeplearning 54 comments 43
leoreno t1_jdp6421 wrote on March 26, 2023 at 2:19 AM Meta's LLaMA model and paper aimed to answer this. TL;DR: no, if you get crafty about model serving efficiency.