fysmoe1121 t1_jdm2mtw wrote on March 25, 2023 at 12:16 PM
Reply to Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
deep double descent => bigger is better