Submitted by Vegetable-Skill-9700 in MachineLearning
farmingvillein wrote:
Reply to comment by londons_explorer in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
> But apply those same tricks to a big model, and it works even better.
In general, yes, although there are many techniques that help small models that do not help large ones.
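To give one concrete (hedged) example of such a technique: knowledge distillation mainly benefits a smaller "student" model that learns from a larger teacher's soft labels, and doesn't do much for the teacher itself. A minimal sketch in PyTorch, with toy tensors and illustrative hyperparameters (temperature, mixing weight) standing in for real models:

```python
# Rough sketch of a knowledge-distillation loss. The sizes, temperature, and
# alpha below are illustrative assumptions, not values from this thread.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft-label KL term (against the teacher) with the usual hard-label CE."""
    # Soft targets: KL divergence between temperature-scaled distributions,
    # scaled by T^2 to keep gradient magnitudes comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy on the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage with random tensors standing in for real model outputs.
student_logits = torch.randn(8, 1000)   # small student model's output
teacher_logits = torch.randn(8, 1000)   # frozen large teacher's output
labels = torch.randint(0, 1000, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
```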
That said, I agree with your overall point. I think the only reasons we won't see model sizes continue to inflate are if 1) there are substantial underlying architecture discoveries (possible!) or 2) we really hit a wall on data availability. But synthetic and multi-modal data probably give us a ways to go there.
londons_explorer wrote:
Think how many hard drives there are in the world...
All of that data is potential training material.
I think a lot of companies/individuals might give up 'private' data in bulk for ML training if they got a viable benefit from it (for example, having a version of ChatGPT with perfect knowledge of all my friends and neighbours, what they like and do, etc., would be handy).