londons_explorer t1_jdn0t7k wrote

Paper after paper has shown that bigger models outperform smaller models.

Sure, you can use tricks to make a small model work better. But apply those same tricks to a big model, and it works even better.

7

farmingvillein t1_jdnwda6 wrote

> But apply those same tricks to a big model, and it works even better.

In general, yes, although there are many techniques that help small models that do not help large ones.

That said, agree with your overall point. I think the only reason we won't see model sizes continue to inflate is if 1) there are substantial underlying architecture discoveries (possible!) or 2) we really hit problems with data availability. But synthetic + multi-modal probably gives us a ways to go there.

2

londons_explorer t1_jdo4kj3 wrote

Think how many hard drives there are in the world...

All of that data is potential training material.

I think a lot of companies/individuals might give up 'private' data in bulk for ML training if they get a tangible benefit from it. For example, having a version of ChatGPT with perfect knowledge of all my friends and neighbours, what they like and do, etc. would be handy.

2