Fledgeling t1_jdmv8pp wrote
Reply to Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
No, we're finding that with new tuning techniques, improved datasets, and some multimodal systems, we can get much smaller models to perform just as well as the big boys at many complex tasks. This is a huge area of research right now, and it's also showing us that our large models themselves have huge growth potential we have yet to unlock.
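To give a concrete feel for the kind of tuning technique I'm talking about, here's a rough LoRA sketch using the Hugging Face peft library. This is just my own illustration, not something from the thread, and the base model, rank, and target modules are placeholder choices:

```python
# Rough LoRA sketch with Hugging Face peft; model name and hyperparameters
# are illustrative placeholders, not a recipe anyone in this thread gave.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "facebook/opt-1.3b"  # example small base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Wrap the base model so only small low-rank adapter matrices get trained,
# while the original weights stay frozen.
lora_config = LoraConfig(
    r=8,                                  # adapter rank (illustrative)
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the base model
```

The point is that the trainable adapter is tiny compared to the base model, so you can squeeze a lot of task-specific quality out of a small model with very modest compute.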
The main benefit I've seen from this is not needing to run these massive models across multiple nodes or multiple GPUs, which makes building a large inference service much easier.
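For example, a smaller model can be served from a single GPU with plain Hugging Face transformers. This is just a sketch of the general idea; the model name and memory figures are illustrative assumptions, not specifics from the thread:

```python
# Minimal single-GPU inference sketch; the model name is just an example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mistralai/Mistral-7B-Instruct-v0.2"  # example small instruct model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision keeps a ~7B model on one ~24 GB GPU
    device_map="auto",          # no multi-node or multi-GPU sharding needed
)

prompt = "Summarize the trade-offs between small and large language models."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

No tensor parallelism, no cross-node orchestration, so scaling the service is mostly just a matter of adding more identical single-GPU replicas behind a load balancer.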