Submitted by Vegetable-Skill-9700 t3_121agx4 in deeplearning
Databricks's open-source LLM, Dolly, performs reasonably well on many instruction-based tasks while being ~25x smaller than GPT-3, challenging the notion that bigger is always better.
In my personal experience, the quality of the model depends a lot on the fine-tuning data rather than just sheer size. If you choose your fine-tuning data carefully, you can get a smaller model to outperform a state-of-the-art GPT-X on your task. The future of LLMs might look more open-source than we imagined three months ago.
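To make that concrete, here's a rough sketch of what instruction fine-tuning a small open model on a curated dataset could look like. The base model, file path, and prompt template are just placeholders I picked for illustration, not Dolly's actual training recipe:

```python
# Minimal instruction fine-tuning sketch with Hugging Face transformers.
# Base model, data file, and prompt format below are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base_model = "EleutherAI/pythia-2.8b"      # any small open causal LM
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # causal LMs often lack a pad token
model = AutoModelForCausalLM.from_pretrained(base_model)

# A curated JSONL file with {"instruction": ..., "response": ...} records.
dataset = load_dataset("json", data_files="curated_instructions.jsonl")["train"]

def to_features(example):
    # Simple instruction/response template (one of many possible formats).
    text = (f"### Instruction:\n{example['instruction']}\n\n"
            f"### Response:\n{example['response']}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(to_features, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="small-model-finetune",
        per_device_train_batch_size=4,
        num_train_epochs=3,
        learning_rate=1e-5,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The point is that most of the leverage is in `curated_instructions.jsonl`, not in the size of `base_model`.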
Would love to hear everyone's opinions on how they see the future of LLMs evolving. Will it be a few players (OpenAI) cracking AGI and conquering the whole world, or a lot of smaller open-source models which ML engineers fine-tune for their use cases?
P.S. I am kinda betting on the latter and building UpTrain, an open-source project which helps you collect that high-quality fine-tuning dataset.
FesseJerguson t1_jdl5xd4 wrote
Personally, I see networks of LLMs cooperating, with a sort of director that coordinates them, being the most powerful "AI". But small LLMs that are field experts will definitely have their place.
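Something like this director-plus-specialists pattern, where a routing model picks which field expert handles each query. The model names here are made up, and `query_llm` is just a placeholder for whatever API or local models you'd actually call:

```python
# Sketch of a "director" LLM routing queries to specialist LLMs.
# query_llm() and all model names are hypothetical placeholders.
from typing import Dict

def query_llm(model_name: str, prompt: str) -> str:
    """Placeholder: send the prompt to the named model and return its reply."""
    raise NotImplementedError("wire this up to your own models or API")

SPECIALISTS: Dict[str, str] = {
    "code": "code-expert-llm",         # hypothetical fine-tuned code model
    "medicine": "medical-expert-llm",  # hypothetical fine-tuned medical model
    "general": "general-purpose-llm",
}

def director(user_query: str) -> str:
    # Ask a small director model which specialist should handle the query.
    routing_prompt = (
        "Pick the best expert for this query. "
        f"Options: {', '.join(SPECIALISTS)}.\nQuery: {user_query}\nExpert:"
    )
    choice = query_llm("director-llm", routing_prompt).strip().lower()
    specialist = SPECIALISTS.get(choice, SPECIALISTS["general"])
    # Forward the original query to the chosen field expert.
    return query_llm(specialist, user_query)

# Example: director("Write a SQL query that joins two tables on user_id")
```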