Submitted by Vegetable-Skill-9700 t3_121agx4 in deeplearning
Databricks' open-source LLM, Dolly, performs reasonably well on many instruction-based tasks while being ~25x smaller than GPT-3, challenging the notion that bigger is always better.
In my personal experience, the quality of the model depends a lot on the fine-tuning data rather than just the sheer size. If you choose your fine-tuning data carefully, you can get a smaller model to perform better than the state-of-the-art GPT-X on your specific task. The future of LLMs might look more open-source than we imagined 3 months back.
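For anyone curious what this kind of fine-tuning looks like in practice, here's a minimal sketch of instruction fine-tuning a small causal LM on a curated dataset with Hugging Face transformers. The model name, data file, prompt format, and hyperparameters are my own illustrative assumptions, not Databricks' actual Dolly recipe.

```python
# Minimal instruction fine-tuning sketch (assumed model, data path, and hyperparameters).
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "EleutherAI/pythia-2.8b"  # assumption: any small open base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Assumption: a JSONL file of curated {"instruction": ..., "response": ...} records
dataset = load_dataset("json", data_files="curated_instructions.jsonl")["train"]

def format_and_tokenize(example):
    # Simple prompt template; the exact format is a choice, not a standard
    text = (
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['response']}"
    )
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(format_and_tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="small-model-finetune",
        per_device_train_batch_size=4,
        num_train_epochs=2,
        learning_rate=1e-5,
        fp16=True,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The point is that the interesting work is mostly in building `curated_instructions.jsonl`, not in the training loop itself.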
Would love to hear everyone's opinions on how they see the future of LLMs evolving. Will it be a few players (e.g., OpenAI) cracking AGI and conquering the whole world, or lots of smaller open-source models that ML engineers fine-tune for their own use cases?
P.S. I am kinda betting on the latter and building UpTrain, an open-source project which helps you collect that high-quality fine-tuning dataset.
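To make "collecting a high-quality fine-tuning dataset" concrete, here's a generic sketch of filtering raw instruction data with simple heuristics (length checks and deduplication). This is not UpTrain's actual API, and the file names and thresholds are made up for illustration.

```python
# Generic data-curation sketch: keep only reasonably long, non-duplicate pairs.
import json

def passes_quality_checks(record, seen_instructions):
    instruction = record.get("instruction", "").strip()
    response = record.get("response", "").strip()
    if len(instruction) < 10 or len(response) < 20:  # drop trivially short pairs
        return False
    if instruction.lower() in seen_instructions:  # drop exact duplicate prompts
        return False
    seen_instructions.add(instruction.lower())
    return True

seen = set()
with open("raw_instructions.jsonl") as src, open("curated_instructions.jsonl", "w") as dst:
    for line in src:
        record = json.loads(line)
        if passes_quality_checks(record, seen):
            dst.write(json.dumps(record) + "\n")
```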
FirstOrderCat t1_jdl9uwf wrote
GPT is cool because it has memorized lots of info, so you can quickly retrieve it. An LM's memorization ability is likely dependent on its size.