Submitted by Vegetable-Skill-9700 t3_121a8p4 in MachineLearning
gamerx88 t1_jdmrlhh wrote
Reply to comment by wojapa in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
No, check their git repo. They used HF transformers' AutoModelForCausalLM in their training script. It's supervised fine-tuning.
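
For context, here is a rough sketch of what supervised fine-tuning of a causal LM typically looks like on the data side. It is not taken from the repo in question: the token IDs are made up, `build_sft_example` is a hypothetical helper, and masking the prompt with -100 is a common convention assumed here, not something confirmed from their script.

```python
# Label value that PyTorch's cross-entropy loss ignores by default.
IGNORE_INDEX = -100


def build_sft_example(prompt_ids, response_ids, ignore_index=IGNORE_INDEX):
    """Build one supervised fine-tuning example (hypothetical helper).

    The model sees prompt + response as input, but only the response
    tokens contribute to the loss; prompt positions are masked out.
    """
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [ignore_index] * len(prompt_ids) + list(response_ids)
    return {"input_ids": input_ids, "labels": labels}


# Made-up token IDs: a 3-token prompt followed by a 2-token response.
example = build_sft_example([11, 12, 13], [21, 22])
print(example)
```

With transformers installed, a batch of such examples (as tensors) would be passed as `input_ids` and `labels` to a model loaded through `AutoModelForCausalLM`. The model shifts the labels by one position internally and returns the next-token cross-entropy loss, which is all "supervised fine-tuning" means here: no reward model, no RL step.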