Submitted by Vegetable-Skill-9700 t3_121a8p4 in MachineLearning
wojapa t1_jdl23pj wrote
Did they use RLHF?
Vegetable-Skill-9700 OP t1_jdl2fbp wrote
I think it's just supervised training, similar to Alpaca, I guess.
A1-Delta t1_jdl325g wrote
GPT-J-6B fine-tuned on Alpaca’s instruction dataset.
gamerx88 t1_jdmrlhh wrote
No, check their git repo. Their training script uses AutoModelForCausalLM from Hugging Face Transformers. It's supervised fine-tuning.
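For anyone wondering what supervised fine-tuning means here in practice, this is a rough sketch of how one instruction/response pair might become a training example for a causal LM. It is not taken from their repo. The template is Alpaca's no-input prompt; toy_tokenize, build_example, and the mask_prompt option are hypothetical names for illustration, and whether their script masks the prompt tokens is an assumption I have not checked.

```python
# Sketch only: a whitespace "tokenizer" stands in for the real one.
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default

# Alpaca's prompt template for examples without an extra input field.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def toy_tokenize(text, vocab):
    """Map whitespace-separated words to integer ids, growing vocab as needed."""
    return [vocab.setdefault(word, len(vocab)) for word in text.split()]


def build_example(instruction, response, vocab, mask_prompt=True):
    """Build input_ids and labels for one supervised example (hypothetical helper)."""
    prompt_ids = toy_tokenize(PROMPT_TEMPLATE.format(instruction=instruction), vocab)
    response_ids = toy_tokenize(response, vocab)
    input_ids = prompt_ids + response_ids
    if mask_prompt:
        # Assumption: only the response tokens contribute to the loss.
        labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    else:
        # Plain language-modeling objective: every token is a target.
        labels = list(input_ids)
    return {"input_ids": input_ids, "labels": labels}


vocab = {}
example = build_example("Name a primary color.", "Blue", vocab)
```

There is no reward model or preference data anywhere in this: the model is just trained with next-token cross-entropy on the formatted text, which is the difference from RLHF.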