Submitted by Destiny_Knight t3_11tab5h in singularity
CellWithoutCulture t1_jcmsxjq wrote
Reply to comment by ThatInternetGuy in Those who know... by Destiny_Knight
HF-RLHF is the name of the dataset. As for RLHF itself: what they did to LLaMA is called "knowledge distillation", and iirc it usually isn't quite as good as RLHF. It's an approximation.
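For anyone unfamiliar with the term, here is a rough sketch of the classic soft-label form of knowledge distillation: the student is trained to match the teacher's temperature-softened output distribution. This is only an illustration, not the recipe used on LLaMA; the comment doesn't say which variant was applied, and the function names and the default temperature below are made up for the example.

```python
import math

def softmax(logits, temperature=1.0):
    # Divide by the temperature, then apply a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL(teacher || student) over the softened distributions.
    # temperature=2.0 is an arbitrary illustrative value, not a recommendation.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# The loss is zero when the student reproduces the teacher exactly,
# and positive when the two distributions disagree.
same = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

The student only ever sees the teacher's outputs, not human preference signals, which is one way to read the "approximation" point above.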