
visarga t1_jd0akyj wrote

This is an artefact of RLHF. The model comes out well calibrated after pre-training, but the final RLHF stage breaks that calibration.

https://i.imgur.com/zlXRnB6.png
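The calibration being discussed is the match between a model's stated confidence and its actual accuracy. A common way to measure it is Expected Calibration Error (ECE); here's a minimal sketch (my own illustration, not from the GPT-4 report) with a hypothetical `expected_calibration_error` helper:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: average |accuracy - confidence| gap per confidence bin,
    weighted by how many predictions fall in each bin.
    A well-calibrated model scores near 0."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by bin population
    return ece

# Toy data that is calibrated by construction: a prediction with
# confidence p is correct with probability p.
rng = np.random.default_rng(0)
conf = rng.uniform(0.05, 0.95, 10_000)
hit = (rng.uniform(size=conf.size) < conf).astype(float)
print(expected_calibration_error(conf, hit))  # close to 0
```

An RLHF-style miscalibration would show up as, e.g., the model answering with 90% confidence while being right only half the time, which drives ECE up.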

Explained by one of the lead authors of GPT-4, Ilya Sutskever - https://www.youtube.com/watch?v=SjhIlw3Iffs&t=1072s

Ilya invites us to "find out" whether we can quickly move past the hallucination phase; maybe this year we will see his work pan out.
