
magnets-are-magic t1_jczs8oe wrote

It makes up sources even when you explicitly tell it not to. I've tried a variety of approaches and in my experience it's unavoidable. It will invent authors, book/article/paper titles, dates, statistics, content, and more, and it will confidently insist that all of them are real and accurate.

2

User1539 t1_jczslss wrote

Yeah, that reminds me of when it confidently told me what the code it produced did... but it wasn't right.

It's kind of weird when you can't say, 'No, can't you read what you just produced? That's not what that does at all!'

1

visarga t1_jd0akyj wrote

This is an artefact of RLHF: the model comes out well calibrated after pre-training, but that final stage of training breaks the calibration.

https://i.imgur.com/zlXRnB6.png
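
For context, "well calibrated" means the model's stated confidence tracks its actual accuracy (70%-confident answers are right about 70% of the time). A minimal sketch of one common way to quantify this, expected calibration error; the numbers at the bottom are made up purely for illustration:

    import numpy as np

    def expected_calibration_error(confidences, correct, n_bins=10):
        """ECE: bin predictions by confidence, then average the gap
        |accuracy - confidence| per bin, weighted by bin size."""
        confidences = np.asarray(confidences, dtype=float)
        correct = np.asarray(correct, dtype=float)
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        ece = 0.0
        for lo, hi in zip(edges[:-1], edges[1:]):
            mask = (confidences > lo) & (confidences <= hi)
            if mask.any():
                gap = abs(correct[mask].mean() - confidences[mask].mean())
                ece += mask.mean() * gap  # weight by fraction of samples in bin
        return ece

    # Hypothetical confidences and right/wrong outcomes:
    conf = [0.9, 0.9, 0.8, 0.7, 0.6, 0.95]
    hits = [1,   1,   1,   0,   1,   1]
    print(expected_calibration_error(conf, hits))

A calibration plot like the one linked above is essentially this binning drawn as a curve: confidence on one axis, observed accuracy on the other, with the diagonal being perfect calibration.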

Explained by Ilya Sutskever, one of the lead authors of GPT-4: https://www.youtube.com/watch?v=SjhIlw3Iffs&t=1072s

Ilya invites us to "find out" whether we can quickly move past the hallucination phase; maybe this year we'll see his work pan out.

1