
magnets-are-magic t1_jczs8oe wrote

It makes up sources even when you explicitly tell it not to. I've tried a variety of approaches and in my experience it's unavoidable. It will invent authors, book/article/paper titles, dates, statistics, content, and more, and it will confidently insist that all of them are real and accurate.

2

User1539 t1_jczslss wrote

Yeah, that reminds me of when it confidently told me what the code it produced did... but it wasn't right.

It's kind of weird when you can't say, 'No, can't you read what you just produced? That's not what that does at all!'

1

visarga t1_jd0akyj wrote

This is an artefact of RLHF: the model comes out well calibrated after pre-training, but that final stage of training breaks the calibration.

https://i.imgur.com/zlXRnB6.png
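
For context, "well calibrated" means the model's stated confidence tracks its actual accuracy (70%-confident answers are right about 70% of the time). A minimal sketch of one common way to quantify this, expected calibration error; the numbers at the bottom are made up purely for illustration:

    import numpy as np

    def expected_calibration_error(confidences, correct, n_bins=10):
        """ECE: bin predictions by confidence, then average the gap
        |accuracy - confidence| per bin, weighted by bin size."""
        confidences = np.asarray(confidences, dtype=float)
        correct = np.asarray(correct, dtype=float)
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        ece = 0.0
        for lo, hi in zip(edges[:-1], edges[1:]):
            mask = (confidences > lo) & (confidences <= hi)
            if mask.any():
                gap = abs(correct[mask].mean() - confidences[mask].mean())
                ece += mask.mean() * gap  # weight by fraction of samples in bin
        return ece

    # Hypothetical confidences and right/wrong outcomes:
    conf = [0.9, 0.9, 0.8, 0.7, 0.6, 0.95]
    hits = [1,   1,   1,   0,   1,   1]
    print(expected_calibration_error(conf, hits))

A calibration plot like the one linked above is essentially this binning drawn as a curve: confidence on one axis, observed accuracy on the other, with the diagonal being perfect calibration.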

Explained by Ilya Sutskever, one of the lead authors of GPT-4: https://www.youtube.com/watch?v=SjhIlw3Iffs&t=1072s

Ilya invites us to "find out" whether we can quickly move past the hallucination phase; maybe this year we'll see his work pan out.

1