Submitted by Cool_Abbreviations_9 t3_123b66w in MachineLearning
Borrowedshorts t1_jdu1o78 wrote
So if you're using this for academic research, you can put in your original prompt and then tell it to only return references with a confidence score > .5. Neat little trick.
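A minimal sketch of that prompt pattern, assuming the pre-1.0 `openai` Python client (the model name and prompt wording are just illustrative, and note the scores are self-reported by the model, not calibrated probabilities):

```python
import openai  # pre-1.0 client interface

PROMPT = """List key references for <your topic>.
For each reference, add a confidence score between 0 and 1 for how certain you
are that the reference actually exists, and only return references with a
confidence score greater than 0.5."""

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": PROMPT}],
)
print(response["choices"][0]["message"]["content"])
```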
he_who_floats_amogus t1_jdu8479 wrote
You could do that, but if it's just hallucinating the confidence scores then it really isn't very neat. The language model gets very high reward for hallucinated responses to things like confidence scores in particular, because hallucinating figures like this will still produce very coherent responses.
SoylentRox t1_jdu9ya6 wrote
So this is an open-domain hallucination:
Closed-domain hallucinations refer to instances in which the model is instructed to use only information provided in a given context, but then makes up extra information that was not in that context. For example, if you ask the model to summarize an article and its summary includes information that was not in the article, then that would be a closed-domain hallucination. Open-domain hallucinations, in contrast, are when the model confidently provides false information about the world without reference to any particular input context.
They handled this via: "For tackling open-domain hallucinations, we collect real-world ChatGPT data that has been flagged by users as being not factual, and collect additional labeled comparison data that we use to train our reward models."
Not very productive. The best way to check references would be to use a plugin plus an instruction to the model to "check references". The machine also needs RL training so that it will use the plugin, and use it correctly, the first time.
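For illustration, a rough sketch of what a "check references" step could do outside the model: verify each cited title against the public Crossref API instead of trusting self-reported confidence. The matching heuristic here is deliberately crude; a real plugin would do much more.

```python
import requests

def reference_exists(title: str, min_overlap: float = 0.8) -> bool:
    """Ask Crossref whether a work with a closely matching title exists."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": title, "rows": 1},
        timeout=10,
    )
    items = resp.json()["message"]["items"]
    if not items or not items[0].get("title"):
        return False
    found = items[0]["title"][0].lower().split()
    asked = title.lower().split()
    # crude token-overlap check; a real plugin would also compare authors, year, DOI
    return len(set(asked) & set(found)) / max(len(asked), 1) >= min_overlap

# e.g. reference_exists("Attention Is All You Need") should come back True
```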
metigue t1_jdw08fp wrote
Doesn't GPT-4 already have some kind of reinforcement learning baked in, though? I asked it what "green as gravy" meant and it responded with a hallucination about it being a widely used expression, complete with examples of its usage. I said, "Nice try, but green as gravy is not a widely used expression, is it?" It then clarified that it is not a widely used expression and that it had made the definition up as a possible meaning of "green as gravy".
Edit: Tried again just now and it still works. Leave the system message on default and try the user message: What is the meaning of "green as gravy"?
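If anyone wants to reproduce this programmatically, a minimal two-turn sketch with the pre-1.0 `openai` client (no system message set, matching "default"; the exact replies will of course vary):

```python
import openai

messages = [{"role": "user", "content": 'What is the meaning of "green as gravy"?'}]
first = openai.ChatCompletion.create(model="gpt-4", messages=messages)
answer = first["choices"][0]["message"]["content"]
print(answer)  # the comment above reports a confidently fabricated definition here

# push back and see whether it retracts the claim
messages += [
    {"role": "assistant", "content": answer},
    {"role": "user", "content": 'Nice try, but "green as gravy" is not a widely used expression, is it?'},
]
second = openai.ChatCompletion.create(model="gpt-4", messages=messages)
print(second["choices"][0]["message"]["content"])
```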
SoylentRox t1_jdw2yey wrote
It is not learning from your chats. Apparently OpenAI does harvest information from ChatGPT queries specifically for RL runs. And I was pointing out that in order for "plugin" support to work even sorta OK, the machine absolutely has to learn from its mistakes.
Remember, all it knows about a plugin is what its description claims it does. The machine needs to accurately estimate whether a particular user request will actually be satisfied by a particular plugin, and also how to format the query correctly the first time.
Without this feature it would probably just use a single plugin, ignoring all the others, or get stuck emitting malformed requests and fall back to guessing the answer like it does now.
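To make that concrete, here's a toy illustration (plugin names and descriptions made up) of the only information the model actually has when deciding which plugin to call and how:

```python
# The model only sees short natural-language descriptions, so plugin choice and
# query formatting both hinge on matching the request against these strings
# correctly on the first attempt.
PLUGINS = {
    "paper_search": "Search academic databases for papers by title, author, or DOI.",
    "web_browser": "Fetch and read the contents of a URL.",
    "calculator": "Evaluate arithmetic expressions.",
}

def build_routing_prompt(user_request: str) -> str:
    tool_list = "\n".join(f"- {name}: {desc}" for name, desc in PLUGINS.items())
    return (
        "You may call exactly one of these tools:\n"
        f"{tool_list}\n\n"
        f"User request: {user_request}\n"
        'Reply with JSON only: {"tool": "<name>", "input": "<string>"}'
    )
```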
master3243 t1_jdudizk wrote
Who needs statistical tests with theoretical grounding and justified/repeatable results when you've got LLMs™
mizmato t1_jdvgcla wrote
I've seen too many posts on Reddit trying to justify X by saying ChatGPT told them to do it (e.g., asking ChatGPT to do their taxes and then submitting the results). LLMs are something else.
yaosio t1_jduzcus wrote
It can also return hallucinated results from a real source. I've had Bing Chat fabricate paragraphs from real papers. The sidebar can see pages and documents, and even when it's looking at the PDF of the paper it will still make things up.
ypxkap t1_jdxwirl wrote
The Bing Chat thing is interesting because it can't seem to tell when it can't see the whole page. For example, if you ask it "what's the last line of this webpage?", you'll get some line X words in (usually ~1,100 words for me, but it's been a while since I checked). If you then send it text from after that "last sentence", it will act as if it had been looking at it the whole time, but as far as I can tell it has no capacity to notice that text otherwise. I asked it to summarize a chat-log .txt file I had loaded into Edge, and its summary said there was an advertisement for an iPhone 14 and that the "user threatened to harm the AI", neither of which was present in the text file. That gives me the impression it's seeing something completely different from what Edge is displaying, something that also includes instructions on how to respond in certain scenarios, including being threatened.
muskoxnotverydirty t1_jdvak20 wrote
We've already seen similar prompts, such as telling it to say "I don't know" when it doesn't know and then priming it with examples of it saying "I don't know" in response to nonsense. Maybe there's something to the added work of generating an output and then iteratively self-critiquing it to reach a better final output.
I wonder if they could be using this idea to automatically and iteratively generate and improve their training dataset at scale, which would create a sort of virtuous cycle of improve dataset -> improve LLM -> repeat.
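As a sketch of what that generate-critique-revise loop might look like (the `ask` callable is a placeholder for whatever chat API is in use; nothing here is OpenAI's actual pipeline):

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

def refine(question: str, ask: Callable[[List[Message]], str], rounds: int = 2) -> str:
    """Generate an answer, then iteratively critique and revise it."""
    answer = ask([{"role": "user", "content": question}])
    for _ in range(rounds):
        critique = ask([
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
            {"role": "user", "content": "List any claims in the answer above that may be "
                                        "hallucinated or unsupported."},
        ])
        answer = ask([
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
            {"role": "user", "content": f"Rewrite the answer, fixing these issues:\n{critique}"},
        ])
    return answer

# The (answer, critique, revision) triples could then be logged as candidate training data.
```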