andreichiffa t1_jdu5wmj wrote
Yes, that’s the mechanism the GPT-4 paper showed they were using for a bunch of things in the appendix. It was initially discovered in the toxicity-detection domain (the RealToxicityPrompts paper, I believe).
tamilupk OP t1_jdvcasr wrote
Thanks, I was not aware of it before. I believe you are referring to the passage below; a rough sketch of the loop in code follows the quote.
>For closed-domain hallucinations, we are able to use GPT-4 itself to generate synthetic data. Specifically, we design a multi-step process to generate comparison data:
>
>1. Pass a prompt through GPT-4 model and get a response
>
>2. Pass prompt + response through GPT-4 with an instruction to list all hallucinations
>
>(a) If no hallucinations are found, continue
>
>3. Pass prompt + response + hallucinations through GPT-4 with an instruction to rewrite the response without hallucinations
>
>4. Pass prompt + new response through GPT-4 with an instruction to list all hallucinations
>
>(a) If none are found, keep (original response, new response) comparison pair
>
>(b) Otherwise, repeat up to 5x
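In Python, the loop might look something like the sketch below. The `complete` function is just a stand-in for whatever model API you'd call, and the instruction wordings and the `NONE` sentinel are my guesses, not the paper's actual prompts:

```python
def complete(prompt: str) -> str:
    """Placeholder for a call to the model; swap in a real API client."""
    raise NotImplementedError


def list_hallucinations(prompt: str, response: str) -> list[str]:
    # Step 2 / step 4: ask the model to enumerate hallucinations.
    out = complete(
        f"Prompt:\n{prompt}\n\nResponse:\n{response}\n\n"
        "List all closed-domain hallucinations in the response, one per line. "
        "If there are none, reply with exactly: NONE"
    )
    return [] if out.strip() == "NONE" else out.strip().splitlines()


def make_comparison_pair(prompt: str, max_retries: int = 5):
    # Step 1: get an initial response.
    original = complete(prompt)

    # Step 2(a): if the response is already clean, skip this prompt.
    hallucinations = list_hallucinations(prompt, original)
    if not hallucinations:
        return None

    response = original
    for _ in range(max_retries):  # Step 4(b): repeat up to 5x.
        # Step 3: rewrite the response without the listed hallucinations.
        response = complete(
            f"Prompt:\n{prompt}\n\nResponse:\n{response}\n\n"
            "Hallucinations:\n" + "\n".join(hallucinations) + "\n\n"
            "Rewrite the response without the hallucinations."
        )
        # Step 4(a): if the rewrite is clean, keep the comparison pair.
        hallucinations = list_hallucinations(prompt, response)
        if not hallucinations:
            return (original, response)  # (rejected, preferred) pair

    return None  # never produced a clean rewrite within the retry budget
```

The resulting (original response, new response) pairs are comparison data, i.e. the kind of rejected/preferred pairs you'd feed into reward modeling rather than plain fine-tuning.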
muskoxnotverydirty t1_jdwjc1w wrote
And this method doesn't have some of the drawbacks of OP's prompting. Giving an example within the prompt of an incorrect response followed by a self-correction may make it more likely that the initial response is wrong, since that's the pattern you're demonstrating.