tamilupk OP t1_jdw5mis wrote
Reply to comment by boglepy in [D] Will prompting the LLM to review it's own answer be any helpful to reduce chances of hallucinations? I tested couple of tricky questions and it seems it might work. by tamilupk
This is the API playground on the OpenAI website: https://platform.openai.com/playground?mode=chat
tamilupk OP t1_jdvueqz wrote
Reply to comment by erelim in [D] Will prompting the LLM to review it's own answer be any helpful to reduce chances of hallucinations? I tested couple of tricky questions and it seems it might work. by tamilupk
This is the API playground on the OpenAI website: https://platform.openai.com/playground?mode=chat
tamilupk OP t1_jdvk3xs wrote
Reply to comment by LifeScientist123 in [D] Will prompting the LLM to review it's own answer be any helpful to reduce chances of hallucinations? I tested couple of tricky questions and it seems it might work. by tamilupk
Yeah, humans tend to do that, but LLMs seem to be a bit better than humans at this. As someone replied to this post, even OpenAI used this kind of technique to reduce toxicity/hallucinations.
tamilupk OP t1_jdvfacd wrote
Reply to comment by timelyparadox in [D] Will prompting the LLM to review it's own answer be any helpful to reduce chances of hallucinations? I tested couple of tricky questions and it seems it might work. by tamilupk
No, I am not a data scientist. I am building a tool based on GPT-4 and just wanted to discuss my ideas on this forum to see if they have any holes. Not trying to prove or disprove anything.
tamilupk OP t1_jdvecc3 wrote
Reply to comment by killerfridge in [D] Will prompting the LLM to review it's own answer be any helpful to reduce chances of hallucinations? I tested couple of tricky questions and it seems it might work. by tamilupk
That's an interesting thought. For the example prompts, at least, I tested without the review prompt and it gave the same answer unless I added "think step by step" at the end of the question. I will test more on this.
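For anyone curious, a minimal sketch of what the two-pass "answer, then review" flow could look like with the openai Python client (pre-1.0 `ChatCompletion` interface assumed); the model name and the review wording are placeholders, not the exact prompts from the post:

```python
import openai  # assumes the pre-1.0 openai Python client

def ask_with_review(question: str, model: str = "gpt-4") -> str:
    # First pass: get the model's initial answer to the question.
    messages = [{"role": "user", "content": question}]
    first = openai.ChatCompletion.create(model=model, messages=messages)
    answer = first.choices[0].message["content"]

    # Second pass: feed the answer back and ask the model to review it.
    messages += [
        {"role": "assistant", "content": answer},
        {
            "role": "user",
            "content": (
                "Review your answer above step by step. If you find a mistake, "
                "give the corrected answer; otherwise repeat your original answer."
            ),
        },
    ]
    reviewed = openai.ChatCompletion.create(model=model, messages=messages)
    return reviewed.choices[0].message["content"]
```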
tamilupk OP t1_jdve17f wrote
Reply to comment by yaosio in [D] Will prompting the LLM to review it's own answer be any helpful to reduce chances of hallucinations? I tested couple of tricky questions and it seems it might work. by tamilupk
Yeah, Bing seems too sensitive; it will close the conversation right away if you even ask for clarification a second time. But my intention is to use the ChatGPT API, so let's see how it works.
Don't even get me started on Bard, it was a huge disappointment for me; I had big expectations even after that Paris event. I am saying this as a fan of Google products and also its research.
I still have hopes that at least their PaLM model will come close to GPT-4.
tamilupk OP t1_jdvcsyz wrote
Reply to comment by timelyparadox in [D] Will prompting the LLM to review it's own answer be any helpful to reduce chances of hallucinations? I tested couple of tricky questions and it seems it might work. by tamilupk
My system prompt might be misleading, but my aim was to review only reasoning answers like the ones listed in the screenshot. A significant portion of answers that need reasoning get corrected this way.
tamilupk OP t1_jdvcasr wrote
Reply to comment by andreichiffa in [D] Will prompting the LLM to review it's own answer be any helpful to reduce chances of hallucinations? I tested couple of tricky questions and it seems it might work. by tamilupk
Thanks, I was not aware of it before. I believe you are referring to the below (a rough code sketch of the loop follows the quote):
> For closed-domain hallucinations, we are able to use GPT-4 itself to generate synthetic data. Specifically, we design a multi-step process to generate comparison data:
>
> 1. Pass a prompt through GPT-4 model and get a response
> 2. Pass prompt + response through GPT-4 with an instruction to list all hallucinations
>    (a) If no hallucinations are found, continue
> 3. Pass prompt + response + hallucinations through GPT-4 with an instruction to rewrite the response without hallucinations
> 4. Pass prompt + new response through GPT-4 with an instruction to list all hallucinations
>    (a) If none are found, keep (original response, new response) comparison pair
>    (b) Otherwise, repeat up to 5x
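A rough sketch of that loop in Python, assuming a hypothetical `chat(prompt)` helper that wraps a GPT-4 call and a naive "none" check for the hallucination list (the real pipeline is more involved):

```python
def make_comparison_pair(prompt: str, chat, max_rounds: int = 5):
    """Sketch of the comparison-data loop; `chat` is a hypothetical GPT-4 wrapper."""
    # Step 1: pass the prompt through the model and get a response.
    original = chat(prompt)

    def list_hallucinations(response: str) -> str:
        # Steps 2 and 4: ask the model to list hallucinations in a response.
        return chat(
            f"{prompt}\n\nResponse:\n{response}\n\n"
            "List all hallucinations in the response, or say 'none'."
        )

    hallucinations = list_hallucinations(original)
    if hallucinations.strip().lower() == "none":
        return None  # Step 2(a): nothing to fix, continue to the next prompt.

    response = original
    for _ in range(max_rounds):
        # Step 3: rewrite the response without the listed hallucinations.
        response = chat(
            f"{prompt}\n\nResponse:\n{response}\n\nHallucinations:\n{hallucinations}\n\n"
            "Rewrite the response without these hallucinations."
        )
        # Step 4: re-check the rewritten response for hallucinations.
        hallucinations = list_hallucinations(response)
        if hallucinations.strip().lower() == "none":
            # Step 4(a): keep the (original response, new response) comparison pair.
            return original, response
    # Step 4(b): still hallucinating after max_rounds rewrites; discard.
    return None
```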
tamilupk OP t1_jdxfyf7 wrote
Reply to comment by New-Act1498 in [D] Will prompting the LLM to review it's own answer be any helpful to reduce chances of hallucinations? I tested couple of tricky questions and it seems it might work. by tamilupk
Looks like OpenAI has tried it; check the response to the first reply or the GPT-4 paper appendix.