Submitted by tmblweeds t3_zn0juq in MachineLearning
tl;dr: I built a site that uses GPT-3.5 to answer natural-language medical questions using peer-reviewed medical studies.
Live demo: https://www.glaciermd.com/search
Background
I've been working for a while on building a better version of WebMD, and I recently started playing around with LLMs, trying to figure out if there was anything useful there.
The problem with the current batch of "predict-next-token" LLMs is that they hallucinate: you can ask ChatGPT to answer medical questions, but it'll either:
- Refuse to answer (not great)
- Give a completely false answer (really super bad)
So I spent some time trying to coax these LLMs to give answers based on a very specific set of inputs (peer-reviewed medical research) to see if I could get more accurate answers. And I did!
The best part is you can actually trace the final answer back to the original sources, which will hopefully instill some confidence in the result.
Here's how it works:
- User types in a question
- Pull top ~800 studies from Semantic Scholar and PubMed
- Re-rank using sentence-transformers/multi-qa-MiniLM-L6-cos-v1
- Ask text-davinci-003 to answer the question based on the top 10 studies (if possible)
- Summarize those answers using text-davinci-003
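For anyone who wants to see the shape of the re-rank and prompt steps above, here's a minimal Python sketch. It is not the site's actual code. The function names, the prompt wording, and the toy word-overlap scorer are all assumptions made for illustration. In a real pipeline the scorer would presumably be replaced by embedding similarity from the sentence-transformers model, and the prompts would be sent to the completion model.

```python
import math
from collections import Counter
from typing import Callable, List

# A scorer maps (question, abstract) to a relevance score; higher is better.
Scorer = Callable[[str, str], float]


def overlap_score(question: str, abstract: str) -> float:
    """Toy stand-in scorer: cosine similarity over word counts.

    Assumption: a real pipeline would score with embeddings from
    sentence-transformers/multi-qa-MiniLM-L6-cos-v1 instead.
    """
    q = Counter(question.lower().split())
    a = Counter(abstract.lower().split())
    dot = sum(q[t] * a[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in a.values())
    )
    return dot / norm if norm else 0.0


def rerank(
    question: str,
    abstracts: List[str],
    score: Scorer = overlap_score,
    top_k: int = 10,
) -> List[str]:
    """Return the top_k abstracts, most relevant first."""
    ranked = sorted(abstracts, key=lambda a: score(question, a), reverse=True)
    return ranked[:top_k]


def build_answer_prompt(question: str, abstract: str) -> str:
    """Hypothetical per-study prompt; the wording is illustrative only."""
    return (
        "Answer the question using only the study abstract below. "
        "If the abstract does not answer it, say so.\n\n"
        f"Abstract: {abstract}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


def build_summary_prompt(question: str, answers: List[str]) -> str:
    """Hypothetical prompt that asks for one summary of the per-study answers."""
    numbered = "\n".join(f"{i}. {ans}" for i, ans in enumerate(answers, start=1))
    return (
        f"Question: {question}\n\n"
        f"Answers drawn from individual studies:\n{numbered}\n\n"
        "Summarize these answers in one short paragraph:"
    )
```

Keeping one prompt per study means each answer maps to a single source, which is what makes the final summary traceable back to the original studies.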
Would love to hear what people think (and if there's a better/cheaper way to do it!).
---
UPDATE 1: So far the #1 piece of feedback has been that I should be way more explicit about the fact that this is a proof-of-concept and not meant to be taken seriously. To that end, I've just added a screen that explains this and requires you to acknowledge it before continuing.
Thoughts?
Update 2: Welp that's all the $$$ I have to spend on OpenAI credits, so the full demo isn't running anymore. But you can still follow the link above and browse existing questions/answers. Thanks for all the great feedback!
JanneJM t1_j0ejc6i wrote
Out of curiosity, how well does it work if you simply ask it to base the answer on Pubmed sources only, without any ranking or anything?