Submitted by tmblweeds t3_zn0juq in MachineLearning
farmingvillein t1_j0ifmkt wrote
Reply to comment by Own-Plantain8065 in [P] Medical question-answering without hallucinating by tmblweeds
This also would probably be a good way to gather data on where the model may not be working.
If a relatively recent systematic review gives a different result than a contemporaneous and/or older set of papers, that is probably (would need to verify this empirically) a sign that the model is processing something incorrectly.
(Reviews obviously also aren't perfect--but my guess is that you'd find that they are pretty robust indicators of something being off.)
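A rough sketch of how that check might look, assuming the model's output per source has already been reduced to a short conclusion label. The `Extraction` record, its fields, and `flag_discrepancies` are all made up for illustration and are not from the project being discussed:

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Extraction:
    # Hypothetical record of what the model extracted from one source.
    question: str
    source_type: str  # "review" or "paper" (assumed labels)
    year: int
    conclusion: str   # e.g. "yes" / "no", as produced by the model


def flag_discrepancies(extractions):
    """Return questions where the latest review disagrees with the paper majority."""
    by_question = {}
    for e in extractions:
        by_question.setdefault(e.question, []).append(e)

    flagged = []
    for question, items in by_question.items():
        reviews = [e for e in items if e.source_type == "review"]
        if not reviews:
            continue
        latest = max(reviews, key=lambda e: e.year)
        # Only compare against papers contemporaneous with or older than the review.
        papers = [
            e for e in items
            if e.source_type == "paper" and e.year <= latest.year
        ]
        if not papers:
            continue
        # Ties resolve to the conclusion seen first; fine for a rough signal.
        majority, _ = Counter(p.conclusion for p in papers).most_common(1)[0]
        if majority != latest.conclusion:
            flagged.append(question)
    return flagged
```

Anything this flags is only a candidate for a manual look, not a confirmed model error, since the review itself could be the one that's wrong.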