Submitted by Balance- t3_124eyso in MachineLearning
regalalgorithm t1_je1eu1e wrote
FYI, the GPT-4 paper has a whole section on contamination in the appendix - I found it to be pretty convincing. Removing contaminating data did make it worse at some benchmarks, but also better at others, and overall it wasn't a huge effect.
StellaAthena t1_je3tz04 wrote
I found this analysis incredibly unconvincing. They used a weaker deduplication standard than is typical, as well as a weaker analysis than the one they did for the GPT-3 paper.