mlresearchoor t1_je1mvf7 wrote

OpenAI blatantly ignored the norm of not training on the ~200 tasks collaboratively prepared by the community for BIG-bench. GPT-4 knows the BIG-bench canary ID afaik, which invalidates any GPT-4 eval on BIG-bench.

OpenAI is cool, but they genuinely don't care about academic research standards or benchmarks carefully created over years by other folks.

92

obolli t1_je4juzh wrote

I think they used to. Things change when you come under the pressure of returning profits.

21