mlresearchoor t1_je1mvf7 wrote on March 28, 2023 at 7:30 PM

OpenAI blatantly ignored the norm of not training on the ~200 tasks collaboratively prepared by the community for BIG-bench. GPT-4 knows the BIG-bench canary ID afaik, which suggests the benchmark files were in its training data and undermines the validity of any GPT-4 eval on BIG-bench.

OpenAI is cool, but they genuinely don't care about academic research standards or benchmarks carefully created over years by other folks.

obolli t1_je4juzh wrote on March 29, 2023 at 11:18 AM

I think they used to. Things change when you come under pressure to deliver profits.

mr_house7 t1_je5iuk0 wrote on March 29, 2023 at 3:47 PM

Microsoft is the one in charge now.