maskedpaki t1_j8c0eso wrote

I've seen so many things like this that end up surpassing GPT-3 only on some narrow benchmark, thanks to more optimised prompting, rather than being a better model overall.

I hope I'm wrong this time

17

94746382926 t1_j8c1t4r wrote

Yeah we need more benchmarks.

5

beezlebub33 t1_j8d62pw wrote

Benchmarks are really hard and expensive. And they are not fun or exciting for the people involved; the groups that make them really deserve more credit.

1

gay_manta_ray t1_j8c1uv8 wrote

This benchmark seems pretty comprehensive.

3

maskedpaki t1_j8g2v6v wrote

No, it actually seems like a pretty narrow science benchmark.

If you told me a model with under a billion parameters scored higher on zero-shot MMLU than the 175-billion-parameter GPT-3.5, then I'd be absolutely shocked.

1