maskedpaki t1_j9z7sxs wrote

Yes! The really big breakthrough here is that it's on par with the original GPT-3 at only 7 billion parameters on a bunch of benchmarks I've seen.

That means it's gotten 25x more parameter-efficient in the last 3 years (175 billion / 7 billion = 25).

I wonder how efficient these things can get. Like, are we going to see a 280 million parameter model that rivals the original GPT-3 in 2026, and an 11 million parameter one in 2029?
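
For anyone who wants to check the arithmetic, here is a rough sketch of that extrapolation. It assumes the original GPT-3 has 175 billion parameters and that the same 25x shrink repeats every 3 years, which is pure speculation and not a prediction. The function names are made up for illustration.

```python
# Assumption: original GPT-3 has 175 billion parameters, and the 25x shrink
# per 3 years simply repeats (a naive extrapolation, not a forecast).
GPT3_PARAMS = 175e9
SMALL_MODEL_PARAMS = 7e9


def shrink_factor(big, small):
    """How many times smaller the small model is than the big one."""
    return big / small


def extrapolate(start_params, factor, steps):
    """Return parameter counts after each successive shrink by `factor`."""
    sizes = []
    params = start_params
    for _ in range(steps):
        params /= factor
        sizes.append(params)
    return sizes


factor = shrink_factor(GPT3_PARAMS, SMALL_MODEL_PARAMS)
sizes = extrapolate(SMALL_MODEL_PARAMS, factor, steps=2)

for year, params in zip((2026, 2029), sizes):
    print(f"{year}: ~{params / 1e6:.0f} million parameters")
```

The two steps come out to 280 million and 11.2 million parameters, which is where the rounded figures above come from.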
