Submitted by [deleted] t3_11tmu9u in MachineLearning
Single_Blueberry t1_jcjsxa1 wrote
Reply to comment by Available_Lion_652 in [D] GPT-4 is really dumb by [deleted]
>the fact that GPT 4 may be two magnitude orders bigger than GPT 3
I'm not aware of any reliable sources that claim that.
Intuitively I don't see why it would stop hallucinating. I imagine the corpus - as big as it may be - doesn't contain many examples of the concept of "not knowing the answer".
That's something people express a lot in private conversation, but rarely in written language on the public internet or in books, which afaik is where most of the training data comes from.
Available_Lion_652 t1_jcjtc6h wrote
I don't understand why people downvoted. I saw a claim that GPT-4 was trained on 25k Nvidia A100s for several months, which would be ~100x more compute than GPT-3, according to that post. The 20B LLaMA model was trained on 1.4 trillion tokens. So yeah, my post is based on those claims.
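For what it's worth, the compute claim can be sanity-checked with back-of-envelope arithmetic. A minimal sketch; the GPU count and duration are the unverified rumors from that post, and the utilization figure is my own assumption:

```python
# Back-of-envelope check of the rumored GPT-4 training compute.
# All inputs are rumors or assumptions, not confirmed figures.

a100_bf16_flops = 312e12   # peak BF16 throughput of one A100, FLOP/s
num_gpus = 25_000          # rumored cluster size (unconfirmed)
days = 90                  # "several months" -> assume ~3 months
utilization = 0.35         # assumed model FLOPs utilization

total_flops = a100_bf16_flops * num_gpus * days * 24 * 3600 * utilization
gpt3_flops = 3.14e23       # GPT-3 training compute, from the GPT-3 paper

print(f"Estimated training compute: {total_flops:.2e} FLOPs")
print(f"Ratio vs. GPT-3: {total_flops / gpt3_flops:.0f}x")
```

Under these assumptions the ratio comes out in the tens, so "~100x GPT-3's compute" is at least arithmetically plausible. But note that extra compute can be spent on more training tokens rather than more parameters (that's the Chinchilla lesson), so 100x compute does not imply a 100x parameter count.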
Single_Blueberry t1_jcjvh6o wrote
Again, can't find a reliable source for that.
I personally doubt that GPT-4 is significantly larger than GPT-3.x, simply because that would also further inflate inference cost, which you generally want to avoid in a product (as opposed to a research feat).
Better architecture, better RLHF, more and better training data, more training compute? All of that seems reasonable.
Orders of magnitude larger again? Don't think so.
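To put a rough number on the inference-cost argument: a dense decoder-only transformer needs on the order of 2N FLOPs per generated token for N parameters (the standard back-of-envelope from the scaling-laws literature), so per-token serving cost grows roughly linearly with model size. A minimal sketch; the 100x model size is the hypothetical from this thread, not a known GPT-4 figure:

```python
# Rough per-token inference cost: ~2 FLOPs per parameter per token
# for a dense decoder-only transformer (ignores attention/KV-cache overhead).

def flops_per_token(n_params: float) -> float:
    return 2 * n_params

gpt3_params = 175e9          # GPT-3 size (published)
hypothetical_100x = 17.5e12  # a "two orders of magnitude" larger model

print(f"GPT-3-sized model: {flops_per_token(gpt3_params):.2e} FLOPs/token")
print(f"100x larger model: {flops_per_token(hypothetical_100x):.2e} FLOPs/token")
# A 100x larger dense model costs ~100x more per generated token to serve,
# which is the objection raised above.
```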