
Available_Lion_652 t1_jcjtc6h wrote

I don't understand why people downvoted. I saw a claim that GPT-4 was trained on 25k Nvidia A100s for several months, and that it used ~100x more compute than GPT-3, based on that post. The 20B LLaMA model was trained on 1.4 trillion tokens. So yeah, I think my post is based on these claims.
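For what it's worth, the "25k A100s for several months" rumor can be sanity-checked with back-of-envelope arithmetic. The A100 peak throughput below is a real spec; the cluster size, duration, and utilization are assumptions taken from the unverified claim, and the GPT-3 figure is the commonly cited published estimate:

```python
# Rough sanity check on the "25k A100s for several months" rumor.
# Only the A100 spec and the GPT-3 figure are established numbers;
# cluster size, duration, and utilization are assumptions.

A100_PEAK_FLOPS = 312e12   # NVIDIA A100 dense BF16 tensor-core peak, FLOP/s
n_gpus = 25_000            # claimed cluster size (unverified rumor)
days = 90                  # "several months" read as ~3 (assumption)
utilization = 0.3          # plausible large-scale training efficiency (assumption)

total_flops = A100_PEAK_FLOPS * n_gpus * days * 86_400 * utilization
gpt3_flops = 3.14e23       # published GPT-3 training compute estimate

print(f"Cluster compute: {total_flops:.1e} FLOPs")
print(f"Ratio vs GPT-3:  {total_flops / gpt3_flops:.0f}x")
```

Under these assumptions the ratio lands in the tens, not quite 100x, but it's the right order of magnitude for the claim.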

0

Single_Blueberry t1_jcjvh6o wrote

Again, can't find a reliable source for that.

I personally doubt that GPT-4 is significantly larger than GPT-3.x, simply because that would further inflate inference cost, which you generally want to avoid in a product (as opposed to a research feat).

Better architecture, better RLHF, more and better training data, more training compute? All seems reasonable.

Orders of magnitude larger again? Don't think so.
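The inference-cost point can be made concrete with the standard approximations (training ≈ 6·N·D FLOPs, inference ≈ 2·N FLOPs per token). The model sizes below are illustrative assumptions, not confirmed GPT-4 specs:

```python
# Back-of-envelope scaling estimates using the common approximations:
# training compute ≈ 6*N*D FLOPs, inference ≈ 2*N FLOPs per token.
# The "10x larger" model is a hypothetical, not a confirmed GPT-4 size.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute (6ND rule)."""
    return 6 * n_params * n_tokens

def inference_flops_per_token(n_params: float) -> float:
    """Approximate forward-pass compute per generated token."""
    return 2 * n_params

gpt3 = 175e9                 # GPT-3: 175B parameters (public figure)
hypothetical_10x = 1.75e12   # a hypothetical model 10x larger

# Per-token serving cost scales linearly with parameter count,
# so a 10x larger model costs ~10x more for every token it ever serves.
ratio = inference_flops_per_token(hypothetical_10x) / inference_flops_per_token(gpt3)
print(f"Per-token inference cost ratio: {ratio:.0f}x")

# Training cost at GPT-3 size on 1.4T tokens (LLaMA-scale data):
print(f"Training FLOPs: {training_flops(gpt3, 1.4e12):.2e}")
```

Training compute is paid once, but the inference multiplier applies to every request for the product's lifetime, which is why you'd rather improve data, RLHF, and architecture than just scale parameters.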

2