LetterRip t1_j6yj4z2 wrote

GPT-3 can be quantized to 4-bit with little loss, letting it run on two Nvidia 3090s/4090s (unpruned; pruned, perhaps a single 3090/4090). At $2 a day for 8 hours of electricity and 21 working days per month, that is $42 per month (plus the amortized cost of the cards and the computer that houses them).
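A quick back-of-envelope sketch of the numbers above (assuming GPT-3's publicly stated 175B parameter count; actual VRAM use would be somewhat higher due to activations and KV cache):

```python
# Rough check of the weight-memory and electricity figures above.
# Assumptions: 175e9 parameters, 4 bits per weight after quantization.

def quantized_weights_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate storage for the weights alone, in gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

def monthly_electricity(cost_per_day: float, working_days: int) -> float:
    """Electricity cost per month at a flat daily rate."""
    return cost_per_day * working_days

weights_gb = quantized_weights_gb(175e9, 4)
monthly = monthly_electricity(2.0, 21)

print(f"4-bit GPT-3 weights: ~{weights_gb:.1f} GB")  # ~87.5 GB
print(f"Electricity: ${monthly:.0f}/month")          # $42/month
```

Note the 4-bit weights alone come to roughly 87.5 GB, versus 48 GB of combined VRAM on two 3090s/4090s, which is why the unpruned claim is contested below.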

Nhabls t1_j6ymx0w wrote

I seriously doubt they have been able to do what you just described.

Not to mention a rented double-GPU setup, even the one you described, would run you into the dozens of dollars per day, not $2.

cunth t1_j71ocf6 wrote

Not sure about the above claim, but you can now train a GPT-2 model in 38 hours for about 600 bucks on rented hardware. Costs are certainly coming down.
