zzzthelastuser t1_iwpi7r5 wrote

You could argue GPT-3 was trained on a subset of the available training data, no?

If the first pass-through was never completed, the remaining data could be considered not part of the training data at all.

7

ReasonablyBadass t1_iwplk0c wrote

Semantics. It didn't see any of its data more than once, and it had more available. Not even one full epoch.

9
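
(Illustration, not from the thread: a minimal Python sketch of the distinction being argued here, with hypothetical sizes. Training halts partway through the first and only pass, so every example is seen at most once and the epoch is never completed.)

```python
# Hypothetical numbers: 1M available examples, but training stops early.
dataset = list(range(1_000_000))
batch_size = 100
max_steps = 3_000          # 3,000 * 100 = 300k examples: well under one epoch

seen = 0
for step in range(max_steps):
    batch = dataset[seen:seen + batch_size]  # sequential pass, no reshuffling
    # ... model update on `batch` would go here ...
    seen += len(batch)

print(f"saw {seen} of {len(dataset)} examples "
      f"({seen / len(dataset):.0%} of one epoch)")
# The remaining 700k examples were never touched -- were they "training data"?
```
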

zzzthelastuser t1_iwpltkw wrote

Sure, but in theory my little Hello World network also had more data available on the internet.

4