zzzthelastuser t1_iwpi7r5 wrote
Reply to comment by ReasonablyBadass in [R] Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning - Epochai Pablo Villalobos et al - Trend of ever-growing ML models might slow down if data efficiency is not drastically improved! by Singularian2501
You could argue GPT-3 was trained on a subset of the available training data, no?
Since training stopped before completing even a single pass through the corpus, the data it never reached could be considered not part of the training set at all.
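Put concretely (a toy sketch of sub-epoch training, not GPT-3's actual data pipeline; the function and budget here are made up for illustration):

```python
# Hypothetical illustration: training that stops partway through the first
# pass, so the tail of the dataset is never seen at all.

def train_sub_epoch(dataset, train_step, token_budget):
    """Stream examples in order; stop once the token budget is spent."""
    tokens_seen = 0
    for example in dataset:          # first (and only) pass over the data
        if tokens_seen >= token_budget:
            break                    # remaining examples are never touched
        train_step(example)
        tokens_seen += len(example)

# Toy usage: a 10-"document" corpus, budget covering roughly half of it.
corpus = [[0] * 100 for _ in range(10)]   # 10 docs of 100 tokens each
seen = []
train_sub_epoch(corpus, lambda ex: seen.append(ex), token_budget=500)
print(len(seen), "of", len(corpus), "documents used")  # -> 5 of 10
```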
ReasonablyBadass t1_iwplk0c wrote
Semantics. It didn't see any of its data more than once, and it had more available. Not even one full epoch.
zzzthelastuser t1_iwpltkw wrote
Sure, but by that logic even my little Hello World network had more data available on the internet.