MysteryInc152 t1_jclpjzi wrote

It's predicting language. As long as the structure allows it to learn to predict language properly, you're good to go.

3

turnip_burrito t1_jcoul9i wrote

Yes, exactly. Everyone keeps leaving the architecture's inductive structural priors out of the discussion.

It's not all about data! The model matters too!

1