AdditionalPizza t1_iu048nq wrote
Reply to comment by Akimbo333 in [DEEPMIND] Transformers have shown remarkable capabilities - but can they improve themselves autonomously from trial and error? by Danuer_
A large language model is a transformer. An LLM works with tokens, which are basically pieces of words, like syllables, punctuation, and spaces. During training it adjusts its parameters based on the data; the data itself isn't saved, only the way it relates tokens to other tokens. If it were connect-the-dots, the dots would be tokens and the parameters the lines. You type out a sentence, which gets split into tokens, and it spits tokens back. It predicts which tokens to return to you based on the probabilities it learned for which token is most likely to follow another. So its "reasoning" comes from the parameters learned during training, plus whatever "policies" it's given during fine-tuning.
I think that's a valid way to describe it in simple terms.
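If anyone wants to see the "probability of the next token" idea in code, here's a rough sketch. It uses the Hugging Face transformers library with GPT-2 purely as an example model (neither is mentioned above), so treat it as an illustration, not how any particular model actually runs in production:

```python
# Sketch: split a sentence into tokens, then ask the model for the
# probability distribution over the next token. GPT-2 is just an
# example/stand-in model here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Your sentence becomes tokens (word pieces, punctuation, spaces).
prompt = "A large language model is a"
inputs = tokenizer(prompt, return_tensors="pt")
print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist()))

# The learned parameters map those tokens to scores over the whole
# vocabulary; softmax turns the scores for the last position into a
# probability distribution for the next token.
with torch.no_grad():
    logits = model(**inputs).logits           # shape: (1, seq_len, vocab_size)
probs = torch.softmax(logits[0, -1], dim=-1)  # next-token probabilities

# Show the few most likely next tokens and their probabilities.
top = torch.topk(probs, k=5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(idx))!r}: {p.item():.3f}")
```

Generating text is just repeating that last step: pick a token from the distribution, append it, and predict again.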
Akimbo333 t1_iu2d6go wrote
Oh ok. Thanks for the info!