Comments
Singularian2501 t1_itw560i wrote
https://twitter.com/MishaLaskin/status/1585265485314129926 (very good explanation!)
cszintiyl t1_itx3zkd wrote
More than meets the eye!
HyperImmune t1_itwuixs wrote
Can some ELI5 this for me? Seems like a pretty big step to AGI, but I don’t want to get ahead of myself here.
visarga t1_itx6vs1 wrote
They train a long-context model to learn (distill) from gameplay generated by other RL agents. Because whole learning histories fit in the context, the model needs fewer samples to learn.
This is significant for robots, bots and AI agents. Transformers have proven very competent at learning to act/play/work relative to other methods, and this paper shows they can learn with less training.
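Very roughly, the training objective is just behavior cloning over long across-episode histories. Something like this toy PyTorch sketch (names and shapes are mine, not the paper's code):

```python
import torch
import torch.nn as nn

class HistoryTransformer(nn.Module):
    """Stand-in for a causal transformer over (obs, action, reward) triples."""
    def __init__(self, d_model=128, n_actions=4):
        super().__init__()
        self.embed = nn.Linear(3, d_model)  # toy: each timestep is a 3-number triple
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, n_actions)

    def forward(self, history):             # history: (batch, T, 3)
        T = history.size(1)
        causal = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)  # no peeking ahead
        h = self.backbone(self.embed(history), mask=causal)
        return self.head(h)                 # next-action logits at every position

model = HistoryTransformer()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)

def train_step(histories, actions):
    """histories: multi-episode learning histories logged from a source RL agent,
    long enough that its improvement over time is visible in the context;
    actions: what that agent actually did at each step."""
    logits = model(histories)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), actions.reshape(-1)
    )
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Because the targets come from an agent that was itself getting better over the course of the history, the transformer ends up imitating the *learning process*, not just a fixed policy.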
AdditionalPizza t1_itx7tn0 wrote
"AD learns a more data-efficient RL algorithm than the one that generated the source data"
This part of the paper is very interesting. The transformer is able to improve upon the original RL algorithms that generated the pre-training data.
Nmanga90 t1_itxgyuv wrote
Fuck transformers, all my homies hate transformers
ReverseCaptioningBot t1_itxh0o5 wrote
FUCK TRANSFORMERS ALL MY HOMIES HATE TRANSFORMERS
^^^this ^^^has ^^^been ^^^an ^^^accessibility ^^^service ^^^from ^^^your ^^^friendly ^^^neighborhood ^^^bot
Akimbo333 t1_itxr8w8 wrote
What are the benefits of this?
AdditionalPizza t1_ityza30 wrote
By adding RL algorithms into pre-teaining, the model is able to learn new tasks without having to offline fine tune it. So it's combining reinforment learning with a transformer. And another benefit is the transformer sometimes makes more efficient RL algorithms than the originals that it was trained with.
RL is reinforment learning, a machine learning technique, which is like giving a dog a treat when it does the right trick.
It's kind of hard to explain it simply, and I'm not qualified haha. But it's a pretty big deal. It's makes it way more "out of the box" ready.
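If it helps, here's roughly what "using it out of the box" looks like at test time: you just keep appending the model's own experience to its context and let the frozen transformer improve in-context, with no fine-tuning step. Just a sketch; `env`, `model` and the scalar observations are placeholders, not the paper's actual API:

```python
import torch

@torch.no_grad()
def evaluate_in_context(model, env, episodes=20, device="cpu"):
    """Roll out a frozen history-conditioned model on a new task.
    env is assumed to have reset() -> obs and step(a) -> (obs, reward, done)."""
    context = []                                  # running history of (obs, action, reward)
    returns = []
    for _ in range(episodes):
        obs, done, ep_return = env.reset(), False, 0.0
        while not done:
            context.append([float(obs), 0.0, 0.0])        # placeholder slot for this step
            hist = torch.tensor([context], dtype=torch.float32, device=device)
            action = model(hist)[0, -1].argmax().item()   # act from the last position
            obs, reward, done = env.step(action)
            context[-1][1:] = [float(action), float(reward)]  # record what actually happened
            ep_return += reward
        returns.append(ep_return)      # these should trend upward across episodes
    return returns
```

The key point is that nothing in that loop updates the weights; any improvement from episode to episode comes purely from the growing context.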
Akimbo333 t1_itzw0hb wrote
That's awesome! Oh and I know that this might sound ignorant of me but what is a transformer?
AdditionalPizza t1_iu048nq wrote
A large language model is built on a transformer architecture. An LM works with tokens, which are basically parts of words: syllables, punctuation, spaces. During training it forms parameters from data; the data itself isn't saved, just the way it relates tokens to other tokens. If it were connect-the-dots, the dots would be tokens and the parameters the lines. You type out a sentence, which gets split into tokens, and it spits tokens back. It predicts what tokens to return to you from the probability it learned of one token following another. So it has reasoning based on the parameters formed during training, plus some "policies" it's given during pre-training.
I think that's a valid way to describe it in simple terms.
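If a code doodle helps: the "dots and lines" picture is basically a big learned table of which token tends to follow which. A real model stores these relations across billions of parameters instead of this made-up lookup table, but the next-token idea is the same:

```python
import random

# "dots": tokens; "lines": (made-up) probabilities of which token tends to follow which
next_token_probs = {
    "the": {"cat": 0.5, "dog": 0.3, "robot": 0.2},
    "cat": {"sat": 0.6, "ran": 0.4},
    "dog": {"barked": 0.7, "sat": 0.3},
    "robot": {"learned": 1.0},
}

def generate(prompt_token, steps=3):
    out = [prompt_token]
    for _ in range(steps):
        probs = next_token_probs.get(out[-1])
        if not probs:
            break
        tokens, weights = zip(*probs.items())
        out.append(random.choices(tokens, weights=weights)[0])  # sample by learned probability
    return " ".join(out)

print(generate("the"))   # e.g. "the cat sat"
```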
Akimbo333 t1_iu2d6go wrote
Oh ok. Thanks for the info!
Down_The_Rabbithole t1_ityj38i wrote
Can we switch away from transformers already? Multiple papers have demonstrated time and time again that transformers are inefficient and don't scale well towards AGI. Very cool for narrow AI applications, but they're not the future of AI.
AdditionalPizza t1_itvm445 wrote
Here's the arxiv link for anyone interested.