Comments
Singularian2501 t1_itw560i wrote
https://twitter.com/MishaLaskin/status/1585265485314129926 (very good explanation!)
cszintiyl t1_itx3zkd wrote
More than meets the eye!
HyperImmune t1_itwuixs wrote
Can some ELI5 this for me? Seems like a pretty big step to AGI, but I don’t want to get ahead of myself here.
visarga t1_itx6vs1 wrote
They train a long-context model to learn (distill) from gameplay generated by other RL agents. Because whole learning histories fit in the context, the model needs fewer samples to learn.
This is significant for robots, bots and AI agents. Transformers have proven very competent at learning to act/play/work relative to other methods, and this paper shows they can learn with less training.
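Very roughly, the training objective is just behavior cloning over long across-episode histories. Something like this toy PyTorch sketch (names and shapes are mine, not the paper's code):

```python
import torch
import torch.nn as nn

class HistoryTransformer(nn.Module):
    """Stand-in for a causal transformer over (obs, action, reward) triples."""
    def __init__(self, d_model=128, n_actions=4):
        super().__init__()
        self.embed = nn.Linear(3, d_model)  # toy: each timestep is a 3-number triple
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, n_actions)

    def forward(self, history):             # history: (batch, T, 3)
        T = history.size(1)
        causal = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)  # no peeking ahead
        h = self.backbone(self.embed(history), mask=causal)
        return self.head(h)                 # next-action logits at every position

model = HistoryTransformer()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)

def train_step(histories, actions):
    """histories: multi-episode learning histories logged from a source RL agent,
    long enough that its improvement over time is visible in the context;
    actions: what that agent actually did at each step."""
    logits = model(histories)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), actions.reshape(-1)
    )
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Because the targets come from an agent that was itself getting better over the course of the history, the transformer ends up imitating the *learning process*, not just a fixed policy.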
AdditionalPizza t1_itx7tn0 wrote
"AD learns a more data-efficient RL algorithm than the one that generated the source data"
This part of the paper is very interesting. The transformer is able to improve upon the original RL algorithms that generated the pre-training data.
Nmanga90 t1_itxgyuv wrote
Fuck transformers, all my homies hate transformers
ReverseCaptioningBot t1_itxh0o5 wrote
FUCK TRANSFORMERS ALL MY HOMIES HATE TRANSFORMERS
^^^this ^^^has ^^^been ^^^an ^^^accessibility ^^^service ^^^from ^^^your ^^^friendly ^^^neighborhood ^^^bot
Akimbo333 t1_itxr8w8 wrote
What are the benefits of this?
AdditionalPizza t1_ityza30 wrote
By adding RL algorithms into pre-teaining, the model is able to learn new tasks without having to offline fine tune it. So it's combining reinforment learning with a transformer. And another benefit is the transformer sometimes makes more efficient RL algorithms than the originals that it was trained with.
RL is reinforment learning, a machine learning technique, which is like giving a dog a treat when it does the right trick.
It's kind of hard to explain it simply, and I'm not qualified haha. But it's a pretty big deal. It's makes it way more "out of the box" ready.
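If it helps, here's roughly what "using it out of the box" looks like at test time: you just keep appending the model's own experience to its context and let the frozen transformer improve in-context, with no fine-tuning step. Just a sketch; `env`, `model` and the scalar observations are placeholders, not the paper's actual API:

```python
import torch

@torch.no_grad()
def evaluate_in_context(model, env, episodes=20, device="cpu"):
    """Roll out a frozen history-conditioned model on a new task.
    env is assumed to have reset() -> obs and step(a) -> (obs, reward, done)."""
    context = []                                  # running history of (obs, action, reward)
    returns = []
    for _ in range(episodes):
        obs, done, ep_return = env.reset(), False, 0.0
        while not done:
            context.append([float(obs), 0.0, 0.0])        # placeholder slot for this step
            hist = torch.tensor([context], dtype=torch.float32, device=device)
            action = model(hist)[0, -1].argmax().item()   # act from the last position
            obs, reward, done = env.step(action)
            context[-1][1:] = [float(action), float(reward)]  # record what actually happened
            ep_return += reward
        returns.append(ep_return)      # these should trend upward across episodes
    return returns
```

The key point is that nothing in that loop updates the weights; any improvement from episode to episode comes purely from the growing context.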
Akimbo333 t1_itzw0hb wrote
That's awesome! Oh and I know that this might sound ignorant of me but what is a transformer?
AdditionalPizza t1_iu048nq wrote
A large language model is built on a transformer architecture. An LM works with tokens, which are basically parts of words: syllables, punctuation, spaces. During training it forms parameters from data; the data itself isn't saved, just the way it relates tokens to other tokens. If it were connect-the-dots, the dots would be tokens and the parameters the lines. You type out a sentence, which gets split into tokens, and it spits tokens back. It predicts what tokens to return to you from the probability it learned of one token following another. So it has reasoning based on the parameters formed during training, plus some "policies" it's given during pre-training.
I think that's a valid way to describe it in simple terms.
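If a code doodle helps: the "dots and lines" picture is basically a big learned table of which token tends to follow which. A real model stores these relations across billions of parameters instead of this made-up lookup table, but the next-token idea is the same:

```python
import random

# "dots": tokens; "lines": (made-up) probabilities of which token tends to follow which
next_token_probs = {
    "the": {"cat": 0.5, "dog": 0.3, "robot": 0.2},
    "cat": {"sat": 0.6, "ran": 0.4},
    "dog": {"barked": 0.7, "sat": 0.3},
    "robot": {"learned": 1.0},
}

def generate(prompt_token, steps=3):
    out = [prompt_token]
    for _ in range(steps):
        probs = next_token_probs.get(out[-1])
        if not probs:
            break
        tokens, weights = zip(*probs.items())
        out.append(random.choices(tokens, weights=weights)[0])  # sample by learned probability
    return " ".join(out)

print(generate("the"))   # e.g. "the cat sat"
```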
Akimbo333 t1_iu2d6go wrote
Oh ok. Thanks for the info!
Down_The_Rabbithole t1_ityj38i wrote
Can we switch away from transformers already? Multiple papers have demonstrated time and time again that transformers are inefficient and don't scale well towards AGI. Very cool for narrow AI applications, but they're not the future of AI.
AdditionalPizza t1_itvm445 wrote
Here's the arxiv link for anyone interested.