visarga t1_itx6vs1 wrote
Reply to comment by HyperImmune in [DEEPMIND] Transformers have shown remarkable capabilities - but can they improve themselves autonomously from trial and error? by Danuer_
They use a large-context model to learn (distill) from the gameplay generated by other agents. They put more history in the context, so the model needs fewer samples to learn.
This is significant for robots, bots and AI agents. Transformers are found to be very competent at learning to act/play/work relative to other methods, and this paper shows they can learn with less training.
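To make the "more history in the context" idea concrete, here is a rough sketch of how training examples could be built from another agent's gameplay. It is not the paper's code: the `Step` encoding, the function names, and the `context_len` parameter are all made up for illustration. The point is only that the context window runs across episode boundaries, so the model sees the source agent getting better over time.

```python
from typing import List, Tuple

# Hypothetical encoding of one timestep: (observation, action, reward).
Step = Tuple[int, int, float]
Example = Tuple[List[Step], int, int]


def flatten_history(episodes: List[List[Step]]) -> List[Step]:
    """Concatenate episodes in the order the source agent produced them."""
    return [step for episode in episodes for step in episode]


def training_examples(episodes: List[List[Step]], context_len: int) -> List[Example]:
    """Build (context, current_observation, target_action) examples.

    The context is the last `context_len` steps of the flattened history,
    so it can span several episodes of the source agent's learning.
    """
    history = flatten_history(episodes)
    examples: List[Example] = []
    for t in range(1, len(history)):
        start = max(0, t - context_len)
        context = history[start:t]
        obs_t, action_t, _ = history[t]
        examples.append((context, obs_t, action_t))
    return examples


# Two short episodes: the second one earns reward, the first does not.
episodes = [
    [(0, 1, 0.0), (1, 0, 0.0)],
    [(0, 0, 1.0), (1, 1, 1.0)],
]
examples = training_examples(episodes, context_len=2)
```

In this toy run the last example's context holds the final step of episode one and the first step of episode two, which is the cross-episode history the comment is talking about. A real setup would feed these sequences to a transformer trained to predict the target action.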
AdditionalPizza t1_itx7tn0 wrote
"AD learns a more data-efficient RL algorithm than the one that generated the source data"
This part of the paper is very interesting. The transformer ends up more data-efficient than the original RL algorithm that generated its pre-training data.