AdditionalPizza t1_itx7tn0 wrote
Reply to comment by visarga in [DEEPMIND] Transformers have shown remarkable capabilities - but can they improve themselves autonomously from trial and error? by Danuer_
"AD learns a more data-efficient RL algorithm than the one that generated the source data"
This part of the paper is very interesting. The transformer ends up more data-efficient than the original RL algorithms that generated its pre-training data.