lostmsu t1_iwnoxf0 wrote on November 16, 2022 at 11:59 PM

Have they mentioned EfficientZero?

I think the author is severely behind the current SOTA.

Singularian2501 OP t1_iwq1iph wrote on November 17, 2022 at 2:29 PM

https://www.lesswrong.com/posts/mRwJce3npmzbKfxws/efficientzero-how-it-works

A LessWrong article I found that explains how EfficientZero works.

In my opinion, the author is saying that systems like EfficientZero use their data more efficiently, and that similar techniques could also be applied to LLMs to improve their sample efficiency.

To be honest, I hope my post gets enough attention that the author of the paper can answer our questions.

Singularian2501 OP t1_iwnpy8m wrote on November 17, 2022 at 12:07 AM

Yes, they mentioned it at the end of their blog article. But I think it was only meant as an example of how better sample efficiency could be achieved, not as a claim about SOTA.

13ass13ass t1_iwo4lan wrote on November 17, 2022 at 2:02 AM

EfficientZero is for RL on Atari games, though. How does it apply to things like large language models?

lostmsu t1_iws6anl wrote on November 17, 2022 at 11:08 PM

The point is there are many models that use the same technique.
