disastorm t1_je8lm7w wrote on March 30, 2023 at 5:07 AM

I have a question about reinforcement learning, or more specifically gym-retro (I know gym is pretty old now, I guess).

In the case of gym-retro, when the AI gets a reward, is it actually looking at a set of variables and saying something like "oh, I pressed this button while all of these variables had these values and got this reward, so I should press it when the variables are similar"? Or is it just saying "oh, I pressed this button and got this reward, so I should press it more often"?
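
To make the two options concrete, here is a toy sketch of what I mean. Nothing in it is gym-retro's actual API: the two-state "game", the `reward` rule, and the table names are all made up for illustration. The first table remembers rewards per (variables, button) pair, the second only per button.

```python
from collections import defaultdict

ACTIONS = ["A", "B"]

def reward(state, action):
    # Hypothetical toy rule: button "A" pays off only in state 0,
    # button "B" pays off only in state 1.
    hit = (state == 0 and action == "A") or (state == 1 and action == "B")
    return 1.0 if hit else 0.0

def update_average(values, counts, key, r):
    # Incremental running average of the rewards seen for this key.
    counts[key] += 1
    values[key] += (r - values[key]) / counts[key]

# Option 1: remember reward per (state, button) pair.
state_values, state_counts = defaultdict(float), defaultdict(int)
# Option 2: remember reward per button only, ignoring the state.
button_values, button_counts = defaultdict(float), defaultdict(int)

# Try every button in every state the same number of times.
for _ in range(100):
    for state in (0, 1):
        for action in ACTIONS:
            r = reward(state, action)
            update_average(state_values, state_counts, (state, action), r)
            update_average(button_values, button_counts, action, r)

def best_with_state(state):
    # Pick the button with the highest average reward in this state.
    return max(ACTIONS, key=lambda a: state_values[(state, a)])

print("state 0 ->", best_with_state(0))
print("state 1 ->", best_with_state(1))
print("per-button averages:", dict(button_values))
```

In this toy setup the per-state table can prefer a different button in each state, while the per-button table sees both buttons as about equally good, because it averages over states it cannot tell apart. So my question is which of these two pictures is closer to what is really going on.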