Submitted by AutoModerator t3_122oxap in MachineLearning
disastorm t1_je8lm7w wrote
I have a question about reinforcement learning, or more specifically gym-retro (I know gym is pretty old now, I guess).
In the case of gym-retro, when you give the AI a reward, is it actually looking at a set of variables and saying something like "oh, I pressed this button while all of these variables had these values and got this reward, so I should press it when the variables are similar", or is it just saying "oh, I pressed this button and got this reward, so I should press it more often"?
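To make the two readings concrete, here is a toy sketch of each. It is only an illustration: the state labels, the button name, the learning rate, and both update functions are made up for this example, and none of it is taken from gym-retro or from any particular agent implementation.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate, an arbitrary value chosen for this sketch

# Reading 1: the estimate is keyed on (state, action), so the same button
# can be valued differently depending on the situation it was pressed in.
q_state_action = defaultdict(float)

def update_with_state(state, action, reward):
    key = (state, action)
    q_state_action[key] += ALPHA * (reward - q_state_action[key])

# Reading 2: the estimate is keyed on the action alone, so every press of
# the button feeds one running average regardless of the situation.
q_action_only = defaultdict(float)

def update_action_only(action, reward):
    q_action_only[action] += ALPHA * (reward - q_action_only[action])

# Hypothetical experience: the same button "A" is rewarded in one
# situation and punished in another.
update_with_state("enemy_near", "A", 1.0)
update_with_state("enemy_far", "A", -1.0)
update_action_only("A", 1.0)
update_action_only("A", -1.0)

print(dict(q_state_action))
print(dict(q_action_only))
```

In this sketch the first version keeps the two situations apart, while the second blends both rewards into a single number for the button.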