Submitted by verbigratia t3_zsvsic in MachineLearning
Every lunar lander tutorial or example I've found so far uses deep RL. Is classical Q-learning such an obviously bad idea that no one bothers with it? I've had some success recently applying Q-learning to lunar lander (converting the continuous observations into discrete values) and am surprised there aren't more tutorials about this approach. Am I missing something?
fnbr t1_j1bc8i3 wrote
The main problem with tabular Q-learning (I'm assuming that by classical, you mean tabular) is that for most environments that are interesting, the state space is massive, so we can't actually store all states in memory.
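To put a rough number on "massive", here is a back-of-the-envelope sketch of how a dense Q-table grows when every state dimension is cut into the same number of bins. The function name and the example figures (8 dimensions, 4 actions) are assumptions chosen for illustration, not something taken from a particular environment's documentation.

```python
def q_table_entries(n_dims: int, bins_per_dim: int, n_actions: int) -> int:
    """Number of Q-values in a dense table over a uniformly binned state space."""
    return (bins_per_dim ** n_dims) * n_actions


# Illustrative assumption: 8 state dimensions and 4 discrete actions.
for bins in (5, 10, 20):
    entries = q_table_entries(n_dims=8, bins_per_dim=bins, n_actions=4)
    # Each entry stored as a float64 takes 8 bytes.
    print(f"{bins:>2} bins/dim: {entries:,} entries, ~{entries * 8 / 1e9:.2f} GB")
```

With 10 bins per dimension that is already 400 million entries, about 3.2 GB as float64, and doubling the resolution multiplies the size by 2^8 = 256.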
For lunar lander in particular, you have a continuous observation space, so you need to apply some sort of discretization; at that point, you might as well just use tile coding or some other sort of function approximator.
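For reference, a minimal sketch of what the "discretize, then tabular Q-learning" route might look like. This is not anyone's actual code from the thread: the bounds in `LOW`/`HIGH`, the bin count, the four-dimensional observation, and the helper names `discretize` and `q_update` are all hypothetical and only meant to show the shape of the approach. It uses a dictionary so that only visited states take up memory.

```python
from collections import defaultdict

import numpy as np

N_ACTIONS = 4  # assumption: four discrete actions
BINS = 10      # assumption: same number of bins for every dimension

# Hypothetical clipping bounds for a 4-dimensional observation (illustrative only).
LOW = np.array([-1.0, -1.0, -2.0, -2.0])
HIGH = np.array([1.0, 1.0, 2.0, 2.0])


def discretize(obs, low=LOW, high=HIGH, bins=BINS):
    """Map a continuous observation to a tuple of bin indices in [0, bins - 1]."""
    obs = np.clip(obs, low, high)
    scaled = (obs - low) / (high - low)  # each component now lies in [0, 1]
    idx = np.minimum((scaled * bins).astype(int), bins - 1)
    return tuple(int(i) for i in idx)


# Sparse Q-table: a state gets a row of zeros the first time it is looked up.
Q = defaultdict(lambda: np.zeros(N_ACTIONS))


def q_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q[s][a] toward the bootstrapped target."""
    target = r if done else r + gamma * np.max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
```

In a training loop you would call `discretize` on each observation from the environment and pass the resulting tuples to `q_update`. Tile coding replaces the single grid here with several overlapping, offset grids, which usually generalizes better for the same memory.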