Submitted by Valachio t3_10ack6h in MachineLearning
ElectronicCress3132 t1_j45f3yo wrote
Reply to comment by VirtualHat in [D] Has ML become synonymous with AI? by Valachio
> And integrating good-old-fashioned AI (GOFAI) with more modern ML is becoming an area of increasing research interest.
Any papers you recommend on this topic?
gaymuslimsocialist t1_j45gwsl wrote
Any of the papers in the AlphaZero family do this: they combine tree search with learned policy and value functions.
FallUpJV t1_j45sb9f wrote
Would you recommend any of those, or is there one in particular that introduces such techniques well?
Edit: I'm asking this because I'm not very familiar with what tree search means in the first place.
cdsmith t1_j460nf2 wrote
Tree search means precisely that: searching a tree. In the context of AlphaZero, the tree is the game tree. That is:
- I could move my pawn to e4. Then:
  - You could move your knight to c6.
    - ...
  - Or you could move your pawn to e6.
    - ...
  - Or ...
- Or, I could move my pawn to d4. Then:
  - You could take my pawn with your pawn on c5.
    - ...
  - Or you could move your knight to c6.
    - ...
  - Or you could move your pawn to d5.
    - ...
  - Or ...
- Or, I could ...
That's it. The possible moves at each game state, and the game states they lead to, form a tree. (Strictly it's more like a DAG, since transpositions are possible, but it's often simplified by calling it a tree.) Searching that tree up to a certain depth amounts to thinking that many moves ahead in the game.

The way you search the tree is some variation on minimax: you want to choose the best move for yourself now, but that means at the next level down you pessimistically assume your opponent picks their best move (which is the worst one for you), and so on, alternating at each level.

Variations differ in the order you visit the nodes of the tree. A straightforward depth-first traversal up to a certain depth is traditional minimax search. Refuse to ever visit some nodes, because you know they can't possibly affect the result, and that's alpha-beta pruning. Visit nodes in a randomized order, updating the likelihood of visiting each node based on a running estimate of how likely it is to matter, and that's roughly what happens in Monte Carlo tree search. Either way, you're just traversing that tree in some order.
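To make that concrete, here's a minimal sketch of minimax with alpha-beta pruning over a hand-built toy tree (illustration only, not AlphaZero's actual code):

```python
# Minimax with alpha-beta pruning over a toy game tree.
# Inner nodes are dicts mapping a move name to the resulting state;
# leaves are numeric evaluations from the first player's point of view.
# (A real engine would cut off at a depth limit and call an evaluation
# function on non-terminal positions; here every line ends in a leaf.)

def minimax(state, maximizing, alpha=float("-inf"), beta=float("inf")):
    if isinstance(state, (int, float)):  # leaf: static evaluation
        return state
    if maximizing:
        best = float("-inf")
        for child in state.values():
            best = max(best, minimax(child, False, alpha, beta))
            alpha = max(alpha, best)
            if beta <= alpha:  # prune: the opponent would never allow this line
                break
        return best
    else:
        best = float("inf")
        for child in state.values():
            best = min(best, minimax(child, True, alpha, beta))
            beta = min(beta, best)
            if beta <= alpha:
                break
        return best

# Two plies: I pick a move, then the opponent picks the reply worst for me.
tree = {
    "e4": {"Nc6": 1, "e6": 3},   # after e4, the opponent answers Nc6 (1 for me)
    "d4": {"cxd4": 2, "d5": 5},  # after d4, the opponent answers cxd4 (2 for me)
}
print(minimax(tree, maximizing=True))  # 2: d4 guarantees me at least 2
```

The `break` in each branch is exactly the "refuse to ever visit some nodes" part: once a line is provably no better than one already found, its remaining siblings are skipped.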
AlphaZero combines this with machine learning by using two learned functions to tweak the traversal order of the tree: a policy function that identifies moves likely to be good, and a value function that evaluates partially completed games to estimate how good they look for each player. But ultimately, the machine learning models just plug into certain holes in the tree search algorithm.
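Schematically, those plug-in points look something like this. This is a simplified sketch with made-up names (`policy_value`, the toy moves "a"/"b"/"c"); the selection rule is AlphaZero's PUCT formula, but the surrounding code is illustrative only:

```python
import math

class Node:
    """One move/state edge in the search tree."""
    def __init__(self, prior):
        self.prior = prior       # P(s, a): the policy's prior for this move
        self.visits = 0          # N(s, a)
        self.value_sum = 0.0     # W(s, a)
        self.children = {}       # move name -> Node

    def q(self):                 # Q(s, a): mean value seen in this subtree
        return self.value_sum / self.visits if self.visits else 0.0

def select(node, c_puct=1.5):
    # PUCT: exploit moves with high mean value, but explore moves the
    # policy likes that haven't been visited much yet.
    total = sum(ch.visits for ch in node.children.values())
    return max(
        node.children.items(),
        key=lambda kv: kv[1].q()
        + c_puct * kv[1].prior * math.sqrt(total + 1) / (1 + kv[1].visits),
    )

def expand_and_evaluate(node, state):
    # Both learned models plug in right here: the policy's priors seed
    # the new children, and the value estimate is backed up along the
    # path in place of a random rollout.
    priors, value = policy_value(state)
    for move, p in priors.items():
        node.children[move] = Node(prior=p)
    return value

# Stand-in for the learned model (made-up numbers for a toy state).
def policy_value(state):
    return {"a": 0.6, "b": 0.3, "c": 0.1}, 0.2

root = Node(prior=1.0)
expand_and_evaluate(root, state=None)
print(select(root)[0])  # "a": with no visits yet, the highest prior wins
```

Details like sharing one network between the two heads and flipping the sign of the backed-up value at each ply (since the players alternate) are omitted here.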
dr3aminc0de t1_j46y5o7 wrote
How does this translate to the other Alpha* AIs? I'm thinking of AlphaStar (StarCraft II), which has far too many possible actions to model as a tree.
K3tchM t1_j46kidw wrote
There's a survey on ML for combinatorial optimization by Bengio, Lodi, and Prouvost.
OpenAI's paper on a robot hand learning to solve a Rubik's Cube is another example.
Also check out anything mentioning neurosymbolic AI, which aims to combine neural network learning with logic-based reasoning.
Gary Marcus wrote an extensive note on the subject that I recommend as well.