MetaAI_Official OP t1_izfawoe wrote
Reply to comment by ClayStep in [D] We're the Meta AI research team behind CICERO, the first AI agent to achieve human-level performance in the game Diplomacy. We’ll be answering your questions on December 8th starting at 10am PT. Ask us anything! by MetaAI_Official
Actually, the language model was capable of suggesting good moves to a human player *because* the planning side of CICERO had determined these to be good moves for that player and supplied them in an *intent* that conditioned the language model's output. CICERO uses the same planning engine to find moves for itself and to find mutually beneficial moves to suggest to other players. Within the planning side, as described in our paper, we *do* use a finetuned language model to propose possible actions for both CICERO and the other players - this model is trained to predict actions directly rather than dialogue. This gives a good starting point, but it also contains many bad moves, which is why we run a planning/search algorithm on top. -DW
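To make the flow concrete, here is a minimal sketch of the control loop described above, assuming hypothetical stand-in functions (`propose_actions`, `plan`, `generate_message`) in place of CICERO's trained policy, value, and dialogue models; it is not the actual implementation, only an illustration of how planning produces intents that condition message generation.

```python
# Rough sketch of the described flow, with all names and interfaces hypothetical.
# The real system uses trained neural models and a search-based planner, not these stubs.
import random
from dataclasses import dataclass

@dataclass
class Intent:
    """Planned moves for the speaker and for the recipient of a message."""
    own_moves: list
    partner_moves: list

def propose_actions(player, game_state, num_candidates=8):
    """Stand-in for the action-proposal model: returns candidate move sets for
    `player`. Per the comment above, this model predicts actions directly,
    not dialogue, and its proposals include many bad moves."""
    return [f"{player}_candidate_{i}" for i in range(num_candidates)]

def plan(game_state, players):
    """Stand-in for the planning/search step: it filters the proposed candidates,
    keeping moves that look good for each player. Here we just pick at random;
    the real planner evaluates joint actions instead."""
    return {p: random.choice(propose_actions(p, game_state)) for p in players}

def generate_message(speaker, recipient, intent, game_state):
    """Stand-in for the intent-conditioned dialogue model: the message is
    generated conditioned on the planned moves, so any suggestion to the other
    player originates from the planner, not from the language model itself."""
    return (f"{speaker} -> {recipient}: I'm planning {intent.own_moves}; "
            f"how about you try {intent.partner_moves}?")

if __name__ == "__main__":
    state = {}  # placeholder game state
    joint_plan = plan(state, players=["FRANCE", "ENGLAND"])
    intent = Intent(own_moves=[joint_plan["FRANCE"]],
                    partner_moves=[joint_plan["ENGLAND"]])
    print(generate_message("FRANCE", "ENGLAND", intent, state))
```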
ClayStep t1_izfv6fi wrote
Ah, this was my misunderstanding then - I did not realize the language model was conditioned on intent (it makes perfect sense that it is). Thanks for the clarification!