MetaAI_Official OP t1_izfjgik wrote
Reply to comment by pyepyepie in [D] We're the Meta AI research team behind CICERO, the first AI agent to achieve human-level performance in the game Diplomacy. We’ll be answering your questions on December 8th starting at 10am PT. Ask us anything! by MetaAI_Official
Figuring out how to get strong control over the language model by grounding it in "intents"/plans was one of the major challenges of this work. Fig. 4 in the paper shows we achieved relatively strong control in this sense: prior to any filters, ~93% of messages generated by CICERO were consistent with its intents and ~87% were consistent with the game state. As you note, however, the model is not perfect, and we relied on a suite of classifiers to help filter out additional mistakes. Many of the mistakes CICERO made concerned information that was *not* directly represented in its input (and thus required additional reasoning steps), e.g., reasoning about further-in-the-future states or counterfactual past states, discussing plans for third parties, etc. We could have considered grounding CICERO in a richer representation of "intents" (e.g., including plans for third parties) or of the game state (e.g., explicitly representing past states), but in practice we found that (i) richer intents would be harder to annotate/select and would often take the language model out of distribution, and (ii) we had to balance a richer game-state representation against the dialogue-history representation. Exploring ways to get stronger control over, and to improve the reasoning of, language models is an interesting future direction. -ED
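The generate-then-filter pipeline described above can be sketched roughly as follows. This is a toy illustration only, not CICERO's actual code: `generate_candidates` stands in for the intent-conditioned dialogue model, and `consistent_with_intent` stands in for the suite of filter classifiers; all names and the string-matching "classifier" are hypothetical simplifications.

```python
def generate_candidates(intent):
    # Stand-in for the intent-conditioned dialogue model: it is prompted
    # with an intent, but (as in the ~93% figure above) some candidate
    # messages still drift off-intent.
    return [
        f"I plan to {intent}; does that work for you?",
        "Actually, let's talk about something else.",   # off-intent candidate
        f"Let's coordinate: I'll {intent} this turn.",
    ]

def consistent_with_intent(message, intent):
    # Toy stand-in for a filter classifier; the real system uses learned
    # classifiers, not string matching.
    return intent in message

def select_message(intent):
    # Filter the candidates, keeping only intent-consistent messages.
    kept = [m for m in generate_candidates(intent)
            if consistent_with_intent(m, intent)]
    return kept[0] if kept else None
```

Here the filtering stage catches generation mistakes the conditioning alone did not prevent, which mirrors the division of labor described in the comment: grounding gets most messages on-intent, and classifiers clean up the remainder.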
pyepyepie t1_izgms77 wrote
Interesting. I was completely surprised by the results (I honestly thought Diplomacy would take 10 years) - it's a great demo of how to utilize large language models without messing up :) Congrats.