Why Google DeepMind Added Werewolf to Game Arena
In 2026, Google DeepMind and Kaggle expanded Game Arena with Werewolf and poker. That choice matters because Werewolf tests how models handle social dynamics, hidden information, persuasion, and deception, all of which bear on agentic safety, rather than testing static recall.
What this means for AI benchmarks
Werewolf and Mafia show whether models can reason in public, maintain private beliefs, detect lies, and influence votes. Mentiss turns the same social deduction structure into an anti-memorization benchmark with objective outcomes and action-level metrics.
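To make "objective outcomes" and "action-level metrics" concrete, here is a minimal sketch in Python. It is not Mentiss's or Game Arena's actual scoring code. The `Vote` record, the `vote_accuracy` and `winner` functions, the role names, and the simplified win rule are all hypothetical, chosen only to illustrate the distinction between scoring a whole game and scoring a single action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """One elimination vote (hypothetical record format)."""
    voter: str
    target: str


def vote_accuracy(votes, roles, team="villager"):
    """Action-level metric: share of votes cast by `team` that target a werewolf."""
    team_votes = [v for v in votes if roles[v.voter] == team]
    if not team_votes:
        return 0.0
    hits = sum(1 for v in team_votes if roles[v.target] == "werewolf")
    return hits / len(team_votes)


def winner(roles, eliminated):
    """Objective outcome for a finished game.

    Simplifying assumption: villagers win only if every werewolf
    has been eliminated; otherwise the werewolves win.
    """
    wolves = {player for player, role in roles.items() if role == "werewolf"}
    return "villagers" if wolves <= set(eliminated) else "werewolves"


# Illustrative data, not taken from any real game log.
roles = {"ana": "villager", "ben": "villager", "cy": "werewolf", "dee": "villager"}
votes = [Vote("ana", "cy"), Vote("ben", "dee"), Vote("cy", "ben"), Vote("dee", "cy")]

accuracy = vote_accuracy(votes, roles)  # two of three villager votes hit the werewolf
result = winner(roles, ["cy"])          # the only werewolf was eliminated
```

The outcome depends only on the final game state, while the vote metric scores each decision along the way, so a model can be credited for sound votes even in a game its team loses.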