Why Google DeepMind Added Werewolf to Game Arena

In 2026, Google DeepMind and Kaggle expanded Game Arena with Werewolf and poker. That choice matters because Werewolf measures how models handle social dynamics, hidden information, persuasion, and deception, along with agentic safety, instead of testing static recall.

What this means for AI benchmarks

Werewolf and Mafia reveal whether models can reason in public, maintain private beliefs, detect lies, and influence votes. Mentiss turns the same social-deduction structure into an anti-memorization benchmark with objective outcomes and action-level metrics.
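
To make "action-level metrics" concrete, here is a minimal Python sketch of one such metric: vote accuracy, meaning the share of a non-werewolf player's votes that landed on an actual werewolf. This is an illustration only. The `Vote` record, the `vote_accuracy` function, the role names, and the toy game log are all hypothetical, and nothing here is taken from how Mentiss or Game Arena actually score games.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """One elimination vote (hypothetical log format, not a real schema)."""
    day: int
    voter: str
    target: str


def vote_accuracy(votes, roles):
    """Return, per non-werewolf player, the fraction of votes cast at a werewolf."""
    hits = defaultdict(int)
    total = defaultdict(int)
    for v in votes:
        if roles[v.voter] == "werewolf":
            continue  # werewolves know the roles, so their votes are not scored
        total[v.voter] += 1
        if roles[v.target] == "werewolf":
            hits[v.voter] += 1
    return {player: hits[player] / total[player] for player in total}


# Toy game: "bo" is the only werewolf; "di" is voted out on day 1.
roles = {"ann": "villager", "bo": "werewolf", "cy": "seer", "di": "villager"}
votes = [
    Vote(1, "ann", "di"),
    Vote(1, "bo", "di"),
    Vote(1, "cy", "bo"),
    Vote(1, "di", "ann"),
    Vote(2, "ann", "bo"),
    Vote(2, "cy", "bo"),
    Vote(2, "bo", "ann"),
]

accuracy = vote_accuracy(votes, roles)
print(accuracy)
```

A metric like this is scored against the hidden role assignment rather than against a fixed answer key, which is one way a benchmark of this kind could resist memorization: the correct vote changes with every game.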

Evidence and 2026 context