Revac: A Social Deduction Reasoning Agent

Aditya Ranjan; Avinash Anish; Mihir Shriniwas Arya

arxiv: 2604.19523 · v1 · submitted 2026-04-21 · 💻 cs.AI

Pith reviewed 2026-05-10 01:51 UTC · model grok-4.3

classification 💻 cs.AI

keywords social deduction · AI agent · Mafia game · player profiling · social graph analysis · adaptive communication · reasoning under uncertainty

The pith

A multi-module AI agent for social deduction games combines memory profiling, social graph analysis, and dynamic tone selection, and it won first place in the Social Deduction track of the MindGames Arena competition.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper describes Revac-8, an AI agent for games like Mafia where players must draw inferences from incomplete information and intentional deception. It began as a basic two-stage reasoner but expanded into separate modules that maintain histories of other players, map patterns of accusations and defenses, and adjust communication style according to context. This architecture produced first-place results in the Social Deduction track of the MindGames Arena competition. A sympathetic reader would see the work as showing that explicit memory and social modeling can substitute for perfect information when agents must operate amid uncertainty.

Core claim

Revac-8 evolved from a simple two-stage reasoning system into a multi-module architecture that integrates memory-based player profiling, social-graph analysis of accusations and defenses, and dynamic tone selection for communication. It achieved first place in the Social Deduction track of the MindGames Arena competition.

What carries the argument

The multi-module architecture integrating memory-based player profiling to track histories, social-graph analysis to examine accusation and defense patterns, and dynamic tone selection to adjust communication.
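To make the first and third modules concrete, here is a minimal sketch of what a per-player memory and a tone selector might look like. The paper's actual data structures and rules are not given in the text above, so every name here (`PlayerProfile`, `ProfileMemory`, `select_tone`) and the thresholds in the tone rule are assumptions chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PlayerProfile:
    """Hypothetical per-player memory: what a player has said and done so far."""
    statements: list = field(default_factory=list)  # (round_no, text) tuples
    accusations_made: int = 0
    defenses_made: int = 0


class ProfileMemory:
    """Keeps one PlayerProfile per player name."""

    def __init__(self):
        self.profiles = defaultdict(PlayerProfile)

    def record(self, player, round_no, text, kind="other"):
        """Store a statement and update simple counters for its kind."""
        profile = self.profiles[player]
        profile.statements.append((round_no, text))
        if kind == "accuse":
            profile.accusations_made += 1
        elif kind == "defend":
            profile.defenses_made += 1

    def history(self, player):
        """Return a copy of everything the player has said, in order."""
        return list(self.profiles[player].statements)


def select_tone(times_accused, players_alive):
    """Toy tone rule (an assumption, not the paper's policy)."""
    if times_accused >= 2:
        return "defensive"   # under pressure: answer accusations first
    if players_alive <= 4:
        return "assertive"   # late game: push for a decisive vote
    return "neutral"


memory = ProfileMemory()
memory.record("P2", 1, "I think P3 is Mafia", kind="accuse")
tone = select_tone(times_accused=0, players_alive=7)
```

In this sketch the profile store is a plain accumulation of statements; a real agent would presumably summarize or score that history before passing it to the reasoning step.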

If this is right

Memory of prior statements enables agents to build profiles that improve role inference over time.
Social-graph analysis reveals consistent or suspicious interaction patterns that aid elimination decisions.
Dynamic tone selection increases the agent's ability to gather information and influence group outcomes.
These components together produce stronger results than simpler two-stage reasoning in deceptive environments.
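The social-graph idea in the list above can be illustrated with a small sketch. It treats each accusation or defense as a directed edge between players and derives two simple signals from the edge counts. The scoring rule (accusations received minus defenses received) and the function names are assumptions made for this example; the paper's own analysis is not specified in the text above.

```python
from collections import Counter


def build_graph(events):
    """Count directed edges from (speaker, target, kind) events.

    kind is expected to be "accuse" or "defend".
    """
    edges = Counter()
    for speaker, target, kind in events:
        edges[(speaker, target, kind)] += 1
    return edges


def suspicion_scores(edges):
    """Hypothetical score: accusations received minus defenses received."""
    scores = Counter()
    for (_speaker, target, kind), count in edges.items():
        if kind == "accuse":
            scores[target] += count
        elif kind == "defend":
            scores[target] -= count
    return dict(scores)


def mutual_defenders(edges):
    """Pairs of players who have each defended the other at least once."""
    defends = {(s, t) for (s, t, kind) in edges if kind == "defend"}
    pairs = {tuple(sorted(p)) for p in defends if (p[1], p[0]) in defends}
    return sorted(pairs)


# Made-up events for illustration only.
events = [
    ("A", "B", "accuse"),
    ("C", "B", "accuse"),
    ("D", "B", "defend"),
    ("B", "D", "defend"),
    ("A", "C", "accuse"),
]
graph = build_graph(events)
scores = suspicion_scores(graph)
pairs = mutual_defenders(graph)
```

With these made-up events, B and D defend each other, which is the kind of consistent interaction pattern an agent might weigh when choosing whom to eliminate.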

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same modular structure could be tested in other interactive domains that involve trust and misinformation, such as automated negotiation.
Controlled experiments matching the agent against human teams would clarify whether competition performance generalizes beyond AI opponents.
Adding deeper language models to the communication module might further close gaps in interpreting subtle human cues.

Load-bearing premise

Success against other AI agents in one specific competition demonstrates effective reasoning under uncertainty and deception that transfers to other settings or against human players.

What would settle it

Direct head-to-head games between Revac-8 and human players in Mafia, with recorded win rates and accuracy at identifying hidden roles.
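Recorded win rates only settle the question if their uncertainty is reported alongside them. As a sketch of how that could be done, the function below computes a Wilson score interval for a win rate. The game counts in the usage line are invented for illustration and are not results from the paper or the competition.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate (z=1.96 gives roughly 95%)."""
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half


# Hypothetical tally: 30 wins in 50 games (not data from the paper).
low, high = wilson_interval(30, 50)
```

An interval this wide on a small number of games would show why a single competition ranking is weak evidence about transfer to human opponents.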

Figures

Figures reproduced from arXiv: 2604.19523 by Aditya Ranjan, Avinash Anish, Mihir Shriniwas Arya.

**Figure 2.** Architecture of Revac_8.

From the paper's Section 4 (Evaluation): the Revac_8 agent was evaluated using a two-fold approach, reflecting the dual challenge of the Social Mafia game: strategic reasoning (internal deduction) and effective communication (external action). From Section 4.1 (Competition Results): the Revac agent achieved first place in the Open Division of the Social Deduction Track of Mindgames NeurIPS 2025.

Original abstract

Social deduction games such as Mafia present a unique AI challenge: players must reason under uncertainty, interpret incomplete and intentionally misleading information, evaluate human-like communication, and make strategic elimination decisions. Unlike deterministic board games, success in Mafia depends not on perfect information or brute-force search, but on inference, memory, and adaptability in the presence of deception. This work presents the design and evaluation of Revac-8, an AI agent developed for the Social Deduction track of the MindGames Arena competition, where it achieved first place. The final agent evolved from a simple two-stage reasoning system into a multi-module architecture that integrates memory-based player profiling, social-graph analysis of accusations and defenses, and dynamic tone selection for communication. These results highlight the importance of structured memory and adaptive communication for achieving strong performance in high-stakes social environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Revac won a social deduction competition, but the paper offers no proof that its multi-module design caused the win.

The paper's main takeaway is that their Revac agent won first place in the MindGames Arena social deduction track after adding memory-based player profiling, social-graph analysis of accusations, and dynamic tone selection to an earlier two-stage system. That's the concrete result they report. It does a decent job laying out why social deduction is tough for AI: deception, incomplete information, and the need to read human-like cues. The architecture they landed on makes sense as a way to handle memory and communication in those settings. Building on game AI ideas with these social elements is a natural step.

The soft spot is the missing link between the design and the outcome. The description stays high-level, with no ablation studies, baseline comparisons, win rates, or statistical details from the competition. Without those, it's impossible to tell if the modules drove the success or if other factors like specific rules or variance played a bigger role. The stress-test note on this is on point.

This kind of work would interest researchers working on multi-agent systems that deal with uncertainty and deception, perhaps for games or even real-world applications like group decision tools. A reader looking for architecture ideas might find it useful as a starting point, but anyone wanting to build on it or verify the claims will need more data. It deserves a serious referee because a competition win gives it some grounding, even if the current version is thin on evidence. I'd recommend sending it for review, but with clear feedback that they need to add quantitative evaluations and comparisons to make the attribution hold up.

Referee Report

2 major / 2 minor

Summary. The manuscript presents Revac-8, an AI agent for social deduction games such as Mafia in the MindGames Arena competition. It describes the agent's evolution from a simple two-stage reasoning system into a multi-module architecture incorporating memory-based player profiling, social-graph analysis of accusations and defenses, and dynamic tone selection for communication, claiming this design achieved first place in the Social Deduction track.

Significance. If the performance attribution holds after proper documentation, the work would illustrate the utility of structured memory and adaptive communication for AI reasoning under uncertainty and deception, with potential relevance to multi-agent systems. The external competition outcome provides an independent benchmark, though the absence of internal validation limits claims about component contributions or generalizability.

major comments (2)

[Abstract] Abstract: The first-place result is asserted without any evaluation metrics, baselines, win rates, statistical details, or ablation studies. This is load-bearing for the central claim that the multi-module additions (memory profiling, social-graph analysis, dynamic tone) drove the outcome, as success could arise from unstated factors such as rule-specific tuning or variance.
[Evaluation / Results (implied)] No section provides quantitative comparisons between the final multi-module agent and the initial two-stage system, or against other competition entrants. Without such data, the attribution of performance gains to the specific modules cannot be evaluated.

minor comments (2)

[Title / Abstract] The title refers to 'Revac' while the abstract uses 'Revac-8'; standardize the agent name and clarify versioning.
[Abstract] The abstract mentions 'high-stakes social environments' without defining the competition rules or game parameters, which would aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on evaluation and attribution. We address each major point below and will revise the manuscript to incorporate additional discussion of limitations and available context where feasible.

Referee: [Abstract] Abstract: The first-place result is asserted without any evaluation metrics, baselines, win rates, statistical details, or ablation studies. This is load-bearing for the central claim that the multi-module additions (memory profiling, social-graph analysis, dynamic tone) drove the outcome, as success could arise from unstated factors such as rule-specific tuning or variance.

Authors: We agree that the abstract would be strengthened by more context. The first-place result is from the MindGames Arena Social Deduction track, which functions as an external benchmark. However, the competition does not release detailed per-round win rates, statistical tests, or entrant baselines in a form suitable for inclusion. We will revise the abstract to describe the competition format and ranking more precisely while moderating language on module contributions to avoid implying isolated causal effects. This acknowledges that unstated factors could contribute to the outcome.

revision: partial

Referee: [Evaluation / Results (implied)] No section provides quantitative comparisons between the final multi-module agent and the initial two-stage system, or against other competition entrants. Without such data, the attribution of performance gains to the specific modules cannot be evaluated.

Authors: We acknowledge the absence of such comparisons. The agent evolved iteratively during the competition, and no formal ablation studies or controlled experiments against the two-stage prototype were conducted due to time constraints. Detailed performance data from other entrants is limited to final rankings. We will add a dedicated subsection discussing the development stages, qualitative observations from testing, and explicit limitations on attributing gains to individual modules (memory profiling, social-graph analysis, dynamic tone). This will clarify that the competition outcome supports the overall architecture but does not enable component-level evaluation.

revision: partial

Circularity Check

0 steps flagged

No significant circularity: external competition result anchors the claim

full rationale

The manuscript describes an empirical agent design process for Revac-8 in a social deduction competition, noting its evolution from a two-stage system to a multi-module architecture with memory profiling, social-graph analysis, and tone selection, culminating in a first-place finish. No equations, first-principles derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided text. The central performance claim is tied directly to an external, independently verifiable competition outcome rather than any internal reduction or ansatz smuggled through prior work, leaving the derivation chain self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The work is an empirical engineering project; the abstract contains no mathematical derivations, fitted parameters, axioms, or postulated entities.

Reference graph

Works this paper leans on

10 extracted references · 10 canonical work pages

[1] Fujio Toriumi, Hirotaka Osawa, Michimasa Inaba, Daisuke Katagami, Kosuke Shinoda, and Hitoshi Matsubara. AI Wolf Contest: Development of Game AI Using Collective Intelligence. In Computer Games, pages 101–115. Springer, 2017.

[2] Nicolò Brandizzi, Davide Grossi, and Luca Iocchi. RLupus: Cooperation through emergent communication in the Werewolf social deduction game. Intelligenza Artificiale, 16(1):3–21, 2022.

[3] Byunghwa Yoo and Kyung-Joong Kim. Finding deceivers in social context with large language models and how to find them: the case of the Mafia game. Scientific Reports, 14, 2024.

[4] Zhiyang Qi and Michimasa Inaba. Enhancing Dialogue Generation in Werewolf Game Through Situation Analysis and Persuasion Strategies. In Proceedings of the 2nd International AIWolfDial Workshop, pages 30–39. Association for Computational Linguistics, 2024.

[5] Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. ReAct: Synergizing Reasoning and Acting in Language Models. In The Eleventh International Conference on Learning Representations (ICLR), 2023.

[6] Joon Sung Park, Joseph C. O'Brien, Carrie J. Cai, Meredith Ringel Morris, Percy Liang, and Michael S. Bernstein. Generative Agents: Interactive Simulacra of Human Behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, pages 1–22, 2023.

[7] Huao Li, Yu Quan Chong, Simon Stepputtis, Joseph Campbell, Dana Hughes, Michael Lewis, and Katia Sycara. Theory of Mind for Multi-Agent Collaboration via Large Language Models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 1961–1979, 2023.

[8] Bidipta Sarkar, Warren Xia, C. Karen Liu, and Dorsa Sadigh. Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning. In Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 1830–1839, 2025.

[9] Hisaichi Shibata, Soichiro Miki, and Yuta Nakamura. Playing the Werewolf game with artificial intelligence for language understanding. arXiv preprint arXiv:2302.10646, 2023.

[10] Andrzej Liapis, Georgios N. Yannakakis, and Julian Togelius. Games for Artificial Intelligence Research: A Review and Framework. Artificial Intelligence Review, 57, 2024.
