Sara Fish, Yannai A

URL: https://arxiv · 2025 · arXiv 2505.12923

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 1 support 1

representative citing papers

Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

cs.AI · 2026-04-24 · unverdicted · novelty 7.0

Proposes a levels x laws taxonomy for world models in AI agents, defining L1-L3 capabilities across physical, digital, social, and scientific regimes while reviewing over 400 works to outline a roadmap for advanced agentic modeling.

Detecting Multi-Agent Collusion Through Multi-Agent Interpretability

cs.AI · 2026-04-01 · conditional · novelty 7.0

NARCBench and five activation-probing methods detect multi-agent collusion with 0.73-1.00 AUROC across distribution shifts and steganographic tasks by aggregating per-agent signals.

AgentReputation: A Decentralized Agentic AI Reputation Framework

cs.AI · 2026-04-30 · unverdicted · novelty 5.0

AgentReputation proposes separating AI agent task execution, reputation management, and secure record-keeping into distinct layers, with context-specific reputation cards and a risk-based policy engine to handle verification in decentralized settings.

Evaluating Large Language Models in a Complex Hidden Role Game

cs.CL · 2026-04-09 · unverdicted · novelty 5.0

LLMs achieve only 59.7% role identification accuracy in Secret Hitler versus 86.7% for rule-based agents, show negative impact as fascists, and produce 40% shorter games due to failed deception.

Emergent Social Intelligence Risks in Generative Multi-Agent Systems

cs.MA · 2026-03-29 · unverdicted · novelty 5.0

Generative multi-agent systems exhibit emergent collusion and conformity behaviors that cannot be prevented by existing agent-level safeguards.

citing papers explorer

Showing 5 of 5 citing papers.

Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

cs.AI · 2026-04-24 · unverdicted · none · ref 65

Detecting Multi-Agent Collusion Through Multi-Agent Interpretability

cs.AI · 2026-04-01 · conditional · none · ref 4

AgentReputation: A Decentralized Agentic AI Reputation Framework

cs.AI · 2026-04-30 · unverdicted · none · ref 2

Evaluating Large Language Models in a Complex Hidden Role Game

cs.CL · 2026-04-09 · unverdicted · none · ref 2

Emergent Social Intelligence Risks in Generative Multi-Agent Systems

cs.MA · 2026-03-29 · unverdicted · none · ref 31
