AgentSim: A Platform for Verifiable Agent-Trace Simulation

2026 · cs.IR · arXiv 2604.26653

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Training trustworthy agentic LLMs requires data that shows the grounded reasoning process, not just the final answer. Existing datasets fall short: question-answering data is outcome-only, chain-of-thought data is not tied to specific documents, and web-agent datasets track interface actions rather than the core retrieval and synthesis steps of a RAG workflow. We introduce AgentSim, an open-source platform for simulating RAG agents. It generates verifiable, stepwise traces of agent reasoning over any document collection. AgentSim uses a policy to ensure the agent widely explores the document set. It combines a multi-model validation pipeline with an active human-in-the-loop process. This approach focuses human effort on difficult steps where models disagree. Using AgentSim, we construct and release the Agent-Trace Corpus (ATC), a large collection of grounded reasoning trajectories spanning three established IR benchmarks. We make three contributions: (1) the AgentSim platform with two mechanisms, Corpus-Aware Seeding and Active Validation, that improve trace diversity and quality; (2) the Agent-Trace Corpus (ATC), over 103,000 verifiable reasoning steps spanning three IR benchmarks, with 100% grounding rate on substantive answers; and (3) a comparative behavioral analysis revealing systematic differences in how state-of-the-art models approach information seeking. Platform, toolkit, and corpus are publicly available.

representative citing papers

PA-User: Simulating Trust and Verification under AI-Generated Content

cs.IR · 2026-06-22 · unverdicted · novelty 6.0

PA-User simulates user trust and verification in AI-generated content scenarios using effort budgets, Beta trust beliefs, and decision rules, showing lower trust-calibration error and regret than ablations on the HC3 corpus.

citing papers explorer

PA-User: Simulating Trust and Verification under AI-Generated Content

cs.IR · 2026-06-22 · unverdicted · ref 21 · internal anchor
