ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: physics.soc-ph (1)
Years: 2026 (1)
Verdicts: ACCEPT (1)
Representative citing papers:
Stop Drawing Scientific Claims from LLM Social Simulations Without Robustness Audits
Minor perturbations in persona format, instruction framing, and network structure shift cooperation by up to 76 percentage points and consistently shift polarization metrics, showing that LLM social simulations require per-claim robustness audits using the new TRAILS taxonomy.