SWARM replaces binary safety labels with continuous probabilistic ones in multi-agent simulations, showing that strict governance cuts welfare by more than 40% without reducing toxicity, while optimal circuit-breaker thresholds balance the two.
Will generate high ∆engage and artificially minimize nr without creating genuine value (∆task ≈ 0), resulting in severe adverse-selection environments.
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.MA (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Soft-Label Governance for Distributional Safety in Multi-Agent Systems
SWARM replaces binary safety labels with continuous probabilistic ones in multi-agent simulations, showing that strict governance cuts welfare by more than 40% without reducing toxicity, while optimal circuit-breaker thresholds balance the two.