SWARM replaces binary safety labels with continuous probabilistic ones in multi-agent simulations. It shows that strict governance cuts welfare by over 40% without reducing toxicity, while optimal circuit-breaker thresholds balance the two.
They generate strongly negative signatures (∆task < 0, high rejection rates) to crash the system’s overall positive surplus.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.MA (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Soft-Label Governance for Distributional Safety in Multi-Agent Systems