SWARM replaces binary safety labels with continuous probabilistic ones in multi-agent simulations, showing that strict governance cuts welfare over 40% without reducing toxicity while optimal circuit-breaker thresholds balance the two.
If the threshold is θ_CB, they mathematically scale random generation variables until p ≈ θ_CB + ϵ.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.MA (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Soft-Label Governance for Distributional Safety in Multi-Agent Systems