Paired evaluation shows 26.5% decision instability for code-mixed inputs, with review rates rising from 0.138 to 0.297 and non-hate false-flag rates from 0.069 to 0.104.
Classification with abstention but without disparities
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SE (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
When Surface Form Changes Moderation Decisions: A Paired Study of Code-Mixed Workflow Instability