Softer label and rationale representations outperform hard ones on predictive, distributional, plausibility, faithfulness, and complexity metrics when models are re-implemented across representation spaces in hate speech detection.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- Disagreeing Rationales: Rethinking Classification and Explainability Evaluation in Hate Speech Detection