ConvAbuse: Data, Analysis, and Benchmarks for Nuanced Abuse Detection in Conversational AI
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
background 1
citation-polarity summary
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: background (1)
polarities: background (1)
representative citing papers
STABLEVAL produces stable AI system rankings by modeling latent correctness and annotator confusion rather than majority vote aggregation.
citing papers explorer
- Beyond Majority Voting: Agreement-Based Clustering to Model Annotator Perspectives in Subjective NLP Tasks
  Agreement-based clustering of annotators improves performance on subjective NLP tasks by capturing diverse perspectives better than majority voting or per-annotator modeling.
- STABLEVAL: Disagreement-Aware and Stable Evaluation of AI Systems
  STABLEVAL produces stable AI system rankings by modeling latent correctness and annotator confusion rather than majority vote aggregation.