Analyzing the Effects of Annotator Gender across NLP Tasks
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CL (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- Logic-Regularized Verifier Elicits Reasoning from LLMs
  LOVER creates an unsupervised logic-regularized verifier that reaches 95% of supervised verifier performance on reasoning tasks across 10 datasets.
- Who and What? Using Linguistic Features and Annotator Characteristics to Analyze Annotation Variation
  Large-scale statistical analysis of four harmful language datasets reveals that interactions between annotator characteristics and linguistic cues drive annotation variation, with lexical features and attitudes prominent but patterns varying by dataset.
- Structured Disagreement in Health-Literacy Annotation: Epistemic Stability, Conceptual Difficulty, and Agreement-Stratified Inference
  Disagreement in health-literacy annotations is driven by conceptual task difficulty rather than annotator differences, with social effects varying or reversing by agreement level, making perspectivist modeling necessary.