pith. sign in

From Noise to Signal to Selbstzweck: Reframing Human Label Variation in the Era of Post-training in NLP

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it
abstract

Human Label Variation (HLV) refers to legitimate disagreement in annotation that reflects the diversity of human perspectives rather than mere error. Long treated in NLP as noise to be eliminated, HLV has only recently been reframed as a signal for improving model robustness. With the rise of large language models (LLMs) and post-training methods such as human feedback-based alignment, the role of HLV has become increasingly consequential. Yet current preference-learning datasets routinely collapse multiple annotations into a single label, flattening diverse perspectives into artificial consensus. Preserving HLV is necessary not only for pluralistic alignment but also for sociotechnical safety evaluation, where model behavior must be assessed in relation to human interaction and societal context. This position paper argues that preserving HLV as an embodiment of human pluralism must be treated as a Selbstzweck, an intrinsic value in itself. We analyze the limitations of existing preference datasets and propose actionable strategies for incorporating HLV into dataset construction to better preserve pluralistic human values.

fields

cs.CL 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

citing papers explorer

Showing 1 of 1 citing paper.

  • LP-Eval: Rubric and Dataset for Measuring the Quality of Legal Proposition Generation cs.CL · 2026-05-19 · unverdicted · none · ref 30 · internal anchor

    LP-Eval is a new expert-co-designed rubric and annotated dataset showing that LLMs mostly produce well-formed legal propositions from EU court decisions, with higher expert-rated quality for established cases and improved LLM-as-judge alignment when using the rubric.