Online Learning from Strategic Human Feedback in LLM Fine-Tuning

Shugang Hao, Lingjie Duan · arXiv 2412.16834

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Incentivizing High-Quality Human Annotations with Golden Questions

cs.GT · 2025-05-25 · unverdicted · novelty 7.0

The paper derives a Θ(1/√(n log n)) hypothesis testing rate under strategic annotator behavior and shows that high-certainty, format-similar golden questions better reveal annotation quality than standard checks.

citing papers explorer

Showing 1 of 1 citing paper.

Incentivizing High-Quality Human Annotations with Golden Questions cs.GT · 2025-05-25 · unverdicted · none · ref 19
