An empirical study finds that human and LLM judgments on pedagogical quality of pretest questions disagree systematically, with rubric operationalization affecting alignment more than evaluation mode.
and Rahman, Shuhebur and Perkins, Kyle , title =
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.HC (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Why Machines Misread Pedagogical Quality: Human-Machine Alignment in LLM-Based Pretest Question Evaluation