AI short-answer scorers show mid-range quality degradation that lessens with more task-specific adaptation, while human agreement stays stable across the quality spectrum.
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Quality-Conditioned Agreement in Automated Short Answer Scoring: Mid-Range Degradation and the Impact of Task-Specific Adaptation