Derives dataset-specific theoretical and human-like QWK ceilings for AES from classical test theory reliability, showing that human-human QWK often underestimates the true achievable limit.
In: Proceedings of the International Conference on Artificial Intelligence in Education
Cited by 1 Pith paper (field: cs.AI; year: 2026):
Has Automated Essay Scoring Reached Sufficient Accuracy? Deriving Achievable QWK Ceilings from Classical Test Theory