Derives dataset-specific theoretical and human-like QWK ceilings for AES from classical test theory reliability, showing that human-human QWK often underestimates the true achievable limit.
Psychological Methods 1(1), 30–46 (1996)
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Has Automated Essay Scoring Reached Sufficient Accuracy? Deriving Achievable QWK Ceilings from Classical Test Theory