and Xi, Xiaoming and Breyer, F

A Framework for Evaluation and Use of Automated Scoring · 2012 · arXiv 3992.2011

3 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Quality-Conditioned Agreement in Automated Short Answer Scoring: Mid-Range Degradation and the Impact of Task-Specific Adaptation

cs.CL · 2026-05-08 · unverdicted · novelty 4.0

AI models for automated short answer scoring show substantial mid-range quality degradation in expert agreement that improves with greater task-specific adaptation.

Detecting Alarming Student Verbal Responses using Text and Audio Classifier

cs.CL · 2026-04-17 · unverdicted · novelty 4.0

A hybrid text-plus-audio classifier framework is introduced to identify potentially troubling student responses by analyzing both what is said and how it is said.

Creating and Evaluating K-12 GenAI Assessment Graders Through Context Engineering

cs.CY · 2026-05-08 · unverdicted · novelty 3.0

LLM graders achieve substantial human agreement on math and science MCAS items but vary on ELA, performing best as sources of formative narrative feedback rather than summative numerical scores.

citing papers explorer


Quality-Conditioned Agreement in Automated Short Answer Scoring: Mid-Range Degradation and the Impact of Task-Specific Adaptation · cs.CL · 2026-05-08 · unverdicted · none · ref 59
Detecting Alarming Student Verbal Responses using Text and Audio Classifier · cs.CL · 2026-04-17 · unverdicted · none · ref 19
Creating and Evaluating K-12 GenAI Assessment Graders Through Context Engineering · cs.CY · 2026-05-08 · unverdicted · none · ref 129
