Innovating Assessment with Conversational Agents: A Technology-Enhanced Approach to Formative Assessments

Dai, Wei, Lin, Jionghao, Jin, Hua, Li, Tongguang, Tsai, Yi-Shan, Gašević, Dragan · 2023 · arXiv 8122.2023

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Do Small Language Models Know When They're Wrong? Confidence-Based Cascade Scoring for Educational Assessment

cs.CY · 2026-03-29 · unverdicted · novelty 4.0

Verbalized confidence from small LMs enables cost-effective cascade routing for automated educational scoring, matching large-model accuracy at 76% lower cost when discrimination is strong.

Creating and Evaluating K-12 GenAI Assessment Graders Through Context Engineering

cs.CY · 2026-05-08 · unverdicted · novelty 3.0

LLM graders achieve substantial human agreement on math and science MCAS items but vary on ELA, performing best as sources of formative narrative feedback rather than summative numerical scores.

citing papers explorer


Do Small Language Models Know When They're Wrong? Confidence-Based Cascade Scoring for Educational Assessment

cs.CY · 2026-03-29 · unverdicted · none · ref 17

Creating and Evaluating K-12 GenAI Assessment Graders Through Context Engineering

cs.CY · 2026-05-08 · unverdicted · none · ref 224
