Pre-Pilot Optimization of Conversation-Based Assessment Items Using Synthetic Response Data

Tyler Burleigh, Jing Chen, Kristen DiCerbo · 2025

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Do Small Language Models Know When They're Wrong? Confidence-Based Cascade Scoring for Educational Assessment

cs.CY · 2026-03-29 · unverdicted · novelty 4.0

Verbalized confidence from small LMs enables cost-effective cascade routing for automated educational scoring, matching large-model accuracy at 76% lower cost when discrimination is strong.

citing papers explorer

Showing 1 of 1 citing paper.

Do Small Language Models Know When They're Wrong? Confidence-Based Cascade Scoring for Educational Assessment

cs.CY · 2026-03-29 · unverdicted · none · ref 16
