Pre-Pilot Optimization of Conversation-Based Assessment Items Using Synthetic Response Data
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CY (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Do Small Language Models Know When They're Wrong? Confidence-Based Cascade Scoring for Educational Assessment
Verbalized confidence from small LMs enables cost-effective cascade routing for automated educational scoring, matching large-model accuracy at 76% lower cost when discrimination is strong.