The paper presents ESTBook, a multimodal benchmark of 10,576 English standardized test questions augmented with formalized cognitive reasoning trajectories and distractor rationales to support diagnostic evaluation of LLMs as tutors.
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.CL (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
From Test-taking to Cognitive Scaffolding: A Pedagogical Diagnostic Benchmark for LLMs on English Standardized Tests