MathTutorBench: A benchmark for measuring open-ended pedagogical capabilities of LLM tutors

Jakub Macina, Nico Daheim, Ido Hakimi, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan · 2025

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Are Agents Ready to Teach? A Multi-Stage Benchmark for Real-World Teaching Workflows

cs.AI · 2026-05-14 · unverdicted · novelty 6.0

EduAgentBench is a new source-grounded benchmark that evaluates tutor agents across pedagogical judgment, situated multi-turn tutoring, and Canvas-style workflow completion, finding frontier models capable of basic judgment but inadequate for professional teaching standards.

citing papers explorer

Showing 1 of 1 citing paper.

Are Agents Ready to Teach? A Multi-Stage Benchmark for Real-World Teaching Workflows

cs.AI · 2026-05-14 · unverdicted · none · ref 9
