EduVideoBench is a new KSA-grounded benchmark that evaluates five frontier video generation models and finds substantial gaps in educational validity across knowledge, skills, and attitudes.
4 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (4)
verdicts: UNVERDICTED (4)
representative citing papers
The Tutoring Effectiveness Index (TEI) uses four signals from LLM conversations to select math tutoring responses, raising student improvement rates from 59.0% to 81.9% at N=8 on a frozen DeepSeek-R1-8B model without training or judges.
Training-free prompt optimization methods, including five new education-focused ones, surpass the strongest RL-trained baseline across five conditions on two OOD suites while showing distinct teaching behavior patterns.
Special-R1 combines two-dimensional adaptive prompts and a disability-conditioned Thinking Reward in RL training, lifting persona-aware Fit by 1.65 and SPED Helpfulness by 0.048 on a 690-dialogue test set while staying competitive on an out-of-domain benchmark.
citing papers explorer
- The Tutoring Effectiveness Index: Predicting LLM Math Tutor Quality from Four Conversation Signals
The Tutoring Effectiveness Index (TEI) uses four signals from LLM conversations to select math tutoring responses, raising student improvement rates from 59.0% to 81.9% at N=8 on a frozen DeepSeek-R1-8B model without training or judges.
- Reinforcement Learning for Special Education: Aligning LLM Tutors to Diverse Learners through Disability-Adaptive Training
Special-R1 combines two-dimensional adaptive prompts and a disability-conditioned Thinking Reward in RL training, lifting persona-aware Fit by 1.65 and SPED Helpfulness by 0.048 on a 690-dialogue test set while staying competitive on an out-of-domain benchmark.