Introduces PeerMathDial, the first authentic middle school peer CPS dialogue dataset with 55 dialogues and 6,406 turns, a corpus-grounded dialogue act taxonomy, and three demonstrated use cases.
arXiv preprint arXiv:2504.06460
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026: 3
verdicts
UNVERDICTED: 3
roles
background: 1
polarities
support: 1
representative citing papers
Thinking mode in Qwen3 models improves class-level performance on planning constraints but worsens precision constraints in IFEval, with 10-20% prompt-level flips and directional consistency in Hunyuan models.
Literature on system prompts for AI shows fragmented and contradictory claims that complicate policy efforts to use them as reliable governance mechanisms.
citing papers explorer
- PeerMathDial: A Middle School Dialogue Dataset for Student Collaborative Math Problem Solving
  Introduces PeerMathDial, the first authentic middle school peer CPS dialogue dataset with 55 dialogues and 6,406 turns, a corpus-grounded dialogue act taxonomy, and three demonstrated use cases.
- When Built-in Thinking Helps and Hurts: Constraint-Level Error Shifts in Instruction Following
  Thinking mode in Qwen3 models improves class-level performance on planning constraints but worsens precision constraints in IFEval, with 10-20% prompt-level flips and directional consistency in Hunyuan models.