2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: unverdicted (2)
Representative citing papers
Transition embeddings from pretrained algebraic encoders combined with SimCSE produce problem-invariant representations of student solution strategies that encode meaningful information and correlate with short- and long-term learning outcomes.
Citing papers explorer
- PyraMathBench: Evaluating and Improving Mathematical Capability in Large Language Models
  PyraMathBench reveals LLMs' weaknesses in numerical computation for math tasks, and SOLVE/IRPO training delivers a 5.0-point gain on Qwen-2.5.
- Towards Generalizable Representations of Mathematical Strategies
  Transition embeddings from pretrained algebraic encoders combined with SimCSE produce problem-invariant representations of student solution strategies that encode meaningful information and correlate with short- and long-term learning outcomes.