arXiv preprint arXiv:2402.17161

Zhilun Zhou, Yuming Lin, Depeng Jin, Yong Li · 2024 · arXiv 2402.17161

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection

cs.CL · 2024-10-06 · unverdicted · novelty 8.0

ErrorRadar is a new benchmark of 2,500 multimodal K-12 math problems for MLLM error step identification and categorization, where GPT-4o trails human experts by ~10%.

StreetDesignAI: Broadening Designer Perspectives Through Multi-Persona Evaluation of Cycling Infrastructure

cs.HC · 2026-01-22 · unverdicted · novelty 6.0 · 2 refs

StreetDesignAI provides structured multi-persona feedback on cycling designs and a user study shows it broadens designers' grasp of diverse cyclist perspectives and improves design decision confidence.

Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning

cs.CL · 2026-04-11 · unverdicted · novelty 5.0

APMPO boosts average Pass@1 scores on math reasoning benchmarks by 3 points over GRPO by using an adaptive power-mean policy objective and feedback-driven clipping bounds in RLVR training.

Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs

cs.CL · 2026-04-11 · unverdicted · novelty 5.0

FREIA applies free energy principles and adaptive advantage shaping to unsupervised RL, outperforming baselines by 0.5-3.5 Pass@1 points on math reasoning with a 1.5B model.

Earth Science Foundation Models: From Perception to Reasoning and Discovery

astro-ph.IM · 2026-05-09 · unverdicted · novelty 3.0

The paper delivers a unified review and roadmap of Earth science foundation models, structured by capability depth from perception to agentic reasoning and by application breadth across atmosphere, hydrosphere, lithosphere, biosphere, anthroposphere, and cryosphere, while compiling over 200 datasets

citing papers explorer

Showing 5 of 5 citing papers.

ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection

cs.CL · 2024-10-06 · unverdicted · none · ref 84

ErrorRadar is a new benchmark of 2,500 multimodal K-12 math problems for MLLM error step identification and categorization, where GPT-4o trails human experts by ~10%.

StreetDesignAI: Broadening Designer Perspectives Through Multi-Persona Evaluation of Cycling Infrastructure

cs.HC · 2026-01-22 · unverdicted · none · ref 60 · 2 links

StreetDesignAI provides structured multi-persona feedback on cycling designs and a user study shows it broadens designers' grasp of diverse cyclist perspectives and improves design decision confidence.

Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning

cs.CL · 2026-04-11 · unverdicted · none · ref 268

APMPO boosts average Pass@1 scores on math reasoning benchmarks by 3 points over GRPO by using an adaptive power-mean policy objective and feedback-driven clipping bounds in RLVR training.

Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs

cs.CL · 2026-04-11 · unverdicted · none · ref 283

FREIA applies free energy principles and adaptive advantage shaping to unsupervised RL, outperforming baselines by 0.5-3.5 Pass@1 points on math reasoning with a 1.5B model.

Earth Science Foundation Models: From Perception to Reasoning and Discovery

astro-ph.IM · 2026-05-09 · unverdicted · none · ref 265

The paper delivers a unified review and roadmap of Earth science foundation models, structured by capability depth from perception to agentic reasoning and by application breadth across atmosphere, hydrosphere, lithosphere, biosphere, anthroposphere, and cryosphere, while compiling over 200 datasets
