Tim-prm: Verifying multimodal reasoning with tool-integrated prm. arXiv preprint arXiv:2511.22998, 2025.
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.AI (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
StainFlow proposes global entity stain tracking and local stain evidence linking modules to improve process rewards for GUI agents, reporting 3.2% relative gain in online RL success and 1.8% in judgment accuracy on AndroidWorld and OGRBench.
citing papers explorer
- SCI-PRM: A Tool Aware Process Reward Model for Scientific Reasoning Verification
  Sci-PRM is a tool-aware process reward model trained on the SCIPRM70K dataset to provide fine-grained supervision for scientific reasoning and shown to boost foundation models via Best-of-N selection and RL.
- StainFlow: Entity-Stain Tracking and Evidence Linking for Process Rewards in GUI Agents
  StainFlow proposes global entity stain tracking and local stain evidence linking modules to improve process rewards for GUI agents, reporting 3.2% relative gain in online RL success and 1.8% in judgment accuracy on AndroidWorld and OGRBench.