RARM is a lightweight visual comparator trained once on general videos that supplies dense progress rewards to RL by matching rollout clips to a reference demonstration and gating rewards on match confidence.
arXiv preprint arXiv:2502.20630
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.RO 3
years
2026 3
verdicts
UNVERDICTED 3
representative citing papers
SARM2 presents a multi-task stage-aware reward model (RM) that achieves 80% lower value-estimation MSE and, when used in SPIRAL, boosts manipulation task success from ~50% to near-perfect on several benchmarks.
GTP-FA is a grasp-then-plan framework with failure attribution that diagnoses errors to optimize grasping priors and planning data collection, raising success rates across RL, IL, diffusion, and VLA methods in simulation and real robots.
citing papers explorer
- RARM: Confidence-Gated Progress Reward Modeling for RL in Manipulation
- SARM2: Multi-Task Stage Aware Reward Modeling for Self Improving Robotic Manipulation
- Grasp-Then-Plan with Failure Attribution: A Closed Two-Stage Framework for Precise and Generalizable Robotic Manipulation