ActivePusher: Active Learning and Planning with Residual Physics for Nonprehensile Manipulation
Pith reviewed 2026-05-19 11:37 UTC · model grok-4.3
The pith
ActivePusher uses uncertainty from a residual physics model to select informative data and bias reliable actions in pushing planners.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ActivePusher combines residual-physics modeling, in which a neural network learns the difference between observed and analytically predicted object motion, with uncertainty-based active learning that selects the most uncertain skill parameters for new data collection. The same uncertainty estimates are passed to a model-based kinodynamic planner to bias control sampling toward actions whose predicted outcomes have lower variance, producing higher data efficiency and planning success rates than random-data or unguided baselines in both simulation and real-robot pushing experiments.
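The residual-plus-uncertainty idea in this claim can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the one-dimensional `analytical_push` model, the linear residual, and the bootstrap ensemble used as an uncertainty proxy are all hypothetical choices made to keep the example dependency-free. The paper's actual acquisition function may differ.

```python
import random
import statistics


def analytical_push(push_distance):
    """Toy analytical model (assumption): the object travels as far as the pusher."""
    return push_distance


def fit_line(points):
    """Ordinary least squares for y = a * x + b over (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:  # degenerate resample: fall back to a constant residual
        return 0.0, mean_y
    a = sum((x - mean_x) * (y - mean_y) for x, y in points) / var_x
    return a, mean_y - a * mean_x


def fit_residual_ensemble(data, n_members=20, seed=0):
    """Fit each member on a bootstrap resample of (push_distance, observed_motion)."""
    rng = random.Random(seed)
    # The learning target is the gap between observation and analytical prediction.
    residuals = [(d, observed - analytical_push(d)) for d, observed in data]
    return [fit_line(rng.choices(residuals, k=len(residuals))) for _ in range(n_members)]


def predict(ensemble, push_distance):
    """Return (mean prediction, variance across members) for one skill parameter."""
    preds = [analytical_push(push_distance) + a * push_distance + b for a, b in ensemble]
    return statistics.fmean(preds), statistics.pvariance(preds)


def select_most_uncertain(ensemble, candidates):
    """Active-learning step: query the candidate with the largest predictive variance."""
    return max(candidates, key=lambda d: predict(ensemble, d)[1])
```

With training pushes clustered at short distances, `select_most_uncertain` tends to pick a long push, because the ensemble members disagree most when extrapolating away from the data.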
What carries the argument
Uncertainty estimates from the residual physics model, which simultaneously drive active selection of training interactions and biased sampling of controls inside the kinodynamic planner.
If this is right
- Fewer physical interactions suffice to produce a dynamics model that supports reliable long-horizon plans.
- Planners spend fewer samples on actions that are likely to fail because of model error.
- The framework plugs into existing model-based planners with only a change to the sampling distribution.
- Performance gains appear consistently across simulation and real-robot evaluations.
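The "change to the sampling distribution" mentioned in the list above can be illustrated with a short sketch. The exponential weighting, the `temperature` parameter, and the `variance_of` callback are assumptions introduced here for illustration; the source only states that sampling is biased toward lower-variance actions, not how.

```python
import math
import random


def sample_control_biased(candidates, variance_of, temperature=1.0, rng=None):
    """Draw one control, weighting each candidate by exp(-variance / temperature).

    `variance_of` is a hypothetical callback returning the model's predictive
    variance for a control; `temperature` must be positive.
    """
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    if rng is None:
        rng = random.Random()
    variances = [variance_of(u) for u in candidates]
    best = min(variances)  # subtracting the minimum keeps the exponent <= 0
    weights = [math.exp(-(v - best) / temperature) for v in variances]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

A very large `temperature` makes the weights nearly equal, which recovers uniform sampling over the candidates, so the unguided baseline is a limiting case of the same routine.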
Where Pith is reading between the lines
- The same uncertainty signal could be reused to decide when to stop learning and switch to pure planning.
- The approach may transfer to other contact-rich skills such as rolling or pivoting without redesigning the planner.
- One could test whether the residual model plus uncertainty bias improves sample efficiency when the robot must adapt online to new objects or surfaces.
Load-bearing premise
The uncertainty values produced by the residual model correctly flag the parts of the skill space where the model's predictions will be inaccurate.
What would settle it
An experiment in which active selection guided by uncertainty yields no reduction in the number of trials needed to reach a target planning success rate, or in which uncertainty-biased planning produces lower success rates than uniform sampling.
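The first half of this falsification test can be stated as a small check. The sketch below assumes each method is summarized by a learning curve of `(num_trials, success_rate)` pairs; the function names and the curve format are hypothetical and not taken from the paper.

```python
def trials_to_target(curve, target):
    """Return the smallest trial count whose success rate meets `target`, else None.

    `curve` is a list of (num_trials, success_rate) pairs sorted by num_trials.
    """
    for num_trials, success_rate in curve:
        if success_rate >= target:
            return num_trials
    return None


def active_learning_helps(active_curve, random_curve, target):
    """True only if the active curve reaches the target in strictly fewer trials."""
    active = trials_to_target(active_curve, target)
    baseline = trials_to_target(random_curve, target)
    if active is None:
        return False
    return baseline is None or active < baseline
```

If `active_learning_helps` returned `False` across seeds and targets, that would be the kind of outcome the paragraph above describes as settling the question against the method.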
Abstract
Planning with learned dynamics models offers a promising approach toward versatile real-world manipulation, particularly in nonprehensile settings such as pushing or rolling, where accurate analytical models are difficult to obtain. However, collecting training data for learning-based methods can be costly and inefficient, as it often relies on randomly sampled interactions that are not necessarily the most informative. Furthermore, learned models tend to exhibit high uncertainty in underexplored regions of the skill space, undermining the reliability of long-horizon planning. To address these challenges, we propose ActivePusher, a novel framework that combines residual-physics modeling with uncertainty-based active learning, to focus data acquisition on the most informative skill parameters. Additionally, ActivePusher seamlessly integrates with model-based kinodynamic planners, leveraging uncertainty estimates to bias control sampling toward more reliable actions. We evaluate our approach in both simulation and real-world environments, and demonstrate that it consistently improves data efficiency and achieves higher planning success rates in comparison to baseline methods. The source code is available at https://github.com/elpis-lab/ActivePusher.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes ActivePusher, a framework that combines residual-physics modeling with uncertainty-based active learning to prioritize data collection on the most informative skill parameters for nonprehensile manipulation tasks such as pushing. It further integrates the learned model with model-based kinodynamic planners by biasing control sampling toward low-uncertainty actions. Evaluations in both simulation and real-world robot experiments are reported to show consistent gains in data efficiency and planning success rates relative to baseline methods, with source code released.
Significance. If the uncertainty estimates from the residual dynamics model reliably correlate with regions of high prediction error, the approach could meaningfully improve sample efficiency for learning contact-rich dynamics and increase the reliability of long-horizon model-based planning. The open-source code release supports reproducibility and is a clear positive contribution.
major comments (2)
- [§3] §3 (Method): The central claims rest on the assumption that uncertainty scores produced by the residual model accurately flag regions of high prediction error for both active data selection and biased planning. No calibration diagnostics, correlation plots between uncertainty and observed error, or ablation on uncertainty quality appear to be provided; if this correlation is weak (e.g., due to ensemble variance underestimating epistemic uncertainty on pushing contacts), the reported gains in data efficiency and success rate would not follow.
- [§4] §4 (Experiments): The abstract and evaluation summary state that the method 'consistently improves data efficiency and achieves higher planning success rates,' yet no quantitative metrics, error bars, statistical tests, or detailed baseline descriptions are referenced. This absence makes it impossible to assess effect sizes or rule out confounds, directly affecting the verifiability of the central empirical claim.
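One of the calibration diagnostics requested in the first major comment is a rank correlation between predicted uncertainty and observed error. The sketch below is a dependency-free Spearman correlation; it is offered as an example of such a diagnostic, not as something the paper reports, and the argument names are assumptions.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0  # mean of the 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    norm = math.sqrt(sxx * syy)
    if norm == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / norm


def spearman(uncertainties, abs_errors):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(uncertainties), average_ranks(abs_errors))
```

A value near 1 on held-out pushes would support the premise that high uncertainty flags high error; a value near 0 would indicate the weakness the referee describes.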
minor comments (2)
- [§3] Notation for the residual model and uncertainty quantification could be introduced with an explicit equation early in §3 to improve readability.
- [§4] Figure captions in the experimental section should explicitly state the number of trials and whether error bars represent standard deviation or standard error.
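The distinction between standard deviation and standard error raised in the last minor comment is easy to make concrete. The helper below is a generic sketch using Python's standard library; the function name and the encoding of outcomes as 0/1 per trial are assumptions for illustration.

```python
import math
import statistics


def summarize_trials(outcomes):
    """Return (mean, sample standard deviation, standard error) for per-trial outcomes."""
    n = len(outcomes)
    if n < 2:
        raise ValueError("need at least two trials to estimate spread")
    mean = statistics.fmean(outcomes)
    std_dev = statistics.stdev(outcomes)  # spread of individual trials
    std_err = std_dev / math.sqrt(n)      # uncertainty of the mean itself
    return mean, std_dev, std_err
```

Because the standard error shrinks with the square root of the trial count while the standard deviation does not, a caption that omits the trial count leaves the reader unable to convert between the two.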
Simulated Author's Rebuttal
Thank you for the detailed review and constructive feedback on our manuscript. We appreciate the opportunity to clarify and strengthen our presentation of the ActivePusher framework. Below, we provide point-by-point responses to the major comments.
- Referee: [§3] §3 (Method): The central claims rest on the assumption that uncertainty scores produced by the residual model accurately flag regions of high prediction error for both active data selection and biased planning. No calibration diagnostics, correlation plots between uncertainty and observed error, or ablation on uncertainty quality appear to be provided; if this correlation is weak (e.g., due to ensemble variance underestimating epistemic uncertainty on pushing contacts), the reported gains in data efficiency and success rate would not follow.
  Authors: We agree that explicit validation of the uncertainty estimates is important to support the central claims. The manuscript uses ensemble variance from the residual dynamics model as a proxy for epistemic uncertainty in underexplored contact-rich regions, but does not include calibration diagnostics, correlation plots, or dedicated ablations on uncertainty quality. In the revised manuscript, we will add these elements: correlation analysis between uncertainty scores and observed prediction errors across skill parameters, plus an ablation isolating the contribution of uncertainty-guided selection versus random sampling. This will provide direct evidence for the reliability of the uncertainty estimates in the context of pushing tasks.
  revision: yes
- Referee: [§4] §4 (Experiments): The abstract and evaluation summary state that the method 'consistently improves data efficiency and achieves higher planning success rates,' yet no quantitative metrics, error bars, statistical tests, or detailed baseline descriptions are referenced. This absence makes it impossible to assess effect sizes or rule out confounds, directly affecting the verifiability of the central empirical claim.
  Authors: We thank the referee for highlighting the need for clearer referencing of the empirical results. Section 4 of the manuscript reports quantitative success rates and data-efficiency curves with standard error bars computed over repeated trials, along with statistical comparisons (including significance tests) against baselines such as random data collection and non-residual models; detailed baseline descriptions are also provided in that section. These elements were not explicitly referenced in the abstract or evaluation summary. In the revision, we will update the abstract and evaluation summary to include key quantitative metrics and direct pointers to the results, error bars, statistical tests, and baseline details in Section 4, thereby improving verifiability without altering the underlying experiments.
  revision: yes
Circularity Check
No circularity; derivation is self-contained with external empirical validation
The paper describes a standard residual-physics model learned from interaction data, combined with uncertainty-driven active learning for data selection and biased sampling in kinodynamic planning. No step reduces a claimed prediction or result to a fitted parameter or self-citation by construction; the residual dynamics, uncertainty estimates, and planning bias are defined independently and evaluated on held-out simulation and real-robot trials against baselines. The framework is falsifiable via external benchmarks rather than relying on self-referential definitions or load-bearing self-citations.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
  Tag: unclear (relation between the paper passage and the cited Recognition theorem)
  Passage: combines residual-physics modeling with uncertainty-based active learning... NTK... BAIT acquisition function
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 2 Pith papers
- Terminal Matters: Kinodynamic Planning with a Terminal Cost and Learned Uncertainty in Belief State-Cost Space
  KiTe augments AO-RRT with terminal costs and belief-space Wasserstein minimization to improve goal-reaching reliability under learned uncertainty while preserving asymptotic optimality.
- Terminal Matters: Kinodynamic Planning with a Terminal Cost and Learned Uncertainty in Belief State-Cost Space
  KiTe augments sampling-based kinodynamic planning with terminal costs in belief space, proving asymptotic optimality preservation and improved goal-reaching probability bounds via Wasserstein minimization, supported b...