Recognition: no theorem link
Visibility-Aware Mobile Grasping in Dynamic Environments
Pith reviewed 2026-05-12 02:23 UTC · model grok-4.3
The pith
A unified system integrates whole-body planning with active perception and behavior trees to let mobile robots grasp objects safely in unknown dynamic environments.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that an iterative low-level whole-body planner, paired with velocity-aware active perception, safely navigates unknown dynamic environments while a hierarchical behavior-tree planner adaptively supplies subgoals. Together these components produce success rates of 68.8 percent in unknown static scenes and 58.0 percent in unknown dynamic scenes, along with improved collision avoidance.
What carries the argument
The iterative low-level whole-body planner coupled with velocity-aware active perception, which reduces uncertainty about moving obstacles while the robot advances toward a grasp configuration.
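To make that coupling concrete, here is a minimal, hypothetical sketch of one sense-predict-replan cycle: each obstacle's keep-out disc is inflated by how far it can travel within one planning horizon, and the robot steps toward the goal only if the step stays clear. All names and numbers are illustrative, not taken from the paper's planner.

```python
import math

def inflate_radius(base_radius, speed, horizon):
    # Velocity-aware inflation: grow the keep-out disc by the distance
    # the obstacle can cover within one planning horizon.
    return base_radius + speed * horizon

def replan_step(robot, goal, obstacles, horizon=0.5, step=0.2):
    """One cycle of an iterative replanner (toy 2D model).

    obstacles: list of ((x, y), (vx, vy), radius) tuples.
    Returns the next robot position, or the current one if the
    candidate step would enter an inflated keep-out disc.
    """
    dx, dy = goal[0] - robot[0], goal[1] - robot[1]
    dist = math.hypot(dx, dy)
    if dist < 1e-9:
        return robot  # already at the goal
    s = min(step, dist)  # do not overshoot the goal
    cand = (robot[0] + s * dx / dist, robot[1] + s * dy / dist)
    for (ox, oy), (vx, vy), r in obstacles:
        # Predict the obstacle forward and test the candidate pose
        # against its velocity-inflated disc.
        px, py = ox + vx * horizon, oy + vy * horizon
        keep_out = inflate_radius(r, math.hypot(vx, vy), horizon)
        if math.hypot(cand[0] - px, cand[1] - py) < keep_out:
            return robot  # hold this cycle and replan from fresh data
    return cand
```

Because the inflation uses the obstacle's current velocity estimate, fresh sensor data each cycle directly tightens or loosens the keep-out region, which is the mechanism the claim rests on.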
If this is right
- The robot maintains progress toward the grasp even when new obstacles appear by updating plans from fresh velocity data.
- Behavior trees let the system switch between exploration, grasping, and recovery modes without external intervention when failures occur at runtime.
- Success and safety improve consistently across four hundred randomized simulation trials and carry over to physical deployment on a mobile manipulator.
- The approach avoids the safety failures that arise when perception and motion are decoupled in changing environments.
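The mode-switching behavior in the second bullet can be sketched with a minimal behavior tree: a Selector tries children in order until one succeeds, so a failed grasp attempt automatically falls through to an exploration/recovery branch. Node names and the toy success conditions are hypothetical; the paper's actual tree is richer.

```python
class Selector:
    """Ticks children in order; succeeds on the first child that succeeds."""
    def __init__(self, *children): self.children = children
    def tick(self, state):
        for c in self.children:
            if c.tick(state) == "SUCCESS":
                return "SUCCESS"
        return "FAILURE"

class Sequence:
    """Ticks children in order; fails on the first child that fails."""
    def __init__(self, *children): self.children = children
    def tick(self, state):
        for c in self.children:
            if c.tick(state) != "SUCCESS":
                return "FAILURE"
        return "SUCCESS"

class Action:
    def __init__(self, fn): self.fn = fn
    def tick(self, state): return self.fn(state)

# Toy leaf behaviors (illustrative only).
def target_visible(s): return "SUCCESS" if s.get("seen") else "FAILURE"
def explore(s): s["seen"] = True; return "SUCCESS"  # toy: one look finds it
def grasp(s):
    s["grasped"] = s.get("reachable", True)
    return "SUCCESS" if s["grasped"] else "FAILURE"

root = Selector(
    Sequence(Action(target_visible), Action(grasp)),  # grasp once seen
    Action(explore),                                  # otherwise explore/recover
)
```

Ticking `root` repeatedly reproduces the switching: the first tick fails the grasp branch and explores; the second tick, with the target now visible, proceeds to the grasp, all without external intervention.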
Where Pith is reading between the lines
- The same coupling of velocity-aware sensing and adaptive subgoal generation could support related tasks such as object placement or drawer opening in populated spaces.
- Limits would appear if obstacle speeds exceed what the sensor and replanner can track, suggesting targeted stress tests with faster or occluded movers.
- Replacing explicit velocity estimation with learned motion prediction might further tighten reaction times in highly cluttered scenes.
Load-bearing premise
The planner and sensors can locate and dodge unobserved dynamic obstacles before they reach the robot, which requires adequate sensor range and replanning speed relative to obstacle motion.
What would settle it
Run repeated trials with obstacles that enter the path from outside the current field of view at speeds exceeding the documented replanning rate and measure whether collision frequency rises sharply above the reported baseline.
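The premise behind such a stress test reduces to simple arithmetic: an obstacle entering at sensor range must remain outside the robot's clearance for at least one replanning cycle plus the stopping time. A minimal go/no-go check, with illustrative numbers that are not from the paper:

```python
def can_react(sensor_range_m, obstacle_speed_mps, robot_clearance_m,
              replan_hz, stop_time_s):
    # Time until the obstacle closes the gap between sensor range
    # and the robot's required clearance.
    time_to_contact = (sensor_range_m - robot_clearance_m) / obstacle_speed_mps
    # Conservative reaction time: one full replanning cycle plus stopping.
    reaction_time = 1.0 / replan_hz + stop_time_s
    return time_to_contact > reaction_time
```

With a 3 m sensing radius, 0.5 m clearance, 10 Hz replanning, and 0.5 s stopping time, a 1 m/s mover is handled comfortably, while a 6 m/s mover is not; collision frequency should rise sharply near that crossover.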
read the original abstract
This paper addresses the problem of mobile grasping in dynamic, unknown environments where a robot must operate under a limited field-of-view. The fundamental challenge is the inherent trade-off between "seeing" around to reduce environmental uncertainty and "moving" the body to achieve task progress in a high-dimensional configuration space, subject to visibility constraints. Previous approaches often assume known or static environments and decouple these objectives, failing to guarantee safety when unobserved dynamic obstacles intersect the robot's path during manipulation. In this paper, we propose a unified mobile grasping system comprising two core components: (1) an iterative low-level whole-body planner coupled with velocity-aware active perception to navigate dynamic environments safely; and (2) a hierarchical high-level planner based on behavior trees that adaptively generates subgoals to guide the robot through exploration and runtime failures. We provide experimental results across 400 randomized simulation scenarios and real-world deployment on a Fetch mobile manipulator. Results show that our system achieves a success rate of 68.8% and 58.0% in unknown static and dynamic environments, respectively, significantly boosting success rates by 22.8% and 18.0% over the NAM approach in both unknown static and dynamic environments, with improved collision safety.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a unified mobile grasping system for dynamic unknown environments with limited field-of-view. It integrates (1) an iterative low-level whole-body planner coupled with velocity-aware active perception for safe navigation and (2) a hierarchical high-level behavior-tree planner that generates adaptive subgoals. Experiments across 400 randomized simulation scenarios report success rates of 68.8% (unknown static) and 58.0% (unknown dynamic), with gains of 22.8% and 18.0% over the NAM baseline, plus real-world validation on a Fetch manipulator.
Significance. If the empirical improvements and collision-safety gains hold under broader conditions, the work could advance practical mobile manipulation in unstructured settings by explicitly addressing the visibility-motion trade-off. The scale of 400 randomized trials and real-robot deployment are positive empirical strengths.
major comments (2)
- §3 (low-level planner): The central safety claim—that velocity-aware active perception plus iterative replanning can detect and avoid unobserved dynamic obstacles in time—is load-bearing for the 'safely' and 'improved collision safety' assertions, yet the manuscript supplies no quantitative bounds on obstacle speeds, sensor range, prediction horizon, or worst-case replanning latency. The 58.0% dynamic success rate therefore does not confirm that the assumption holds beyond the tested scenarios.
- §4 (experiments): The 18.0% improvement over the NAM baseline cannot be fully assessed because the baseline implementation, dynamic-obstacle motion models (speeds, trajectories, densities), and any statistical significance tests are not described. This weakens the cross-condition claims.
minor comments (2)
- Abstract: the acronym 'NAM' is undefined; expand it on first use.
- Results tables/figures: report standard deviations or confidence intervals alongside the success-rate percentages to allow readers to judge the reliability of the reported gains.
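The kind of interval the second minor comment asks for is straightforward to compute; a minimal sketch using the Wilson score interval, where the 200-trials-per-condition split is an assumption (the paper reports 400 scenarios over two conditions without stating the split):

```python
import math

def wilson_ci(successes, n, z=1.96):
    """95% Wilson score interval for a binomial success rate."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return center - half, center + half

# Assumed counts: roughly 68.8% of 200 trials (illustrative only).
low, high = wilson_ci(138, 200)
```

For 138/200 this gives an interval of roughly (0.62, 0.75), which is the kind of spread readers need to judge whether a reported 22.8-point gain is reliable.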
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the safety analysis and experimental details. We address each major comment below and will make the indicated revisions to improve the manuscript.
read point-by-point responses
-
Referee: [§3 (low-level planner)] The central safety claim—that velocity-aware active perception plus iterative replanning can detect and avoid unobserved dynamic obstacles in time—is load-bearing for the 'safely' and 'improved collision safety' assertions, yet the manuscript supplies no quantitative bounds on obstacle speeds, sensor range, prediction horizon, or worst-case replanning latency. The 58.0% dynamic success rate therefore does not confirm that the assumption holds beyond the tested scenarios.
Authors: We agree that the manuscript would benefit from explicit quantitative bounds to support the safety claims. While the 400 randomized trials provide empirical evidence of improved collision safety in the tested dynamic scenarios, we acknowledge that bounds on obstacle speeds, sensor range, prediction horizon, and replanning latency were not stated. In the revision, we will add a new paragraph to §3.2 that supplies these bounds based on the system parameters (e.g., camera field-of-view and update rate, planner frequency) together with a worst-case timing analysis showing timely detection and avoidance for the obstacle velocities used in simulation. This will clarify the scope of the claims without overstating generality beyond the evaluated conditions. revision: yes
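The promised worst-case timing analysis amounts to summing the latency budget and dividing it into the sensing margin. A minimal sketch with hypothetical rates (camera rate, planner frequency, and execution latency are assumptions, not the paper's parameters):

```python
def worst_case_reaction_s(camera_hz, planner_hz, exec_latency_s):
    # Worst case: the obstacle appears just after a camera frame, so the
    # system waits a full sensing period, then a full planning cycle,
    # then the execution latency before avoidance motion begins.
    return 1.0 / camera_hz + 1.0 / planner_hz + exec_latency_s

def max_trackable_speed_mps(sensor_range_m, clearance_m, reaction_s):
    # Fastest obstacle the loop can handle: it must not close the gap
    # between sensor range and required clearance within the reaction time.
    return (sensor_range_m - clearance_m) / reaction_s
```

For example, a 30 Hz camera, 10 Hz planner, and 0.1 s execution latency give a ~0.23 s worst-case reaction, bounding trackable obstacle speed at about 10.7 m/s for a 3 m range and 0.5 m clearance; stating such bounds is what would scope the safety claim.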
-
Referee: [§4 (experiments)] The 18.0% improvement over the NAM baseline cannot be fully assessed because the baseline implementation, dynamic-obstacle motion models (speeds, trajectories, densities), and any statistical significance tests are not described. This weakens the cross-condition claims.
Authors: We agree that fuller documentation of the baseline and experimental conditions is required for readers to assess the reported gains. In the revised manuscript we will expand §4 with (i) a precise description of how the NAM baseline was implemented, (ii) the exact motion models for dynamic obstacles (including speed ranges, trajectory generation, and density), and (iii) the statistical tests performed on the 400 trials per condition. These additions will allow direct evaluation of the 18.0% improvement and the static/dynamic comparisons. revision: yes
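The significance test promised in (iii) could be a standard two-proportion z-test; a minimal sketch, assuming the reported gains are percentage points and that trials split 200 per condition (both assumptions, since the paper does not state them):

```python
import math

def two_prop_z(x1, n1, x2, n2):
    """z statistic for H0: p1 == p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Assumed counts: 68.8% vs. a 46.0% baseline over 200 trials each.
z = two_prop_z(138, 200, 92, 200)
```

With these assumed counts, z is about 4.7, well past the 1.96 threshold for p < 0.05, so the claimed gain would survive the test if the real counts are anywhere near this split.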
Circularity Check
No circularity: purely empirical system proposal with measured outcomes
full rationale
The paper describes a robotic grasping system with two planners and reports success rates from 400 simulation trials plus real-world tests on a Fetch manipulator. No equations, parameter fittings, or first-principles derivations appear in the provided text; the central claims are direct experimental measurements (68.8% and 58.0% success) compared against a baseline. No self-citations are invoked as load-bearing uniqueness theorems, and no predictions reduce by construction to fitted inputs or self-definitions. The work is self-contained as an engineering contribution whose validity rests on external benchmarks rather than internal reduction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Accurate kinematic and dynamic models of the mobile manipulator are available.
- domain assumption: Sensor observations can be fused in real time to update an occupancy or visibility map.