FLASH: Fast Learning via GPU-Accelerated Simulation for High-Fidelity Deformable Manipulation in Minutes
Pith reviewed 2026-05-10 05:26 UTC · model grok-4.3
The pith
A GPU-native simulator generates high-fidelity training data for deformable robot tasks in minutes, and policies trained only on that data transfer zero-shot to physical robots on folding jobs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FLASH is a GPU-native simulation framework for contact-rich deformable manipulation, built on an accurate NCP-based solver that enforces strict contact and deformation constraints while being explicitly designed for fine-grained GPU parallelism. Rather than porting conventional solvers, FLASH redesigns the physics engine with optimized collision handling and memory layouts. It scales to over 3 million degrees of freedom at 30 FPS on a single RTX 5090 while maintaining physical accuracy. Policies trained solely on FLASH-generated synthetic data in minutes achieve robust zero-shot sim-to-real transfer on physical robots performing towel folding and garment folding without any real-world data.
What carries the argument
The NCP-based solver redesigned from the ground up for GPU architectures, including optimized collision handling and memory layouts, that enforces contact and deformation constraints at interactive speeds.
If this is right
- Robot learning for deformable manipulation can proceed at large scale using only synthetic data generated in minutes.
- Contact-rich tasks become trainable without the previous bottleneck of slow or unstable simulation.
- Zero-shot deployment on hardware is possible for folding and similar soft-object interactions.
- Labor-intensive real-world demonstration collection can be replaced by fast GPU simulation for these tasks.
Where Pith is reading between the lines
- The same GPU redesign approach might accelerate simulation in other domains with many contact constraints.
- Combining this data generation speed with existing reinforcement learning methods could further cut training time.
- Extending the framework to additional materials and tasks would test how broadly the zero-shot transfer holds.
Load-bearing premise
The redesigned solver produces simulations whose physical behavior matches real deformable objects closely enough that policies transfer without real-world data or calibration.
What would settle it
A policy trained only on FLASH data that fails to fold a physical towel or garment on a robot, or that produces contact forces and deformations measurably different from real trials, would show the transfer claim does not hold.
Original abstract
Simulation frameworks such as Isaac Sim have enabled scalable robot learning for locomotion and rigid-body manipulation; however, contact-rich simulation remains a major bottleneck for deformable object manipulation. The continuously changing geometry of soft materials, together with large numbers of vertices and contact constraints, makes it difficult to achieve high accuracy, speed, and stability required for large-scale interactive learning. We present FLASH, a GPU-native simulation framework for contact-rich deformable manipulation, built on an accurate NCP-based solver that enforces strict contact and deformation constraints while being explicitly designed for fine-grained GPU parallelism. Rather than porting conventional single-instruction-multiple-data (SIMD) solvers to GPUs, FLASH redesigns the physics engine from the ground up to leverage modern GPU architectures, including optimized collision handling and memory layouts. As a result, FLASH scales to over 3 million degrees of freedom at 30 FPS on a single RTX 5090, while accurately simulating physical interactions. Policies trained solely on FLASH-generated synthetic data in minutes achieve robust zero-shot sim-to-real transfer, which we validate on physical robots performing challenging deformable manipulation tasks such as towel folding and garment folding, without any real-world demonstration, providing a practical alternative to labor-intensive real-world data collection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents FLASH, a GPU-native simulation framework for contact-rich deformable manipulation built on a redesigned NCP-based solver. It claims to achieve high-fidelity simulation at scale (over 3 million DOF at 30 FPS on a single RTX 5090) through optimized collision handling and memory layouts, enabling policies trained solely on synthetic data in minutes to achieve robust zero-shot sim-to-real transfer on physical robots for tasks including towel folding and garment folding, without any real-world demonstrations.
Significance. If the accuracy and transfer claims hold, the work would be significant for scalable robot learning in deformable manipulation, offering a practical alternative to labor-intensive real data collection by leveraging fast, parallel GPU simulation for contact-rich interactions.
major comments (2)
- [Experiments] Experiments section: The validation of zero-shot sim-to-real transfer relies on qualitative visual similarity and downstream RL success rates for towel/garment folding, but provides no quantitative benchmarks (e.g., vertex trajectory RMSE, contact force errors, or folding metric comparisons) against motion-capture or force-torque data from the physical setup. This is load-bearing for the central claim that the NCP solver's fidelity, rather than policy regularization, closes the sim-to-real gap.
- [Method] Method section on NCP solver redesign: The paper asserts that the GPU-redesigned solver 'enforces strict contact and deformation constraints' and maintains 'physical accuracy' at scale, yet no direct comparisons (e.g., against established deformable solvers in Isaac Sim or MuJoCo) or ablation on constraint violation metrics are reported for contact-rich cloth dynamics. Without such grounding, the speed-accuracy tradeoff enabling reliable transfer remains unverified.
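The quantitative check requested in the first major comment could look something like the sketch below. It is not the paper's evaluation protocol: the function name `vertex_trajectory_rmse`, the data layout (frames of tracked points as `(x, y, z)` tuples), and the sample values are illustrative assumptions.

```python
import math


def vertex_trajectory_rmse(sim_traj, real_traj):
    """Root-mean-square Euclidean error over every tracked point in every frame.

    Both arguments are lists of frames; each frame is a list of (x, y, z)
    tuples given in the same order and the same units (assumed metres).
    """
    if len(sim_traj) != len(real_traj):
        raise ValueError("trajectories must have the same number of frames")
    squared_error_sum = 0.0
    point_count = 0
    for sim_frame, real_frame in zip(sim_traj, real_traj):
        if len(sim_frame) != len(real_frame):
            raise ValueError("frames must track the same number of points")
        for sim_point, real_point in zip(sim_frame, real_frame):
            squared_error_sum += sum(
                (s - r) ** 2 for s, r in zip(sim_point, real_point)
            )
            point_count += 1
    if point_count == 0:
        raise ValueError("no points to compare")
    return math.sqrt(squared_error_sum / point_count)


# Illustrative values only: one frame, two tracked towel corners.
sim = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
real = [[(0.0, 0.0, 0.03), (1.0, 0.04, 0.0)]]
rmse = vertex_trajectory_rmse(sim, real)
```

A metric of this kind needs point correspondences between simulated vertices and real markers, which is exactly the measurement the report asks for.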
minor comments (1)
- [Abstract] Abstract and introduction: The claim of 'accurately simulating physical interactions' would benefit from a brief reference to any internal validation metrics used during solver development.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below and describe the revisions we will make to strengthen the validation of our claims.
-
Referee: [Experiments] Experiments section: The validation of zero-shot sim-to-real transfer relies on qualitative visual similarity and downstream RL success rates for towel/garment folding, but provides no quantitative benchmarks (e.g., vertex trajectory RMSE, contact force errors, or folding metric comparisons) against motion-capture or force-torque data from the physical setup. This is load-bearing for the central claim that the NCP solver's fidelity, rather than policy regularization, closes the sim-to-real gap.
Authors: We agree that quantitative metrics would provide stronger grounding for the sim-to-real claims. Our current evaluation demonstrates robust transfer via repeated physical trials with high task success rates on towel and garment folding, which we view as the most relevant end-to-end metric for manipulation policies. Collecting precise motion-capture trajectories for highly deformable objects is technically challenging due to self-occlusions and non-rigid surfaces. In the revision we will expand the experiments section with detailed success-rate statistics (including standard deviations across trials), more extensive qualitative frame-by-frame comparisons, and an explicit discussion of why direct RMSE-style metrics are difficult to obtain. We will also clarify that the policies use standard RL without specialized regularization, supporting the role of simulation fidelity. revision: partial
-
Referee: [Method] Method section on NCP solver redesign: The paper asserts that the GPU-redesigned solver 'enforces strict contact and deformation constraints' and maintains 'physical accuracy' at scale, yet no direct comparisons (e.g., against established deformable solvers in Isaac Sim or MuJoCo) or ablation on constraint violation metrics are reported for contact-rich cloth dynamics. Without such grounding, the speed-accuracy tradeoff enabling reliable transfer remains unverified.
Authors: We acknowledge the value of direct comparisons. The manuscript focuses on the novel GPU-native redesign and its scaling behavior, but we agree that explicit accuracy grounding would strengthen the presentation. In the revised version we will add a new subsection with side-by-side comparisons of constraint violation (penetration depth, normal force consistency) and deformation energy metrics against MuJoCo and Isaac Sim on standardized cloth benchmarks at comparable scales. We will also include an ablation isolating the contributions of our optimized collision handling and memory layouts to these accuracy metrics. These additions will directly address the speed-accuracy tradeoff. revision: yes
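The constraint-violation metrics promised in this response could be computed roughly as follows. This is a minimal sketch under assumptions: the function name, the per-contact signed-gap and normal-force inputs, and the sample numbers are hypothetical. The only fact relied on is the standard normal-contact complementarity condition (gap >= 0, force >= 0, gap * force = 0), whose min-map residual `min(gap, force)` is zero when the condition holds.

```python
def contact_violation(gaps, normal_forces):
    """Return (max penetration depth, max complementarity residual).

    gaps[i] is the signed distance at contact i (negative means penetration);
    normal_forces[i] is the corresponding normal force magnitude.
    """
    if len(gaps) != len(normal_forces):
        raise ValueError("need one normal force per contact gap")
    # Penetration depth is how far a gap falls below zero.
    max_penetration = max((max(0.0, -g) for g in gaps), default=0.0)
    # min(gap, force) vanishes exactly when 0 <= gap, 0 <= force, gap*force == 0.
    max_residual = max(
        (abs(min(g, f)) for g, f in zip(gaps, normal_forces)), default=0.0
    )
    return max_penetration, max_residual


# Illustrative values only: the third contact penetrates by 2 mm.
violation = contact_violation([0.01, 0.0, -0.002], [0.0, 3.5, 1.0])
clean = contact_violation([0.01, 0.0], [0.0, 2.0])
```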
Circularity Check
No circularity in claimed derivation chain
The paper introduces a new GPU-native simulation framework (FLASH) for contact-rich deformable manipulation, with claims resting on framework redesign for parallelism, scaling results, and empirical zero-shot sim-to-real validation on physical tasks. No equations, derivations, fitted parameters, or predictions appear in the provided text that reduce by construction to inputs, self-citations, or ansatzes. Central assertions about solver accuracy and transfer performance are presented as outcomes of the design and experiments rather than self-referential definitions or renamed known results. This is a standard non-circular engineering contribution.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Airbot play: 6-dof robotic arm
Airbot. Airbot play: 6-dof robotic arm. https://airbots.online, 2024. Accessed: 2026-01-24
2024
-
[2]
Contact and friction simulation for computer graphics
Sheldon Andrews, Kenny Erleben, and Zachary Ferguson. Contact and friction simulation for computer graphics. In ACM SIGGRAPH 2022 Courses, SIGGRAPH ’22, New York, NY, USA, 2022. Association for Computing Machinery. ISBN 9781450393621. doi: 10.1145/3532720.3535640. URL https://doi.org/10.1145/3532720.3535640
-
[3]
Genesis: A generative and universal physics engine for robotics and beyond, December 2024
Genesis Authors. Genesis: A generative and universal physics engine for robotics and beyond, December 2024. URL https://github.com/Genesis-Embodied-AI/Genesis
2024
-
[4]
Bifold: Bimanual cloth folding with language guidance
Oriol Barbany, Adrià Colomé, and Carme Torras. Bifold: Bimanual cloth folding with language guidance. arXiv preprint arXiv:2501.16458, 2025
-
[5]
Qdp: Learning to sequentially optimise quasi-static and dynamic manipulation primitives for robotic cloth manipulation
David Blanco-Mulero, Gokhan Alcan, Fares J Abu-Dakka, and Ville Kyrki. Qdp: Learning to sequentially optimise quasi-static and dynamic manipulation primitives for robotic cloth manipulation. In 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 984–991. IEEE, 2023
2023
-
[6]
Projective dynamics: Fusing constraint projections for fast simulation
Sofien Bouaziz, Sebastian Martin, Tiantian Liu, Ladislav Kavan, and Mark Pauly. Projective dynamics: Fusing constraint projections for fast simulation. In Seminal Graphics Papers: Pushing the Boundaries, Volume 2, pages 787–797. 2023
2023
-
[7]
Qingwen Bu, Jisong Cai, Li Chen, Xiuqi Cui, Yan Ding, Siyuan Feng, Shenyuan Gao, Xindong He, Xuan Hu, Xu Huang, et al. Agibot world colosseo: A large-scale manipulation platform for scalable and intelligent embodied systems. arXiv preprint arXiv:2503.06669, 2025
2025
-
[8]
The pinocchio c++ library – a fast and flexible implementation of rigid body dynamics algorithms and their analytical derivatives
Justin Carpentier, Guilhem Saurel, Gabriele Buondonno, Joseph Mirabel, Florent Lamiraux, Olivier Stasse, and Nicolas Mansard. The pinocchio c++ library – a fast and flexible implementation of rigid body dynamics algorithms and their analytical derivatives. In IEEE International Symposium on System Integrations (SII), 2019
2019
-
[9]
Vertex block descent
Anka He Chen, Ziheng Liu, Yin Yang, and Cem Yuksel. Vertex block descent. ACM Transactions on Graphics (TOG), 43(4):1–16, 2024
2024
-
[10]
Haonan Chen, Junxiao Li, Ruihai Wu, Yiwei Liu, Yiwen Hou, Zhixuan Xu, Jingxiang Guo, Chongkai Gao, Zhenyu Wei, Shensi Xu, et al. Metafold: Language-guided multi-category garment folding framework via trajectory generation and foundation model. arXiv preprint arXiv:2503.08372, 2025
-
[11]
Iterative residual policy: for goal-conditioned dynamic manipulation of deformable objects
Cheng Chi, Benjamin Burchfiel, Eric Cousineau, Siyuan Feng, and Shuran Song. Iterative residual policy: for goal-conditioned dynamic manipulation of deformable objects. The International Journal of Robotics Research, 43(4):389–404, 2024
2024
-
[12]
Learning to collaborate from simulation for robot-assisted dressing
Alexander Clegg, Zackory Erickson, Patrick Grady, Greg Turk, Charles C Kemp, and C Karen Liu. Learning to collaborate from simulation for robot-assisted dressing. IEEE Robotics and Automation Letters, 5(2):2746–2753, 2020
2020
-
[13]
Garfield: Addressing the visual sim-to-real gap in garment manipulation with mesh-attached radiance fields
Donatien Delehelle, Darwin Caldwell, and Fei Chen. Garfield: Addressing the visual sim-to-real gap in garment manipulation with mesh-attached radiance fields. In 2024 IEEE International Conference on Robotics and Biomimetics (ROBIO), pages 77–84. IEEE, 2024
2024
-
[14]
General-purpose clothes manipulation with semantic keypoints
Yuhong Deng and David Hsu. General-purpose clothes manipulation with semantic keypoints. In 2025 IEEE International Conference on Robotics and Automation (ICRA), pages 13181–13187. IEEE, 2025
2025
-
[15]
Flingbot: The unreasonable effectiveness of dynamic manipulation for cloth unfolding
Huy Ha and Shuran Song. Flingbot: The unreasonable effectiveness of dynamic manipulation for cloth unfolding. In Conference on Robot Learning, pages 24–33. PMLR, 2022
2022
-
[16]
Wenkang Hu, Xincheng Tang, Yitong Li, Zhengjie Shu, Wei Li, Huamin Wang, Ruigang Yang, et al. Real garment benchmark (rgbench): A comprehensive benchmark for robotic garment manipulation featuring a high-fidelity scalable simulator. arXiv preprint arXiv:2511.06434, 2025
-
[17]
Ultralytics YOLO
Glenn Jocher, Ayush Chaurasia, and Jing Qiu. Ultralytics YOLO, 2023. URL https://github.com/ultralytics/ultralytics. Accessed: 2026-01-19
2023
-
[18]
Staggered projections for frictional contact in multibody systems
Danny M. Kaufman, Shinjiro Sueda, Doug L. James, and Dinesh K. Pai. Staggered projections for frictional contact in multibody systems. ACM Trans. Graph., 27(5), December 2008. ISSN 0730-0301. doi: 10.1145/1409060.1409117
-
[19]
Learning keypoints for robotic cloth manipulation using synthetic data
Thomas Lips, Victor-Louis De Gusseme, and Francis Wyffels. Learning keypoints for robotic cloth manipulation using synthetic data. IEEE Robotics and Automation Letters, 9(7):6528–6535, 2024
2024
-
[20]
Garmentlab: A unified simulation and benchmark for garment manipulation
Haoran Lu, Ruihai Wu, Yitong Li, Sijie Li, Ziyu Zhu, Chuanruo Ning, Yan Zhao, Longzan Luo, Yuanpei Chen, and Hao Dong. Garmentlab: A unified simulation and benchmark for garment manipulation. Advances in Neural Information Processing Systems, 37:11866–11903, 2024
2024
-
[21]
Xpbd: position-based simulation of compliant constrained dynamics
Miles Macklin, Matthias Müller, and Nuttapong Chentanez. Xpbd: position-based simulation of compliant constrained dynamics. In Proceedings of the 9th International Conference on Motion in Games, pages 49–54, 2016
2016
-
[22]
Non-smooth newton methods for deformable multi-body dynamics
Miles Macklin, Kenny Erleben, Matthias Müller, Nuttapong Chentanez, Stefan Jeschke, and Viktor Makoviychuk. Non-smooth newton methods for deformable multi-body dynamics. ACM Transactions on Graphics (TOG), 38(5):1–20, 2019
2019
-
[23]
Small steps in physics simulation
Miles Macklin, Kier Storey, Michelle Lu, Pierre Terdiman, Nuttapong Chentanez, Stefan Jeschke, and Matthias Müller. Small steps in physics simulation. In Proceedings of the 18th Annual ACM SIGGRAPH/Eurographics Symposium on Computer Animation, SCA ’19, New York, NY, USA, 2019. Association for Computing Machinery. ISBN 9781450366779. doi: 10.1145/33094...
-
[24]
Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning
Viktor Makoviychuk, Lukasz Wawrzyniak, Yunrong Guo, Michelle Lu, Kier Storey, Miles Macklin, David Hoeller, Nikita Rudin, Arthur Allshire, Ankur Handa, et al. Isaac gym: High performance gpu-based physics simulation for robot learning. arXiv preprint arXiv:2108.10470, 2021
2021
-
[25]
Real-to-sim parameter learning for deformable packages using high-fidelity simulators for robotic manipulation
Omey M Manyar, Hantao Ye, Siddharth Mayya, Fan Wang, and Satyandra K Gupta. Real-to-sim parameter learning for deformable packages using high-fidelity simulators for robotic manipulation. In International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, volume 89206, page V02AT02A004. American Society of M...
2025
-
[26]
Jan Matas, Stephen James, and Andrew J. Davison. Sim-to-real reinforcement learning for deformable object manipulation. In Aude Billard, Anca Dragan, Jan Peters, and Jun Morimoto, editors, Proceedings of The 2nd Conference on Robot Learning, volume 87 of Proceedings of Machine Learning Research, pages 734–743. PMLR, 29–31 Oct 2018
2018
-
[27]
Learning robust perceptive locomotion for quadrupedal robots in the wild
Takahiro Miki, Joonho Lee, Jemin Hwangbo, Lorenz Wellhausen, Vladlen Koltun, and Marco Hutter. Learning robust perceptive locomotion for quadrupedal robots in the wild. Science Robotics, 7(62):eabk2822, 2022
2022
-
[28]
Position based dynamics
Matthias Müller, Bruno Heidelberger, Marcus Hennix, and John Ratcliff. Position based dynamics. Journal of Visual Communication and Image Representation, 18(2):109–118, 2007
2007
-
[29]
Newton: An open-source, gpu-accelerated physics engine for robotics, 2025
Newton Project Contributors. Newton: An open-source, gpu-accelerated physics engine for robotics, 2025. URL https://github.com/newton-physics/newton. Initiated by Disney Research, Google DeepMind, and NVIDIA
2025
-
[30]
ADMM Projective Dynamics: Fast Simulation of Hyperelastic Models with Dynamic Constraints
Matthew Overby, George E. Brown, Jie Li, and Rahul Narain. ADMM Projective Dynamics: Fast Simulation of Hyperelastic Models with Dynamic Constraints. IEEE Transactions on Visualization and Computer Graphics, 23(10):2222–2234, October 2017. ISSN 1077-2626. doi: 10.1109/TVCG.2017.2730875. URL http://ieeexplore.ieee.org/document/7990052/
-
[31]
Open x-embodiment: Robotic learning datasets and rt-x models: Open x-embodiment collaboration
Abby O’Neill, Abdul Rehman, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, et al. Open x-embodiment: Robotic learning datasets and rt-x models: Open x-embodiment collaboration. In 2024 IEEE International Conference on Robotics and Automation (ICRA), pages 6892–6903. IEEE, 2024
2024
-
[32]
Adamu humanoid robot documentation
PNDbotics. Adamu humanoid robot documentation. https://wiki.pndbotics.com/half_robot/half_robot. Accessed: 2026-01-24
-
[33]
SAM 2: Segment Anything in Images and Videos
Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloe Rolland, Laura Gustafson, et al. Sam 2: Segment anything in images and videos. arXiv preprint arXiv:2408.00714, 2024
2024
-
[34]
A reduction of imitation learning and structured prediction to no-regret online learning
Stéphane Ross, Geoffrey Gordon, and Drew Bagnell. A reduction of imitation learning and structured prediction to no-regret online learning. In Proceedings of the fourteenth international conference on artificial intelligence and statistics, pages 627–635. JMLR Workshop and Conference Proceedings, 2011
2011
-
[35]
Yingdong Ru, Lipeng Zhuang, Zhuo He, Florent P Audonnet, and Gerardo Aragon-Camarasa. Can real-to-sim approaches capture dynamic fabric behavior for robotic fabric manipulation? arXiv preprint arXiv:2503.16310, 2025
-
[36]
Learning deformable object manipulation from expert demonstrations
Gautam Salhotra, I-Chun Arthur Liu, Marcus Dominguez-Kuhne, and Gaurav S Sukhatme. Learning deformable object manipulation from expert demonstrations. IEEE Robotics and Automation Letters, 7(4):8775–8782, 2022
2022
-
[37]
Deep imitation learning of sequential fabric smoothing from an algorithmic supervisor
Daniel Seita, Aditya Ganapathi, Ryan Hoque, Minho Hwang, Edward Cen, Ajay Kumar Tanwani, Ashwin Balakrishna, Brijen Thananjeyan, Jeffrey Ichnowski, Nawid Jamali, et al. Deep imitation learning of sequential fabric smoothing from an algorithmic supervisor. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 9651–9658....
2020
-
[38]
Fem simulation of 3d deformable solids: a practitioner’s guide to theory, discretization and model reduction
Eftychios Sifakis and Jernej Barbic. Fem simulation of 3d deformable solids: a practitioner’s guide to theory, discretization and model reduction. In ACM SIGGRAPH 2012 Courses, pages 1–50. 2012
2012
-
[39]
D. E. Stewart and J. C. Trinkle. An Implicit Time-Stepping Scheme for Rigid Body Dynamics with Inelastic Collisions and Coulomb Friction. International Journal for Numerical Methods in Engineering, 39(15):2673–2691, 1996. ISSN 1097-0207. doi: 10.1002/(SICI)1097-0207(19960815)39:15<2673::AID-NME972>3.0.CO;2-I
-
[40]
A material point method for snow simulation
Alexey Stomakhin, Craig Schroeder, Lawrence Chai, Joseph Teran, and Andrew Selle. A material point method for snow simulation. ACM Transactions on Graphics (TOG), 32(4):1–10, 2013
2013
-
[41]
Tongxuan Tian, Haoyang Li, Bo Ai, Xiaodi Yuan, Zhiao Huang, and Hao Su. Diffusion dynamics models with generative state estimation for cloth manipulation. arXiv preprint arXiv:2503.11999, 2025
-
[42]
Mujoco: A physics engine for model-based control
Emanuel Todorov, Tom Erez, and Yuval Tassa. Mujoco: A physics engine for model-based control. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 5026–5033. IEEE, 2012
2012
-
[43]
Dexgarmentlab: Dexterous garment manipulation environment with generalizable policy
Yuran Wang, Ruihai Wu, Yue Chen, Jiarui Wang, Jiaqi Liang, Ziyu Zhu, Haoran Geng, Jitendra Malik, Pieter Abbeel, and Hao Dong. Dexgarmentlab: Dexterous garment manipulation environment with generalizable policy. arXiv preprint arXiv:2505.11032, 2025
-
[44]
Fabricflownet: Bimanual cloth manipulation with a flow-based policy
Thomas Weng, Sujay Man Bajracharya, Yufei Wang, Khush Agrawal, and David Held. Fabricflownet: Bimanual cloth manipulation with a flow-based policy. In Conference on Robot Learning, pages 192–202. PMLR, 2022
2022
-
[45]
SOFA—an open source framework for medical simulation
J Westwood et al. SOFA—an open source framework for medical simulation, volume 125. IOS Press Amsterdam, The Netherlands, 2007
2007
-
[46]
Learning to manipulate deformable objects without demonstrations
Yilin Wu, Wilson Yan, Thanard Kurutach, Lerrel Pinto, and Pieter Abbeel. Learning to manipulate deformable objects without demonstrations. In Robotics: Science and Systems, 2020
2020
-
[47]
Fast but accurate: A real-time hyperelastic simulator with robust frictional contact
Ziqiu Zeng, Siyuan Luo, Fan Shi, and Zhongkai Zhang. Fast but accurate: A real-time hyperelastic simulator with robust frictional contact. 44(4), July 2025. ISSN 0730- doi: 10.1145/3730834. URL https://doi.org/10.1145/3730834
2025
-
[48]
Sheng Zhong, Thomas Power, Ashwin Gupta, and Peter Mitrano. PyTorch Kinematics. https://github.com/UM-ARM-Lab/pytorch_kinematics, 2024. doi: 10.5281/zenodo.7700587
-
[49]
Clothesnet: An information-rich 3d garment model repository with simulated clothes environment
Bingyang Zhou, Haoyu Zhou, Tianhai Liang, Qiaojun Yu, Siheng Zhao, Yuwei Zeng, Jun Lv, Siyuan Luo, Qiancai Wang, Xinyuan Yu, Haonan Chen, Cewu Lu, and Lin Shao. Clothesnet: An information-rich 3d garment model repository with simulated clothes environment. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 20428–20438...
2023
-
[50]
Rt-2: Vision-language-action models transfer web knowledge to robotic control
Brianna Zitkovich, Tianhe Yu, Sichun Xu, Peng Xu, Ted Xiao, Fei Xia, Jialin Wu, Paul Wohlhart, Stefan Welker, Ayzaan Wahid, et al. Rt-2: Vision-language-action models transfer web knowledge to robotic control. In Conference on Robot Learning, pages 2165–2183. PMLR, 2023
2023
SUPPLEMENTARY MATERIALS
VIII. SIMULATION SYSTEM SPECIFICATION
To demonstrate the usabili...
-
[53]
• Scene Configuration (Fig
Simulation Configuration: We configure the underlying physics engine through two distinct specifications: one establishes global simulation settings (e.g., gravity, timestep), static boundaries (e.g., plane collisions), and numerical solvers, while the other defines object-specific attributes (e.g., initial transformation and mechanical properties). • Scene...
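The two-specification split described above might be organized along these lines. This is only a sketch: every key name, value, and the `check_specs` helper are hypothetical placeholders, not the FLASH schema. The one physical fact used is that isotropic linear elasticity requires a Poisson's ratio strictly between -1 and 0.5.

```python
import json

# Hypothetical scene-level spec: global settings, static boundaries, solver.
scene_spec = {
    "gravity": [0.0, 0.0, -9.81],
    "timestep": 0.01,
    "boundaries": [{"type": "plane", "normal": [0.0, 0.0, 1.0], "offset": 0.0}],
    "solver": {"max_iterations": 50},
}

# Hypothetical object-level spec: initial transform and material properties.
object_spec = {
    "objects": [
        {
            "name": "towel",
            "translation": [0.0, 0.0, 0.5],
            "youngs_modulus": 1.0e5,
            "poissons_ratio": 0.3,
        }
    ]
}


def check_specs(scene, objects):
    """Basic sanity checks before handing the specs to a simulator."""
    if scene["timestep"] <= 0.0:
        raise ValueError("timestep must be positive")
    for obj in objects["objects"]:
        if obj["youngs_modulus"] <= 0.0:
            raise ValueError("Young's modulus must be positive")
        # Isotropic linear elasticity requires -1 < nu < 0.5.
        if not -1.0 < obj["poissons_ratio"] < 0.5:
            raise ValueError("Poisson's ratio out of physical range")
    return True


# Serialize each spec separately, mirroring the two-file split.
scene_json = json.dumps(scene_spec, indent=2)
object_json = json.dumps(object_spec, indent=2)
```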
-
[54]
timestep
High-Level Python Interface: Fig. 9 (c) demonstrates how to use the Python API to drive the simulation. The workflow proceeds as follows: • Initialization: The script first loads the scene configuration using sim.load_config() and sets the parallel environment count via sim.set_envs(). The C++ backend parses the JSON files and automatically initializes the spec...
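The initialization order described above can be mimicked with a stand-in object. This is a sketch, not the FLASH API: only the method names `load_config` and `set_envs` come from the text, while the `StubSim` class, the argument shapes, and the `step` method are assumptions added so the example runs on its own.

```python
class StubSim:
    """Stand-in for the simulator handle; not the FLASH implementation."""

    def __init__(self):
        self.scene = None
        self.objects = None
        self.num_envs = 0
        self.time = 0.0

    def load_config(self, scene, objects):
        # The text says load_config() reads JSON files; plain dicts are
        # accepted here (an assumption) to keep the sketch self-contained.
        self.scene = dict(scene)
        self.objects = dict(objects)

    def set_envs(self, count):
        if count < 1:
            raise ValueError("need at least one environment")
        self.num_envs = count

    def step(self):
        # Hypothetical method: advance simulated time by one timestep.
        if self.scene is None or self.num_envs == 0:
            raise RuntimeError("call load_config() and set_envs() first")
        self.time += self.scene["timestep"]
        return self.time


sim = StubSim()
sim.load_config({"timestep": 0.01}, {"objects": []})
sim.set_envs(1024)
for _ in range(3):
    sim.step()
```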
-
[55]
a) AdamU Setup (Eye-in-Hand): To achieve a comprehensive top-down view, we mounted a ZED Mini camera via a custom rigid extension above the robot’s head
Extrinsic Alignment: We present the specific calibration procedures and validation results for our two experimental setups below. a) AdamU Setup (Eye-in-Hand): To achieve a comprehensive top-down view, we mounted a ZED Mini camera via a custom rigid extension above the robot’s head. Since the camera moves with the neck joints (yaw and pitch), we formulated ...
-
[56]
corner lift
System Identification: We aim to identify the physical parameters of the deformable object that minimize the behavioral discrepancy between simulation and reality. We focus on tuning the material properties governing deformation, specifically Young’s modulus and Poisson’s ratio. a) Data Collection: We design a canonical "corner lift" interaction to fully ex...
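The parameter-fitting idea can be illustrated with a small grid search. This is a toy sketch under stated assumptions: `toy_corner_lift` is a made-up stand-in for a simulated rollout, the squared-error discrepancy and the grid values are illustrative, and the source does not say which optimizer or loss the authors actually use.

```python
import itertools


def toy_corner_lift(youngs_modulus, poissons_ratio):
    """Toy stand-in for a simulated corner lift; returns two made-up features."""
    sag = 1.0e5 / youngs_modulus      # stiffer cloth sags less
    lateral = poissons_ratio * sag    # lateral contraction scales with nu
    return [sag, lateral]


def discrepancy(predicted, observed):
    """Sum of squared differences between simulated and observed features."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def grid_search(observed, modulus_grid, ratio_grid, simulate=toy_corner_lift):
    """Return the (Young's modulus, Poisson's ratio) pair with lowest discrepancy."""
    candidates = itertools.product(modulus_grid, ratio_grid)
    return min(candidates, key=lambda c: discrepancy(simulate(*c), observed))


# Pretend the real cloth behaves like the toy model at E = 2e5, nu = 0.3.
observed = toy_corner_lift(2.0e5, 0.3)
best = grid_search(observed, [1.0e5, 2.0e5, 4.0e5], [0.2, 0.3, 0.4])
```

In practice the `simulate` argument would wrap a real rollout of the corner-lift interaction, and the observed features would come from the physical recording.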
-
[57]
Fig. 11: Visualization of System Identification. (a) Real-world experimental setup for data collection
Depth Simulation and Perception Augmentation: To ensure the policy transfers zero-shot to the real world, the perception pipeline in simulation must mimic both the sensor-level imperfections and the semantic-level segmentation errors observed in
Fig. 11: Visualization of System Identification. (a) Real-world experimental setup for data collection. (b) Spatia...
-
[58]
As illustrated in Fig
Teacher Synthesis:We generate teacher actions from cloth-state information using a hierarchical finite-state machine design. As illustrated in Fig. 13, we design a low-level pattern for the end-effector (EE) to grasp and transport keypoints via: •anApproachprimitive that maintains a vertical hover while moving laterally until the EE is horizontally aligne...
-
[59]
Student Architecture: Our student policy architecture is illustrated in Fig. 14. We concatenate five-step histories of both proprioceptive and perceptual inputs. Perception is encoded with a convolutional neural network (CNN), while the remaining components are implemented with multilayer perceptrons (MLPs). We additionally train the model to reconstruct s...
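The five-step history concatenation could be handled with a rolling buffer such as the one below. This is a sketch under assumptions: the `HistoryBuffer` class is hypothetical, and the warm-up rule of repeating the first observation until the window is full is an illustrative choice that the source does not specify.

```python
from collections import deque


class HistoryBuffer:
    """Keeps the most recent observations and flattens them oldest-first."""

    def __init__(self, steps=5):
        self.steps = steps
        self.frames = deque(maxlen=steps)

    def push(self, observation):
        frame = list(observation)
        if not self.frames:
            # Assumed warm-up rule: repeat the first frame to fill the window.
            self.frames.extend([frame] * self.steps)
        else:
            # deque(maxlen=...) drops the oldest frame automatically.
            self.frames.append(frame)

    def flat(self):
        """Concatenate all stored frames into one input vector."""
        return [value for frame in self.frames for value in frame]


buf = HistoryBuffer()
buf.push([1.0, 2.0])
buf.push([3.0, 4.0])
stacked = buf.flat()
```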
-
[60]
In addition to the action distillation objectives, we include an auxiliary mean-squared error (MSE) loss for state reconstruction
Training Details: We apply DAgger [34] to distill teacher actions into deployable student policies, using the mean absolute error (MAE) loss for position delta and the log-probability loss for EE open/close logits. In addition to the action distillation objectives, we include an auxiliary mean-squared error (MSE) loss for state reconstruction. During trai...
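The three loss terms named above could be combined roughly as follows. This is a sketch, not the authors' training code: it assumes a single open/close logit scored as a Bernoulli negative log-likelihood, and the function names and the `aux_weight` value are hypothetical.

```python
import math


def mae(pred, target):
    """Mean absolute error, used here for the position-delta action."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mse(pred, target):
    """Mean squared error, used here for the auxiliary state reconstruction."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def gripper_nll(logit, label):
    """Negative log-probability of label (0 or 1) under sigmoid(logit).

    Uses the numerically stable form max(x, 0) - x*z + log(1 + exp(-|x|)).
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def distill_loss(pred_delta, teacher_delta, grip_logit, teacher_grip,
                 recon, state, aux_weight=0.5):
    """Action distillation terms plus a weighted reconstruction term."""
    action = mae(pred_delta, teacher_delta) + gripper_nll(grip_logit, teacher_grip)
    return action + aux_weight * mse(recon, state)


# Illustrative values only.
loss = distill_loss([0.1, 0.0, 0.0], [0.0, 0.0, 0.0], 0.0, 1, [1.0, 2.0], [1.0, 0.0])
```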
Fig. 13: Low-level teacher primitives. (a) Illustration of teacher EE trajectory. (b) Low-level primitive state machine. Our teachers use the depicted low-level primitives to grasp and transport specific keypoints to target locations. Labels recovered from the figure: Transport Keypoint Target; GRASP (Reach down & Close EE); TRANSPORT (EE Closed); Horizontal Dist < Threshold; EE Closed & Contact Detected; Object Dropped (Recovery).
TABLE IV: High-Level T...
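The state machine in Fig. 13 (b) could be encoded along the following lines. This is a guess at the wiring, not a transcription: the state names and condition labels come from the figure text, but which condition drives which transition, the `next_primitive` function, and the threshold value are all assumptions.

```python
from enum import Enum, auto


class Primitive(Enum):
    APPROACH = auto()
    GRASP = auto()
    TRANSPORT = auto()


def next_primitive(state, horizontal_dist, ee_closed, contact, dropped,
                   threshold=0.01):
    """Return the next primitive given the current one and sensed conditions."""
    # Assumed: "Horizontal Dist < Threshold" ends the approach phase.
    if state is Primitive.APPROACH and horizontal_dist < threshold:
        return Primitive.GRASP
    # Assumed: "EE Closed & Contact Detected" confirms the grasp.
    if state is Primitive.GRASP and ee_closed and contact:
        return Primitive.TRANSPORT
    # Assumed: "Object Dropped (Recovery)" sends the teacher back to approach.
    if state is Primitive.TRANSPORT and dropped:
        return Primitive.APPROACH
    return state


# Walk through one grasp, transport, drop, and recovery cycle.
trace = [Primitive.APPROACH]
trace.append(next_primitive(trace[-1], 0.20, False, False, False))
trace.append(next_primitive(trace[-1], 0.005, False, False, False))
trace.append(next_primitive(trace[-1], 0.005, True, True, False))
trace.append(next_primitive(trace[-1], 0.30, True, False, True))
```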
-
[64]
First, a lightweight YOLO detector, fine-tuned on a small set of real-world images, identifies the target object’s bounding box
Online Segmentation Pipeline: We adopt a streamlined combination of YOLO [17] and SAM 2 [33] for real-time background removal. First, a lightweight YOLO detector, fine-tuned on a small set of real-world images, identifies the target object’s bounding box. This box is then used as a spatial prompt for SAM 2 to generate a precise pixel-level mask in a zero-shot ...
-
[65]
• Perception and Inference: The ZED Mini camera is connected to the Orin NX, which serves as the unit for visual perception
Policy Deployment and Action Execution: a) AdamU Deployment: For the real-robot deployment of AdamU, we implement an onboard asynchronous multi-threading mechanism across a heterogeneous computing platform, consisting of an NVIDIA Jetson Orin NX (high-level controller) and an Intel NUC (low-level controller). • Perception and Inference: The ZED Mini camera i...