DeformGen: Dynamics-Based Topology Augmentation for Deformable Manipulation Policy Learning
Pith reviewed 2026-06-25 20:36 UTC · model grok-4.3
The pith
DeformGen augments demonstration data for deformable manipulation by using localized physical disturbances, forward simulation, and deformation-field warping to expand valid states and transfer trajectories.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DeformGen achieves topological diversity for deformable objects by expanding the valid state distribution through localized physical disturbances and forward simulation, and by transferring trajectories via deformation-field warping, jointly augmenting states and behaviors.
What carries the argument
DeformGen framework that applies localized physical disturbances followed by forward dynamics simulation for states and deformation-field warping to adapt source trajectories to new geometries.
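The state-augmentation half of this idea can be illustrated with a toy example. The sketch below is not the paper's simulator: it assumes a 2D damped mass-spring rope with unit masses, a Gaussian-falloff velocity impulse as the "localized disturbance", and semi-implicit Euler integration as the "forward simulation". The function name `disturb_and_simulate` and every parameter value are hypothetical choices made for illustration.

```python
import math

def disturb_and_simulate(points, edges, center, radius, impulse,
                         stiffness=100.0, damping=0.98, dt=0.01, steps=3000):
    """Kick particles near `center`, then integrate a damped mass-spring system.

    points: list of (x, y); edges: list of (i, j) index pairs; unit masses.
    Returns the settled particle positions as a list of [x, y].
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    pos = [list(p) for p in points]
    rest = [dist(points[i], points[j]) for i, j in edges]

    # Localized disturbance (assumed form): velocity impulse with Gaussian falloff.
    vel = []
    for p in pos:
        w = math.exp(-(dist(p, center) / radius) ** 2)
        vel.append([w * impulse[0], w * impulse[1]])

    for _ in range(steps):
        force = [[0.0, 0.0] for _ in pos]
        for (i, j), r in zip(edges, rest):
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            length = math.hypot(dx, dy)
            if length == 0.0:
                continue  # degenerate edge: direction undefined, skip it
            f = stiffness * (length - r) / length  # Hooke's law along the edge
            force[i][0] += f * dx
            force[i][1] += f * dy
            force[j][0] -= f * dx
            force[j][1] -= f * dy
        # Semi-implicit Euler step with simple per-step velocity damping.
        for p, v, f in zip(pos, vel, force):
            v[0] = (v[0] + f[0] * dt) * damping
            v[1] = (v[1] + f[1] * dt) * damping
            p[0] += v[0] * dt
            p[1] += v[1] * dt
    return pos

# Hypothetical example: a straight 10-particle rope, kicked near its middle.
rope = [(0.1 * k, 0.0) for k in range(10)]
links = [(k, k + 1) for k in range(9)]
new_state = disturb_and_simulate(rope, links, center=(0.45, 0.0),
                                 radius=0.15, impulse=(0.0, 1.0))
```

Because the new shape comes out of the dynamics rather than a direct edit of particle positions, the springs settle back to their rest lengths while the rope takes a different configuration. That is the property a rigid pose perturbation cannot provide.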
If this is right
- Policies trained with the augmented data achieve higher success rates than those trained on original demonstrations.
- The generated states respect physical constraints better than those produced by rigid pose perturbations.
- Trajectory transfer maintains consistent end-effector behavior across deformed object geometries.
- Joint state and behavior augmentation improves learning across multiple high-fidelity deformable manipulation tasks.
Where Pith is reading between the lines
- The same disturbance-plus-simulation idea might apply to other non-rigid robotic tasks such as pouring or folding where state validity is hard to sample directly.
- If the forward simulator matches real material behavior closely enough, the method could lower the amount of real-world demonstration collection needed for new objects.
- Extending the warping step to handle contact-rich interactions could reveal whether the current approach breaks when objects touch multiple surfaces or each other.
Load-bearing premise
Forward-simulating dynamics from localized physical disturbances produces topology-coherent and physically plausible states that improve policy learning, and deformation-field warping transfers trajectories while preserving essential manipulation behavior.
What would settle it
Running the reported benchmark experiments and finding that policies trained on DeformGen-augmented data achieve no higher success rates than policies trained on the original demonstrations alone.
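One way to make this test concrete is a one-sided comparison of success counts. The sketch below assumes a pooled two-proportion z-test, which is a choice made here and not one the paper reports. The function name `one_sided_two_proportion_test` is hypothetical, and the counts in the example are placeholders, not results from the paper.

```python
import math

def one_sided_two_proportion_test(succ_aug, n_aug, succ_base, n_base):
    """Return (z, p) for H1: success rate with augmentation > baseline rate."""
    p_aug = succ_aug / n_aug
    p_base = succ_base / n_base
    pooled = (succ_aug + succ_base) / (n_aug + n_base)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_aug + 1.0 / n_base))
    if se == 0.0:
        # Both arms are all successes or all failures: no evidence either way.
        return 0.0, 0.5
    z = (p_aug - p_base) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability
    return z, p_value

# PLACEHOLDER counts for illustration only; these are not reported results.
z_stat, p_val = one_sided_two_proportion_test(succ_aug=80, n_aug=100,
                                              succ_base=60, n_base=100)
```

A large p-value on the real evaluation episodes would correspond to the falsifying outcome described above. With few rollouts per task, an exact test or a per-seed comparison may be a better fit than the normal approximation.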
Original abstract
Demonstration augmentation is proposed for cost-efficient data acquisition, but existing methods are fundamentally limited in deformable manipulation due to two challenges: (1) the state space is high-dimensional with physics-induced constraints, making valid configurations impossible to reach via low-dimensional pose perturbations; and (2) trajectory transfer is non-equivariant, as material points no longer move rigidly together under deformation. We present DeformGen, a dynamics-based augmentation framework that achieves topological diversity for deformable objects. For the state challenge, DeformGen expands the valid state distribution by applying localized physical disturbances and forward-simulating the dynamics to obtain topology-coherent, physically plausible deformable states. For the trajectory challenge, DeformGen transfers source manipulation trajectories via deformation-field warping, which lifts per-particle displacements into a continuous spatial function to adapt the end-effector trajectory consistently with the deformed geometry. In this way, our method jointly augments the state distribution and its associated manipulation behavior. Experiments on high-fidelity deformable manipulation benchmarks show that DeformGen generally improves policy learning compared with training on the original demonstrations alone and with rigid-style augmentation baselines.
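The abstract does not say how per-particle displacements are lifted into a continuous function, so the following is only a sketch under stated assumptions. It uses normalized Gaussian-kernel interpolation, warps waypoint positions only (no orientation handling), and the name `warp_waypoints` and the bandwidth `sigma` are hypothetical.

```python
import math

def warp_waypoints(src_particles, dst_particles, waypoints, sigma=0.05):
    """Shift each waypoint by a kernel-weighted average of particle displacements.

    src_particles[i] and dst_particles[i] are the same material point before
    and after deformation. The Gaussian kernel is an assumed interpolant.
    """
    warped = []
    for p in waypoints:
        # Squared distance from the waypoint to every source particle.
        d2 = [sum((a - b) ** 2 for a, b in zip(p, x)) for x in src_particles]
        d2_min = min(d2)
        # Subtracting d2_min keeps the nearest weight at 1, so the
        # normalizer cannot underflow to zero for far-away waypoints.
        w = [math.exp(-(d - d2_min) / (2.0 * sigma ** 2)) for d in d2]
        total = sum(w)
        shift = [
            sum(wi * (y[k] - x[k])
                for wi, x, y in zip(w, src_particles, dst_particles)) / total
            for k in range(len(p))
        ]
        warped.append(tuple(pk + sk for pk, sk in zip(p, shift)))
    return warped

# Hypothetical example: three particles translated together.
src = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0)]
dst = [(x + 0.05, y, z + 0.02) for x, y, z in src]
path = [(0.05, 0.0, 0.1), (0.15, 0.0, 0.1)]
warped_path = warp_waypoints(src, dst, path)
```

Because the weights are normalized, a uniform translation of the object moves every waypoint by the same translation, which recovers rigid transfer as a special case. Under a local deformation, each waypoint mainly follows the particles closest to it.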
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents DeformGen, a dynamics-based augmentation framework for deformable manipulation policy learning. It targets two challenges in demonstration augmentation: (1) high-dimensional physics-constrained state spaces unreachable by low-dimensional perturbations, addressed via localized physical disturbances followed by forward simulation to produce topology-coherent states; and (2) non-equivariant trajectory transfer under deformation, addressed via deformation-field warping that lifts per-particle displacements to a continuous spatial function for consistent end-effector adaptation. The method jointly augments states and behaviors, with experiments on high-fidelity benchmarks claiming general policy improvements over original demonstrations and rigid-style baselines.
Significance. If the empirical improvements hold under detailed scrutiny, the work could advance data-efficient learning for deformable robotics by providing a physics-grounded alternative to purely geometric augmentation, potentially reducing reliance on extensive real-world data collection while preserving physical plausibility.
major comments (3)
- [Abstract / Experiments] Abstract and experimental claims: the central assertion of 'general improvement' in policy learning is presented without any quantitative metrics, error bars, statistical tests, or details on data exclusion, which directly limits evaluation of the magnitude and reliability of the reported gains over baselines.
- [Method (state augmentation component)] Method description (state augmentation): while localized disturbances and forward simulation are proposed to expand the valid state distribution, the manuscript provides no explicit validation (e.g., via metrics on physical plausibility or topology coherence) that the generated states remain within the manifold of feasible deformable configurations without introducing artifacts.
- [Method (trajectory warping component)] Method description (trajectory transfer): the deformation-field warping is claimed to preserve essential manipulation behavior, but no analysis or ablation is given on whether the lifted continuous function introduces harmful artifacts or alters task-relevant dynamics in the transferred trajectories.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments on our manuscript. We address each major comment point by point below, agreeing where revisions are needed to improve clarity and rigor, and outlining specific changes we will make.
- Referee: [Abstract / Experiments] Abstract and experimental claims: the central assertion of 'general improvement' in policy learning is presented without any quantitative metrics, error bars, statistical tests, or details on data exclusion, which directly limits evaluation of the magnitude and reliability of the reported gains over baselines.
  Authors: We acknowledge that while the experimental section reports comparative success rates on high-fidelity benchmarks against original data and rigid baselines, the presentation lacks explicit error bars, statistical tests, and data exclusion details. In the revised manuscript, we will add these elements (including standard deviations from multiple seeds, t-tests for significance, and explicit data handling protocols) and revise the abstract to reference key quantitative gains. This will directly address the concern about evaluating reliability.
  Revision: yes
- Referee: [Method (state augmentation component)] Method description (state augmentation): while localized disturbances and forward simulation are proposed to expand the valid state distribution, the manuscript provides no explicit validation (e.g., via metrics on physical plausibility or topology coherence) that the generated states remain within the manifold of feasible deformable configurations without introducing artifacts.
  Authors: The referee is correct that the current manuscript does not include explicit quantitative validation metrics for the generated states. We will add a dedicated validation subsection (or appendix) reporting metrics such as physical property preservation (e.g., mass conservation, collision-free checks post-simulation) and topology coherence (e.g., mesh connectivity analysis), along with qualitative examples. These will confirm the states remain feasible without artifacts.
  Revision: yes
- Referee: [Method (trajectory warping component)] Method description (trajectory transfer): the deformation-field warping is claimed to preserve essential manipulation behavior, but no analysis or ablation is given on whether the lifted continuous function introduces harmful artifacts or alters task-relevant dynamics in the transferred trajectories.
  Authors: We agree that an explicit analysis or ablation on potential artifacts from the continuous deformation-field lifting is missing. In the revision, we will incorporate an ablation study comparing trajectory fidelity (e.g., end-effector deviation metrics) and downstream policy performance with/without the lifting step, plus checks for dynamics preservation. This will substantiate the claim that essential behavior is maintained.
  Revision: yes
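The topology-coherence check promised in the second response could take many forms. The sketch below shows one assumed version: it computes per-edge strain relative to the rest shape, treats edges stretched beyond a cutoff as broken, and compares connected-component counts before and after. The names `count_components` and `check_state` and the `max_strain` cutoff are hypothetical and are not taken from the manuscript.

```python
import math

def count_components(n, edges):
    """Number of connected components of an undirected graph (union-find)."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(a) for a in range(n)})

def check_state(rest_points, new_points, edges, max_strain=0.1):
    """Report the worst edge strain and whether connectivity survives the cutoff.

    Assumes every rest edge has nonzero length.
    """
    def length(pts, i, j):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(pts[i], pts[j])))

    strains = [abs(length(new_points, i, j) / length(rest_points, i, j) - 1.0)
               for i, j in edges]
    # Edges within the strain cutoff are considered intact.
    intact = [e for e, s in zip(edges, strains) if s <= max_strain]
    n = len(rest_points)
    return {
        "max_strain": max(strains),
        "topology_ok": count_components(n, intact) == count_components(n, edges),
    }

# Hypothetical example: a square loop, translated and then with one vertex torn away.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
loop = [(0, 1), (1, 2), (2, 3), (3, 0)]
shifted = [(x + 0.5, y + 0.5) for x, y in square]
torn = [(0.0, 0.0), (1.0, 0.0), (3.0, 3.0), (0.0, 1.0)]
report_ok = check_state(square, shifted, loop)
report_torn = check_state(square, torn, loop)
```

A component count is a coarse signal. A single overstretched edge inside a loop leaves the graph connected, so `max_strain` should be read alongside `topology_ok` rather than replaced by it.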
Circularity Check
No significant circularity
The paper presents a method using external forward dynamics simulation from localized disturbances to generate new states and deformation-field warping to transfer trajectories. These steps rely on standard physics engines and spatial interpolation rather than any fitted parameters, self-definitions, or self-citation chains that reduce the claimed outputs to the inputs by construction. The central claims concern empirical policy improvement on benchmarks and are not derived tautologically from the method description itself. No load-bearing steps match the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Forward simulation of localized physical disturbances produces topology-coherent, physically plausible deformable states.