MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations
Pith reviewed 2026-05-17 09:41 UTC · model grok-4.3
The pith
MimicGen adapts roughly 200 human demonstrations into more than 50,000 varied examples that train robots on long-horizon tasks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MimicGen is a system for automatically synthesizing large-scale, rich datasets from only a small number of human demonstrations by adapting them to new contexts. We use MimicGen to generate over 50K demonstrations across 18 tasks with diverse scene configurations, object instances, and robot arms from just ~200 human demonstrations. We show that robot agents can be effectively trained on this generated dataset by imitation learning to achieve strong performance in long-horizon and high-precision tasks, such as multi-part assembly and coffee preparation, across broad initial state distributions.
What carries the argument
The adaptation process in MimicGen that modifies existing human demonstrations to fit new scene configurations, object instances, and robot arms while preserving the underlying task behavior.
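A minimal sketch of what such adaptation can look like, assuming end-effector and object poses are available as 4x4 homogeneous transforms in the world frame; the function and variable names are illustrative, not the paper's API:

```python
import numpy as np

def adapt_segment(src_eef_poses, src_object_pose, tgt_object_pose):
    """Re-target one demonstration segment to a new object pose.

    Every pose is a 4x4 homogeneous transform in the world frame. The
    end-effector trajectory is first expressed relative to the object it
    manipulates in the source demo, then re-anchored to where that object
    sits in the new scene, so the relative motion (approach, grasp,
    insertion) is preserved while the absolute poses change.
    """
    src_object_inv = np.linalg.inv(src_object_pose)
    adapted = []
    for eef_pose in src_eef_poses:
        rel_pose = src_object_inv @ eef_pose        # pose in the object frame
        adapted.append(tgt_object_pose @ rel_pose)  # re-anchored to the target
    return adapted
```

Chaining re-anchored segments per object-centric sub-task and connecting them with interpolated motion is the kind of sub-task stitching the referee report below refers to.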
If this is right
- Robots achieve strong performance on long-horizon and high-precision tasks when trained via imitation on the generated demonstrations.
- Training success holds across broad initial state distributions for tasks such as multi-part assembly and coffee preparation.
- The generated data compares favorably in effectiveness to collecting additional real human demonstrations.
- Robot learning becomes more economical because large datasets no longer require proportional increases in human collection effort.
Where Pith is reading between the lines
- The same adaptation idea could be applied in physical robot experiments to test whether the performance gains transfer beyond simulation.
- Combining MimicGen-style generation with other data sources might further reduce the total number of real demonstrations needed for multi-task training.
- Similar techniques could shorten iteration cycles when researchers want to explore many environment variations without repeating full human data collection.
Load-bearing premise
Data created by adapting human demonstrations to new contexts trains robots as effectively as fresh human demonstrations collected directly in those same contexts.
What would settle it
A side-by-side test would settle it: if imitation learning agents trained on an equal volume of newly collected human demonstrations in the target contexts outperform agents trained on the MimicGen-generated data, the central claim is falsified.
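A minimal sketch of that protocol, assuming caller-supplied train_fn and eval_fn callables that stand in for whatever imitation-learning stack and evaluation environment are actually used:

```python
def side_by_side_test(train_fn, eval_fn, human_demos, generated_demos,
                      n_eval_episodes=500):
    """Equal-volume comparison of fresh human data vs. generated data.

    train_fn(demos) returns a policy; eval_fn(policy, n_episodes) returns
    a success rate measured over the same broad initial-state distribution
    in the target contexts.
    """
    n = min(len(human_demos), len(generated_demos))  # equalize dataset size
    human_success = eval_fn(train_fn(human_demos[:n]), n_eval_episodes)
    generated_success = eval_fn(train_fn(generated_demos[:n]), n_eval_episodes)
    # The central claim is in trouble if the human-data policy clearly wins.
    return {"human": human_success, "generated": generated_success}
```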
Original abstract
Imitation learning from a large set of human demonstrations has proved to be an effective paradigm for building capable robot agents. However, the demonstrations can be extremely costly and time-consuming to collect. We introduce MimicGen, a system for automatically synthesizing large-scale, rich datasets from only a small number of human demonstrations by adapting them to new contexts. We use MimicGen to generate over 50K demonstrations across 18 tasks with diverse scene configurations, object instances, and robot arms from just ~200 human demonstrations. We show that robot agents can be effectively trained on this generated dataset by imitation learning to achieve strong performance in long-horizon and high-precision tasks, such as multi-part assembly and coffee preparation, across broad initial state distributions. We further demonstrate that the effectiveness and utility of MimicGen data compare favorably to collecting additional human demonstrations, making it a powerful and economical approach towards scaling up robot learning. Datasets, simulation environments, videos, and more at https://mimicgen.github.io .
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces MimicGen, a system that automatically synthesizes large-scale robot demonstration datasets by adapting a small number (~200) of human demonstrations to new scene configurations, object instances, and robot arms. It generates over 50K demonstrations across 18 tasks and claims that imitation learning agents trained on this data achieve strong performance on long-horizon, high-precision tasks such as multi-part assembly and coffee preparation across broad initial state distributions, with effectiveness and utility that compare favorably to collecting additional human demonstrations.
Significance. If the empirical results hold, the work offers a practical method to scale imitation learning data collection economically, addressing a major bottleneck in robot learning. The scale of generated data, focus on challenging long-horizon tasks, and release of datasets, environments, and videos are strengths that support reproducibility and further research.
major comments (2)
- [§4 and §5] §4 (Experiments) and §5 (Results): The claim that MimicGen data compares favorably to additional human demonstrations (and supports strong performance across broad initial states) is load-bearing but requires explicit quantitative evidence that the adaptation process preserves task success. The manuscript should report the success rate of generated trajectories (e.g., fraction of valid, collision-free executions after rigid transformation and sub-task stitching) versus fresh human data in the target contexts; without this, it is unclear whether the imitation learning results reflect high-quality data or are undermined by invalid trajectories. (A minimal sketch of such a metric follows this list.)
- [§3.2] §3.2 (Adaptation Procedure): The description of context adaptation (object pose changes, new instances, different arms) must detail mechanisms for ensuring kinematic reachability and avoiding collisions or altered contact dynamics. If these are not addressed, the generated dataset may contain a higher fraction of failed executions than real human data, directly affecting the central scalability claim.
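To make the first major comment concrete, here is a minimal sketch of the kind of data-quality report it asks for, assuming each generated episode carries boolean collision_free and task_success flags (illustrative field names, not the paper's):

```python
def data_quality_report(generated_episodes):
    """Fraction of adapted trajectories that are collision-free and succeed.

    generated_episodes is a list of dicts with boolean 'collision_free'
    and 'task_success' entries recorded when each adapted trajectory was
    executed in the target context.
    """
    total = len(generated_episodes)
    collision_free = sum(ep["collision_free"] for ep in generated_episodes)
    successful = sum(ep["task_success"] for ep in generated_episodes)
    return {
        "n_generated": total,
        "collision_free_rate": collision_free / total,
        "success_rate": successful / total,
    }
```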
minor comments (2)
- [Abstract] Abstract: Include at least one key quantitative result (e.g., success rate or performance gap versus baselines) to strengthen the empirical claims made in the opening paragraph.
- [Throughout] Notation and figures: Ensure consistent use of terms such as 'context adaptation' and 'sub-task stitching' across text and figures; add a table summarizing the 18 tasks with their horizon lengths and precision requirements.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback on our manuscript. We have carefully reviewed the major comments and provide point-by-point responses below. We agree that certain clarifications and additions will strengthen the paper and plan to incorporate them in the revision.
Point-by-point responses
-
Referee: [§4 and §5] §4 (Experiments) and §5 (Results): The claim that MimicGen data compares favorably to additional human demonstrations (and supports strong performance across broad initial states) is load-bearing but requires explicit quantitative evidence that the adaptation process preserves task success. The manuscript should report the success rate of generated trajectories (e.g., fraction of valid, collision-free executions after rigid transformation and sub-task stitching) versus fresh human data in the target contexts; without this, it is unclear whether the imitation learning results reflect high-quality data or are undermined by invalid trajectories.
Authors: We agree that directly reporting the success rates of the generated trajectories would provide stronger support for the central claims. The manuscript primarily evaluates data utility through downstream imitation learning performance across broad initial state distributions, which serves as an indirect but practical measure of data quality. However, we acknowledge the value of explicit metrics on adaptation validity. In the revised manuscript, we will add quantitative results (e.g., a table in §4 or §5) reporting the fraction of MimicGen trajectories that are valid, collision-free, and task-successful after adaptation, with direct comparisons to human demonstrations collected in the target contexts. This addition will clarify that the reported IL results are based on high-quality data. revision: yes
-
Referee: [§3.2] §3.2 (Adaptation Procedure): The description of context adaptation (object pose changes, new instances, different arms) must detail mechanisms for ensuring kinematic reachability and avoiding collisions or altered contact dynamics. If these are not addressed, the generated dataset may contain a higher fraction of failed executions than real human data, directly affecting the central scalability claim.
Authors: We thank the referee for this suggestion to enhance the technical description. Section 3.2 currently focuses on the high-level adaptation pipeline (rigid transformations for poses, sub-task segmentation, and stitching). The procedure incorporates inverse kinematics feasibility checks during pose adaptation for different robot arms and uses the underlying simulator to detect and filter collisions or unreachable configurations before including trajectories in the dataset. Contact dynamics are preserved by maintaining relative end-effector trajectories within each sub-task. We will expand §3.2 with additional details on these mechanisms, including the specific reachability checks and filtering steps, to address the concern directly. revision: yes
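A minimal sketch of the filtering step this response describes, assuming an ik_solver callable that returns a joint configuration or None, and a simulator whose rollout method reports task success and collisions; both are placeholders, not the actual MimicGen backends:

```python
def keep_adapted_trajectory(adapted_eef_poses, ik_solver, simulator):
    """Return True only if the adapted trajectory should enter the dataset.

    Two checks mirror the response above: every target end-effector pose
    must be kinematically reachable, and executing the trajectory in the
    simulator must finish the task without collisions.
    """
    # Reachability: reject the trajectory if any waypoint has no IK solution.
    for pose in adapted_eef_poses:
        if ik_solver(pose) is None:
            return False
    # Execution: roll the trajectory out and filter on collision or failure.
    task_success, collided = simulator.rollout(adapted_eef_poses)
    return task_success and not collided
```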
Circularity Check
MimicGen is a practical data synthesis system with no circular derivations or self-confirming modeling.
Full rationale
The paper introduces an engineering system for adapting a small number of human demonstrations to new scene configurations, object instances, and robot arms to synthesize large datasets. No equations, first-principles derivations, fitted parameters, or predictions are described that could reduce to inputs by construction. The central claim rests on empirical results from training imitation learning policies on the generated data, which is externally falsifiable via real-robot or simulator success rates and does not rely on self-citations, uniqueness theorems, or ansatzes from prior author work. This is a standard non-circular systems paper.
Axiom & Free-Parameter Ledger
No axioms or fitted free parameters are recorded for this paper; per the circularity rationale above, the contribution is an engineering system evaluated empirically.
Forward citations
Cited by 19 Pith papers
-
Good in Bad (GiB): Sifting Through End-user Demonstrations for Learning a Better Policy
GiB filters erroneous subtasks from mixed-quality human demonstrations using self-supervised latent features and Mahalanobis distance to train more robust imitation learning policies.
-
DockAnywhere: Data-Efficient Visuomotor Policy Learning for Mobile Manipulation via Novel Demonstration Generation
DockAnywhere lifts single demonstrations to diverse docking points via structure-preserving augmentation and point-cloud spatial editing to improve viewpoint generalization in visuomotor policies for mobile manipulation.
-
Referring-Aware Visuomotor Policy Learning for Closed-Loop Manipulation
ReV is a referring-aware visuomotor policy using coupled diffusion heads for real-time trajectory replanning in robotic manipulation, trained solely via targeted perturbations to expert demonstrations and achieving hi...
-
Good in Bad (GiB): Sifting Through End-user Demonstrations for Learning a Better Policy
GiB uses self-supervised latent features and Mahalanobis distance to filter erroneous subtasks from mixed-quality human demonstrations, improving robot policy learning in simulation and real-world tasks.
-
Lucid-XR: An Extended-Reality Data Engine for Robotic Manipulation
Lucid-XR uses XR-headset physics simulation and physics-guided video generation to create synthetic data that trains robot policies transferring zero-shot to unseen real-world manipulation tasks.
-
Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising
X-WAM unifies robotic action execution and 4D world synthesis by adapting video diffusion priors with a lightweight depth branch and asynchronous noise sampling, achieving 79-91% success on robot benchmarks.
-
Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising
X-WAM unifies real-time robotic action execution with high-fidelity 4D world synthesis by adapting video diffusion priors through lightweight depth branches and asynchronous noise sampling, achieving 79-91% success on...
-
Unmasking the Illusion of Embodied Reasoning in Vision-Language-Action Models
State-of-the-art vision-language-action models catastrophically fail dynamic embodied reasoning due to lexical-kinematic shortcuts, behavioral inertia, and semantic feature collapse caused by architectural bottlenecks...
-
A Mechanistic Analysis of Sim-and-Real Co-Training in Generative Robot Policies
Sim-and-real co-training for robot policies is driven primarily by balanced cross-domain representation alignment and secondarily by domain-dependent action reweighting.
-
WARPED: Wrist-Aligned Rendering for Robot Policy Learning from Egocentric Human Demonstrations
WARPED synthesizes realistic wrist-view observations from monocular egocentric human videos via foundation models, hand-object tracking, retargeting, and Gaussian Splatting to train visuomotor policies that match tele...
-
Generative Simulation for Policy Learning in Physical Human-Robot Interaction
A text-to-simulation pipeline using LLMs and VLMs generates synthetic pHRI data to train vision-based imitation learning policies that achieve over 80% success in zero-shot sim-to-real transfer on real assistive tasks.
-
SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds
SIM1 converts sparse real demonstrations into high-fidelity synthetic data through physics-aligned simulation, yielding policies that match real-data performance at a 1:15 ratio with 90% zero-shot success on deformabl...
-
ExpertGen: Scalable Sim-to-Real Expert Policy Learning from Imperfect Behavior Priors
ExpertGen generates high-success expert policies in simulation from imperfect priors by freezing a diffusion behavior model and optimizing its initial noise via RL, then distills them for real-robot deployment.
-
Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning
Single-stage fine-tuning of a video model to generate actions as latent frames plus future states and values yields state-of-the-art robot policy performance on LIBERO, RoboCasa, and bimanual tasks.
-
IGen: Scalable Data Generation for Robot Learning from Open-World Images
IGen generates realistic visuomotor training data including actions and temporally coherent visuals from unstructured open-world images via 3D reconstruction and VLM reasoning.
-
$\pi_0$: A Vision-Language-Action Flow Model for General Robot Control
π₀ is a vision-language-action flow model trained on diverse multi-platform robot data that supports zero-shot task performance, language instruction following, and efficient fine-tuning for dexterous tasks.
-
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots
RoboCasa supplies a large-scale kitchen simulator, generative assets, 100 tasks, and automated data pipelines that produce a clear scaling trend in imitation learning for generalist robots.
-
EmbodiedClaw: Conversational Workflow Execution for Embodied AI Development
EmbodiedClaw automates embodied AI development workflows through conversation, reducing manual effort and improving consistency and reproducibility.
-
World Action Models: The Next Frontier in Embodied AI
The paper introduces World Action Models as a new paradigm unifying predictive world modeling with action generation in embodied foundation models and provides a taxonomy of existing approaches.