pith. machine review for the scientific record.

arxiv: 2310.17596 · v1 · submitted 2023-10-26 · 💻 cs.RO · cs.AI · cs.CV · cs.LG

Recognition: 2 theorem links · Lean Theorem

MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations

Authors on Pith: no claims yet

Pith reviewed 2026-05-17 09:41 UTC · model grok-4.3

classification 💻 cs.RO · cs.AI · cs.CV · cs.LG
keywords imitation learning · robot learning · data generation · human demonstrations · long-horizon tasks · manipulation · scalable datasets

The pith

MimicGen adapts roughly 200 human demonstrations into over 50,000 varied examples that train robots for long-horizon tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces MimicGen as a system that automatically synthesizes large robot training datasets by taking a small collection of human demonstrations and adapting them to new object placements, scenes, and robot arms. This addresses the bottleneck of expensive and time-consuming data collection for imitation learning, which currently limits scaling to complex behaviors. A sympathetic reader would care because the generated data lets robots learn high-precision, multi-step skills like assembly and coffee preparation across many starting conditions, without needing fresh human demonstrations for every variation. The work shows that imitation learning on the synthetic data reaches strong performance and compares favorably to using additional real demonstrations, pointing to a more practical route for building capable robot agents.

Core claim

MimicGen is a system for automatically synthesizing large-scale, rich datasets from only a small number of human demonstrations by adapting them to new contexts. We use MimicGen to generate over 50K demonstrations across 18 tasks with diverse scene configurations, object instances, and robot arms from just ~200 human demonstrations. We show that robot agents can be effectively trained on this generated dataset by imitation learning to achieve strong performance in long-horizon and high-precision tasks, such as multi-part assembly and coffee preparation, across broad initial state distributions.

What carries the argument

The adaptation process in MimicGen that modifies existing human demonstrations to fit new scene configurations, object instances, and robot arms while preserving the underlying task behavior.
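The mechanism behind this adaptation is, at its core, a frame change: a demonstrated end-effector segment is expressed relative to the reference object in the source scene and re-anchored to that object's pose in the new scene. A minimal sketch of that transform, assuming 4x4 homogeneous poses in the world frame (the function name and array layout are illustrative, not MimicGen's actual interface):

```python
import numpy as np

def adapt_segment(ee_poses_world, obj_pose_src, obj_pose_tgt):
    """Re-target a demonstrated end-effector segment to a new object pose.

    ee_poses_world : (T, 4, 4) end-effector poses from the source demonstration
    obj_pose_src   : (4, 4) pose of the reference object in the source scene
    obj_pose_tgt   : (4, 4) pose of the same object in the new scene
    Returns the (T, 4, 4) poses to execute in the new scene.
    """
    # Express each pose relative to the source object, then re-anchor that
    # relative motion to the target object's pose.
    rel = np.linalg.inv(obj_pose_src)[None] @ ee_poses_world
    return obj_pose_tgt[None] @ rel
```

Because the motion is preserved in the object's frame, a recorded interaction such as a grasp approach carries over to a new object placement without a fresh demonstration; whether the transformed segment is actually executable is a separate question, taken up in the editorial analysis below.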

If this is right

  • Robots achieve strong performance on long-horizon and high-precision tasks when trained via imitation on the generated demonstrations.
  • Training success holds across broad initial state distributions for tasks such as multi-part assembly and coffee preparation.
  • The generated data compares favorably in effectiveness to collecting additional real human demonstrations.
  • Robot learning becomes more economical because large datasets no longer require proportional increases in human collection effort.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same adaptation idea could be applied in physical robot experiments to test whether the performance gains transfer beyond simulation.
  • Combining MimicGen-style generation with other data sources might further reduce the total number of real demonstrations needed for multi-task training.
  • Similar techniques could shorten iteration cycles when researchers want to explore many environment variations without repeating full human data collection.

Load-bearing premise

Data created by adapting human demonstrations to new contexts trains robots as effectively as fresh human demonstrations collected directly in those same contexts.

What would settle it

A side-by-side test in which imitation learning agents trained on an equal volume of newly collected real human demonstrations in the target contexts outperform agents trained on the MimicGen-generated data would falsify the central claim.
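Read concretely, that test is a matched-size comparison. A hedged sketch of such a harness, where `train_bc` and `evaluate` are hypothetical caller-supplied hooks rather than functions from the paper or its released code:

```python
def side_by_side(train_bc, evaluate, human_demos, mimicgen_demos, n_rollouts=50):
    """Train matched-size policies on real vs. generated data and compare them.

    train_bc(demos)              -> policy        (hypothetical hook)
    evaluate(policy, n_rollouts) -> success rate  (hypothetical hook)
    """
    assert len(human_demos) == len(mimicgen_demos), "datasets must be the same size"
    policy_human = train_bc(human_demos)
    policy_synth = train_bc(mimicgen_demos)
    rate_human = evaluate(policy_human, n_rollouts)
    rate_synth = evaluate(policy_synth, n_rollouts)
    # The central claim is falsified if the fresh-human-data policy clearly
    # outperforms the MimicGen-data policy at this matched dataset size.
    return {"human": rate_human, "mimicgen": rate_synth}
```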

read the original abstract

Imitation learning from a large set of human demonstrations has proved to be an effective paradigm for building capable robot agents. However, the demonstrations can be extremely costly and time-consuming to collect. We introduce MimicGen, a system for automatically synthesizing large-scale, rich datasets from only a small number of human demonstrations by adapting them to new contexts. We use MimicGen to generate over 50K demonstrations across 18 tasks with diverse scene configurations, object instances, and robot arms from just ~200 human demonstrations. We show that robot agents can be effectively trained on this generated dataset by imitation learning to achieve strong performance in long-horizon and high-precision tasks, such as multi-part assembly and coffee preparation, across broad initial state distributions. We further demonstrate that the effectiveness and utility of MimicGen data compare favorably to collecting additional human demonstrations, making it a powerful and economical approach towards scaling up robot learning. Datasets, simulation environments, videos, and more at https://mimicgen.github.io .

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces MimicGen, a system that automatically synthesizes large-scale robot demonstration datasets by adapting a small number (~200) of human demonstrations to new scene configurations, object instances, and robot arms. It generates over 50K demonstrations across 18 tasks and claims that imitation learning agents trained on this data achieve strong performance on long-horizon, high-precision tasks such as multi-part assembly and coffee preparation across broad initial state distributions, with effectiveness and utility that compare favorably to collecting additional human demonstrations.

Significance. If the empirical results hold, the work offers a practical method to scale imitation learning data collection economically, addressing a major bottleneck in robot learning. The scale of generated data, focus on challenging long-horizon tasks, and release of datasets, environments, and videos are strengths that support reproducibility and further research.

major comments (2)
  1. [§4 and §5] §4 (Experiments) and §5 (Results): The claim that MimicGen data compares favorably to additional human demonstrations (and supports strong performance across broad initial states) is load-bearing but requires explicit quantitative evidence that the adaptation process preserves task success. The manuscript should report the success rate of generated trajectories (e.g., fraction of valid, collision-free executions after rigid transformation and sub-task stitching) versus fresh human data in the target contexts; without this, it is unclear whether the imitation learning results reflect high-quality data or are undermined by invalid trajectories.
  2. [§3.2] §3.2 (Adaptation Procedure): The description of context adaptation (object pose changes, new instances, different arms) must detail mechanisms for ensuring kinematic reachability and avoiding collisions or altered contact dynamics. If these are not addressed, the generated dataset may contain a higher fraction of failed executions than real human data, directly affecting the central scalability claim.
minor comments (2)
  1. [Abstract] Abstract: Include at least one key quantitative result (e.g., success rate or performance gap versus baselines) to strengthen the empirical claims made in the opening paragraph.
  2. [Throughout] Notation and figures: Ensure consistent use of terms such as 'context adaptation' and 'sub-task stitching' across text and figures; add a table summarizing the 18 tasks with their horizon lengths and precision requirements.
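Major comment 1 asks for an explicit data-quality metric in addition to downstream policy performance. A minimal sketch of what such a report could compute, assuming each generated trajectory has already been replayed in the simulator and tagged with boolean outcome flags (the field names are illustrative, not MimicGen's data format):

```python
def generation_success_metrics(generated_trajs):
    """Fraction of generated trajectories that are collision-free / successful.

    generated_trajs: non-empty list of dicts with boolean "collision_free"
    and "task_success" flags obtained by replaying each trajectory in the
    simulator (hypothetical fields).
    """
    n = len(generated_trajs)
    collision_free = sum(t["collision_free"] for t in generated_trajs)
    task_success = sum(t["task_success"] for t in generated_trajs)
    return {
        "n_generated": n,
        "frac_collision_free": collision_free / n,
        "frac_task_success": task_success / n,
    }
```

Reporting these fractions next to the imitation learning results would separate "the generated trajectories are valid" from "policies trained on them succeed".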

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback on our manuscript. We have carefully reviewed the major comments and provide point-by-point responses below. We agree that certain clarifications and additions will strengthen the paper and plan to incorporate them in the revision.

read point-by-point responses
  1. Referee: [§4 and §5] §4 (Experiments) and §5 (Results): The claim that MimicGen data compares favorably to additional human demonstrations (and supports strong performance across broad initial states) is load-bearing but requires explicit quantitative evidence that the adaptation process preserves task success. The manuscript should report the success rate of generated trajectories (e.g., fraction of valid, collision-free executions after rigid transformation and sub-task stitching) versus fresh human data in the target contexts; without this, it is unclear whether the imitation learning results reflect high-quality data or are undermined by invalid trajectories.

    Authors: We agree that directly reporting the success rates of the generated trajectories would provide stronger support for the central claims. The manuscript primarily evaluates data utility through downstream imitation learning performance across broad initial state distributions, which serves as an indirect but practical measure of data quality. However, we acknowledge the value of explicit metrics on adaptation validity. In the revised manuscript, we will add quantitative results (e.g., a table in §4 or §5) reporting the fraction of MimicGen trajectories that are valid, collision-free, and task-successful after adaptation, with direct comparisons to human demonstrations collected in the target contexts. This addition will clarify that the reported IL results are based on high-quality data. revision: yes

  2. Referee: [§3.2] §3.2 (Adaptation Procedure): The description of context adaptation (object pose changes, new instances, different arms) must detail mechanisms for ensuring kinematic reachability and avoiding collisions or altered contact dynamics. If these are not addressed, the generated dataset may contain a higher fraction of failed executions than real human data, directly affecting the central scalability claim.

    Authors: We thank the referee for this suggestion to enhance the technical description. Section 3.2 currently focuses on the high-level adaptation pipeline (rigid transformations for poses, sub-task segmentation, and stitching). The procedure incorporates inverse kinematics feasibility checks during pose adaptation for different robot arms and uses the underlying simulator to detect and filter collisions or unreachable configurations before including trajectories in the dataset. Contact dynamics are preserved by maintaining relative end-effector trajectories within each sub-task. We will expand §3.2 with additional details on these mechanisms, including the specific reachability checks and filtering steps, to address the concern directly. revision: yes
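Taken together, the two responses describe a generate-and-filter loop: transform each sub-task segment to the new context, reject candidates that fail inverse-kinematics reachability, and keep only demonstrations whose simulated replay is collision-free and ends in task success. A hedged sketch under those assumptions, reusing `adapt_segment` from the earlier block and treating `ik_reachable` and `replay_in_sim` as hypothetical hooks around the IK solver and the simulator:

```python
def generate_demo(segments, obj_poses_src, obj_poses_tgt, ik_reachable, replay_in_sim):
    """Adapt one demonstration to a new context and keep it only if feasible.

    segments      : list of (T_i, 4, 4) end-effector sub-task segments
    obj_poses_src : list of (4, 4) reference-object poses in the source scene
    obj_poses_tgt : list of (4, 4) reference-object poses in the new scene
    ik_reachable  : hypothetical hook, pose -> bool (arm can reach this pose)
    replay_in_sim : hypothetical hook, segments -> bool (replay is collision-free
                    and ends in task success)
    """
    adapted = []
    for seg, src, tgt in zip(segments, obj_poses_src, obj_poses_tgt):
        new_seg = adapt_segment(seg, src, tgt)  # transform sketched earlier
        # Discard the whole candidate if any waypoint is kinematically unreachable.
        if not all(ik_reachable(pose) for pose in new_seg):
            return None
        adapted.append(new_seg)
    # Keep the demonstration only if its simulated replay succeeds.
    return adapted if replay_in_sim(adapted) else None
```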

Circularity Check

0 steps flagged

MimicGen is a practical data synthesis system with no circular derivations or modeling assumptions.

full rationale

The paper introduces an engineering system for adapting a small number of human demonstrations to new scene configurations, object instances, and robot arms to synthesize large datasets. No equations, first-principles derivations, fitted parameters, or predictions are described that could reduce to inputs by construction. The central claim rests on empirical results from training imitation learning policies on the generated data, which is externally falsifiable via real-robot or simulator success rates and does not rely on self-citations, uniqueness theorems, or ansatzes from prior author work. This is a standard non-circular systems paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical model or parameters are described in the abstract; the work is an applied system for data generation in robotics.

pith-pipeline@v0.9.0 · 5510 in / 1021 out tokens · 58671 ms · 2026-05-17T09:41:01.720309+00:00 · methodology

Forward citations

Cited by 19 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Good in Bad (GiB): Sifting Through End-user Demonstrations for Learning a Better Policy

    cs.RO 2026-05 unverdicted novelty 7.0

    GiB filters erroneous subtasks from mixed-quality human demonstrations using self-supervised latent features and Mahalanobis distance to train more robust imitation learning policies.

  2. DockAnywhere: Data-Efficient Visuomotor Policy Learning for Mobile Manipulation via Novel Demonstration Generation

    cs.RO 2026-04 unverdicted novelty 7.0

    DockAnywhere lifts single demonstrations to diverse docking points via structure-preserving augmentation and point-cloud spatial editing to improve viewpoint generalization in visuomotor policies for mobile manipulation.

  3. Referring-Aware Visuomotor Policy Learning for Closed-Loop Manipulation

    cs.RO 2026-04 unverdicted novelty 7.0

    ReV is a referring-aware visuomotor policy using coupled diffusion heads for real-time trajectory replanning in robotic manipulation, trained solely via targeted perturbations to expert demonstrations and achieving hi...

  4. Good in Bad (GiB): Sifting Through End-user Demonstrations for Learning a Better Policy

    cs.RO 2026-05 unverdicted novelty 6.0

    GiB uses self-supervised latent features and Mahalanobis distance to filter erroneous subtasks from mixed-quality human demonstrations, improving robot policy learning in simulation and real-world tasks.

  5. Lucid-XR: An Extended-Reality Data Engine for Robotic Manipulation

    cs.RO 2026-04 unverdicted novelty 6.0

    Lucid-XR uses XR-headset physics simulation and physics-guided video generation to create synthetic data that trains robot policies transferring zero-shot to unseen real-world manipulation tasks.

  6. Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising

    cs.RO 2026-04 unverdicted novelty 6.0

    X-WAM unifies robotic action execution and 4D world synthesis by adapting video diffusion priors with a lightweight depth branch and asynchronous noise sampling, achieving 79-91% success on robot benchmarks.

  7. Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising

    cs.RO 2026-04 unverdicted novelty 6.0

    X-WAM unifies real-time robotic action execution with high-fidelity 4D world synthesis by adapting video diffusion priors through lightweight depth branches and asynchronous noise sampling, achieving 79-91% success on...

  8. Unmasking the Illusion of Embodied Reasoning in Vision-Language-Action Models

    cs.RO 2026-04 unverdicted novelty 6.0

    State-of-the-art vision-language-action models catastrophically fail dynamic embodied reasoning due to lexical-kinematic shortcuts, behavioral inertia, and semantic feature collapse caused by architectural bottlenecks...

  9. A Mechanistic Analysis of Sim-and-Real Co-Training in Generative Robot Policies

    cs.RO 2026-04 unverdicted novelty 6.0

    Sim-and-real co-training for robot policies is driven primarily by balanced cross-domain representation alignment and secondarily by domain-dependent action reweighting.

  10. WARPED: Wrist-Aligned Rendering for Robot Policy Learning from Egocentric Human Demonstrations

    cs.RO 2026-04 unverdicted novelty 6.0

    WARPED synthesizes realistic wrist-view observations from monocular egocentric human videos via foundation models, hand-object tracking, retargeting, and Gaussian Splatting to train visuomotor policies that match tele...

  11. Generative Simulation for Policy Learning in Physical Human-Robot Interaction

    cs.RO 2026-04 unverdicted novelty 6.0

    A text-to-simulation pipeline using LLMs and VLMs generates synthetic pHRI data to train vision-based imitation learning policies that achieve over 80% success in zero-shot sim-to-real transfer on real assistive tasks.

  12. SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds

    cs.RO 2026-04 unverdicted novelty 6.0

    SIM1 converts sparse real demonstrations into high-fidelity synthetic data through physics-aligned simulation, yielding policies that match real-data performance at a 1:15 ratio with 90% zero-shot success on deformabl...

  13. ExpertGen: Scalable Sim-to-Real Expert Policy Learning from Imperfect Behavior Priors

    cs.RO 2026-03 conditional novelty 6.0

    ExpertGen generates high-success expert policies in simulation from imperfect priors by freezing a diffusion behavior model and optimizing its initial noise via RL, then distills them for real-robot deployment.

  14. Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning

    cs.AI 2026-01 conditional novelty 6.0

    Single-stage fine-tuning of a video model to generate actions as latent frames plus future states and values yields state-of-the-art robot policy performance on LIBERO, RoboCasa, and bimanual tasks.

  15. IGen: Scalable Data Generation for Robot Learning from Open-World Images

    cs.RO 2025-12 unverdicted novelty 6.0

    IGen generates realistic visuomotor training data including actions and temporally coherent visuals from unstructured open-world images via 3D reconstruction and VLM reasoning.

  16. $\pi_0$: A Vision-Language-Action Flow Model for General Robot Control

    cs.LG 2024-10 unverdicted novelty 6.0

    π₀ is a vision-language-action flow model trained on diverse multi-platform robot data that supports zero-shot task performance, language instruction following, and efficient fine-tuning for dexterous tasks.

  17. RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots

    cs.RO 2024-06 unverdicted novelty 6.0

    RoboCasa supplies a large-scale kitchen simulator, generative assets, 100 tasks, and automated data pipelines that produce a clear scaling trend in imitation learning for generalist robots.

  18. EmbodiedClaw: Conversational Workflow Execution for Embodied AI Development

    cs.RO 2026-04 unverdicted novelty 5.0

    EmbodiedClaw automates embodied AI development workflows through conversation, reducing manual effort and improving consistency and reproducibility.

  19. World Action Models: The Next Frontier in Embodied AI

    cs.RO 2026-05 unverdicted novelty 4.0

    The paper introduces World Action Models as a new paradigm unifying predictive world modeling with action generation in embodied foundation models and provides a taxonomy of existing approaches.

Reference graph

Works this paper leans on

128 extracted references · 128 canonical work pages · cited by 17 Pith papers · 11 internal anchors
