robosuite: A Modular Simulation Framework and Benchmark for Robot Learning

Abhiram Maddukuri; Abhishek Joshi; Ajay Mandlekar; Josiah Wong; Kevin Lin; Roberto Martín-Martín; Soroush Nasiriany; Yifeng Zhu; Yuke Zhu

arXiv: 2009.12293 · v3 · submitted 2020-09-25 · cs.RO · cs.AI · cs.LG

Yuke Zhu, Josiah Wong, Ajay Mandlekar, Roberto Martín-Martín, Abhishek Joshi, Kevin Lin, Abhiram Maddukuri, Soroush Nasiriany, Yifeng Zhu

Pith reviewed 2026-05-12 22:31 UTC · model grok-4.3

keywords: robot learning · simulation framework · MuJoCo · benchmark environments · modular design · reproducible research · robotic tasks

The pith

robosuite is a modular simulation framework powered by MuJoCo that supplies benchmark environments for reproducible robot learning research.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents robosuite as a simulation framework for robot learning built on the MuJoCo physics engine. It emphasizes a modular design that lets users assemble and customize robotic tasks from reusable components. The release also includes a collection of standard benchmark environments intended to make experimental results comparable across different research groups. A reader would care because robot learning experiments often rely on bespoke simulation setups that prevent direct comparisons and slow collective progress.
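To make the idea of assembling tasks from reusable components concrete, here is a minimal, self-contained sketch in plain Python. Every name in it (`Robot`, `Controller`, `Task`, `Environment`, `make_env`, and the registry entries) is hypothetical and chosen for illustration; none of it is robosuite's actual API, and the real framework's module boundaries may differ.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Tuple


# Hypothetical component types for illustration only; NOT robosuite classes.
@dataclass(frozen=True)
class Robot:
    name: str
    dof: int  # number of actuated joints


@dataclass(frozen=True)
class Controller:
    name: str
    action_dim: Callable[[Robot], int]  # action size may depend on the robot


@dataclass(frozen=True)
class Task:
    name: str
    objects: Tuple[str, ...]


@dataclass(frozen=True)
class Environment:
    robot: Robot
    controller: Controller
    task: Task

    def describe(self) -> str:
        dim = self.controller.action_dim(self.robot)
        return (f"{self.task.name} with {self.robot.name} "
                f"({self.controller.name}, action_dim={dim})")


# Small registries of interchangeable components (made-up entries).
ROBOTS = {"arm7": Robot("arm7", dof=7), "arm6": Robot("arm6", dof=6)}
CONTROLLERS = {
    # Joint-space control: one action per joint.
    "joint_position": Controller("joint_position", lambda r: r.dof),
    # End-effector pose control: 3 translation + 3 rotation, robot-independent.
    "ee_pose": Controller("ee_pose", lambda r: 6),
}
TASKS = {
    "lift": Task("lift", ("cube",)),
    "stack": Task("stack", ("cube_a", "cube_b")),
}


def make_env(task: str, robot: str, controller: str) -> Environment:
    """Compose an environment by looking up each component by name."""
    return Environment(ROBOTS[robot], CONTROLLERS[controller], TASKS[task])


# Every combination of registered components yields a valid environment.
ALL_ENVS = [make_env(t, r, c) for t, r, c in product(TASKS, ROBOTS, CONTROLLERS)]

for env in ALL_ENVS:
    print(env.describe())
```

The point of the sketch is the design choice, not the details: because each component is looked up independently, adding one new robot or controller to a registry extends every existing task without touching task code.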

Core claim

The authors establish that robosuite v1.5 delivers key system modules supporting modular task creation alongside a suite of benchmark environments, enabling researchers to define custom robotic tasks and run reproducible learning experiments without rebuilding simulation infrastructure from scratch.

What carries the argument

The system modules that let users assemble robotic tasks from reusable components, combined with the provided suite of benchmark environments.

If this is right

Researchers can compose new robotic tasks by combining existing modules instead of starting from zero.
Standard benchmark environments allow direct side-by-side comparison of different learning algorithms.
Reproducible simulation setups reduce the time spent on infrastructure and increase time available for algorithm development.
Consistent environments support cumulative progress because results from one paper can be verified or extended by others.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Widespread use could reduce duplication of effort across labs by providing a shared simulation base.
The same modular structure might later support easier sim-to-real transfer once real-robot interfaces are added.
Benchmark results could serve as a common reference point for comparing learning methods that currently rely on private environments.

Load-bearing premise

That researchers will adopt the modular architecture and benchmark environments without needing to write substantial additional custom code for their own tasks.

What would settle it

A survey or usage study in which most researchers report that they must still implement large amounts of custom simulation code to match their experimental needs, or in which benchmark results prove difficult to reproduce across independent implementations.

Original abstract

robosuite is a simulation framework for robot learning powered by the MuJoCo physics engine. It offers a modular design for creating robotic tasks as well as a suite of benchmark environments for reproducible research. This paper discusses the key system modules and the benchmark environments of our new release robosuite v1.5.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

robosuite v1.5 is a practical modular extension to MuJoCo for standardizing robot learning benchmarks and task creation, helpful for the subfield but incremental rather than transformative.

robosuite v1.5 is a modular simulation framework built on MuJoCo that provides tools for creating robotic tasks and a collection of benchmark environments aimed at making robot learning research more reproducible.

The main contribution here is the updated release with its emphasis on modularity. This allows researchers to assemble different components like robots, objects, and controllers in a flexible way rather than starting from scratch each time. The benchmark suite includes standard tasks that can serve as common testbeds, which is the sort of thing that helps when comparing different learning algorithms across papers. The description of the system modules is straightforward and should make it easier for users to extend the framework.

That said, the paper is primarily a description of the software rather than a report of new experiments or theoretical results. There's no detailed validation of how well the modularity reduces setup time or improves reproducibility in practice, which is typical for these kinds of releases but leaves some questions about real-world adoption. The assumption that this will be sufficient for most users without additional custom code might hold for basic tasks but could be tested further.

This paper is for anyone working on robot learning who uses simulation for training policies or testing algorithms. It provides a practical resource that could save time on infrastructure. I think it deserves serious peer review because shared benchmarks and tools like this organize the field and enable better comparisons, even if the advance is incremental.

Referee Report

0 major / 3 minor

Summary. The paper introduces robosuite v1.5, a modular simulation framework for robot learning powered by the MuJoCo physics engine. It describes the key system modules for task creation and presents a suite of benchmark environments intended to support reproducible research in the field.

Significance. If the described modular architecture and benchmarks function as outlined, the framework could provide a standardized platform that reduces the need for custom simulation code, thereby improving reproducibility across robot learning studies. The release of an open tool with explicit benchmark support is a practical contribution to the community.

minor comments (3)

[Abstract] The claim that the framework offers 'a suite of benchmark environments for reproducible research' would be strengthened by briefly noting the specific tasks included (e.g., manipulation, locomotion) and any quantitative validation of their stability or fidelity.
The manuscript should include a dedicated section or table comparing robosuite v1.5 features against prior versions or alternative simulators (e.g., PyBullet, Gazebo) to clarify incremental advances.
Ensure that all module descriptions cite the corresponding source files or API references so readers can directly inspect the implementation.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary and significance assessment of our work on robosuite v1.5. The recommendation for minor revision is noted. As no specific major comments were provided in the report, we have no substantive points to address and believe the manuscript requires no technical revisions.

Circularity Check

0 steps flagged

No circularity: purely descriptive software framework paper

The manuscript is a software release note for robosuite v1.5. It describes the modular architecture, MuJoCo integration, task-creation utilities, and benchmark environments without any derivations, equations, fitted parameters, predictions, or uniqueness theorems. No load-bearing self-citations or ansatzes appear; the central claim is simply that the described interfaces exist and are exposed. This is self-contained descriptive documentation rather than a chain of inferences that could reduce to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a software framework description paper containing no mathematical derivations, fitted parameters, or postulated entities.

pith-pipeline@v0.9.0 · 5375 in / 1044 out tokens · 46507 ms · 2026-05-12T22:31:41.630161+00:00 · methodology

Forward citations

Cited by 60 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

TAVIS: A Benchmark for Egocentric Active Vision and Anticipatory Gaze in Imitation Learning
cs.RO 2026-05 accept novelty 8.0

TAVIS is a released benchmark showing active vision improves imitation learning in a task-dependent manner, multi-task policies struggle with shifts, and imitation produces human-like anticipatory gaze.
RoboLab: A High-Fidelity Simulation Benchmark for Analysis of Task Generalist Policies
cs.RO 2026-04 unverdicted novelty 8.0

RoboLab is a new simulation benchmark with 120 tasks across visual, procedural, and relational axes that quantifies generalization gaps and perturbation sensitivity in task-generalist robotic policies.
BEHAVIOR-1K: A Human-Centered, Embodied AI Benchmark with 1,000 Everyday Activities and Realistic Simulation
cs.RO 2024-03 accept novelty 8.0

BEHAVIOR-1K introduces a benchmark of 1,000 human everyday activities in realistic simulated scenes together with the OMNIGIBSON physics simulator to evaluate embodied AI.
LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning
cs.AI 2023-06 conditional novelty 8.0

LIBERO is a new benchmark for lifelong robot learning that evaluates transfer of declarative, procedural, and mixed knowledge across 130 manipulation tasks with provided demonstration data.
CapVector: Learning Transferable Capability Vectors in Parametric Space for Vision-Language-Action Models
cs.CV 2026-05 unverdicted novelty 7.0

Capability vectors extracted from parameter differences between standard and auxiliary-finetuned VLA models can be merged into pretrained weights to match auxiliary-training performance while reducing computational ov...
CoRAL: Contact-Rich Adaptive LLM-based Control for Robotic Manipulation
cs.RO 2026-05 unverdicted novelty 7.0

CoRAL lets LLMs act as adaptive cost designers for motion planners while using VLM priors and online identification to handle unknown physics, achieving over 50% higher success rates than baselines in unseen contact-r...
Atomic-Probe Governance for Skill Updates in Compositional Robot Policies
cs.RO 2026-04 unverdicted novelty 7.0

A cross-version swap protocol reveals dominant skills that swing composition success by up to 50 percentage points, and an atomic probe with selective revalidation governs updates at lower cost than always re-testing ...
HANDFUL: Sequential Grasp-Conditioned Dexterous Manipulation with Resource Awareness
cs.RO 2026-04 unverdicted novelty 7.0

HANDFUL learns resource-aware grasps using finger contact rewards and curriculum learning to improve success on sequential dexterous tasks in simulation and on a real LEAP hand.
Agent-Centric Observation Adaptation for Robust Visual Control under Dynamic Perturbations
cs.RO 2026-04 unverdicted novelty 7.0

ACO-MoE employs agent-centric mixture-of-experts to decouple task-relevant features from dynamic visual perturbations in RL, recovering 95.3% of clean performance on the new VDCS benchmark.
Agent-Centric Observation Adaptation for Robust Visual Control under Dynamic Perturbations
cs.RO 2026-04 unverdicted novelty 7.0

ACO-MoE recovers 95.3% of clean-input performance in visual control tasks under Markov-switching corruptions by routing restoration experts and anchoring representations to clean foreground masks.
BiCoord: A Bimanual Manipulation Benchmark towards Long-Horizon Spatial-Temporal Coordination
cs.RO 2026-04 conditional novelty 7.0

BiCoord is a new benchmark for long-horizon tightly coordinated bimanual manipulation that includes quantitative metrics and shows existing policies like DP, RDT, Pi0 and OpenVLA-OFT struggle on such tasks.
Towards Generalizable Robotic Manipulation in Dynamic Environments
cs.CV 2026-03 unverdicted novelty 7.0

DOMINO dataset and PUMA architecture enable better dynamic robotic manipulation by incorporating motion history, delivering 6.3% higher success rates than prior VLA models.
ST-BiBench: Benchmarking Multi-Stream Multimodal Coordination in Bimanual Embodied Tasks for MLLMs
cs.RO 2026-02 unverdicted novelty 7.0

ST-BiBench reveals a coordination paradox in which MLLMs show strong high-level strategic reasoning yet fail at fine-grained 16-dimensional bimanual action synthesis and multi-stream fusion.
MIMIC-D: Multi-modal Imitation for MultI-agent Coordination with Decentralized Diffusion Policies
cs.RO 2025-09 unverdicted novelty 7.0

MIMIC-D enables multi-modal multi-agent coordination via joint training of decentralized diffusion policies using only local information.
Voyager: An Open-Ended Embodied Agent with Large Language Models
cs.AI 2023-05 unverdicted novelty 7.0

Voyager achieves superior lifelong learning in Minecraft by combining an automatic exploration curriculum, a library of executable skills, and iterative LLM prompting with environment feedback, yielding 3.3x more uniq...
Behavior-Consistent Deep Reinforcement Learning
cs.LG 2026-05 unverdicted novelty 6.0

QED bounds cross-run KL divergence in Boltzmann policies by setting temperature proportional to Q-disagreement and reduces return variance by two orders of magnitude on 18 continuous-control tasks without performance loss.
Behavior-Consistent Deep Reinforcement Learning
cs.LG 2026-05 unverdicted novelty 6.0

QED sets state-dependent temperature proportional to double-critic disagreement to bound pairwise KL divergence between Boltzmann policies, cutting cross-run divergence by two orders of magnitude on 18 continuous-cont...
Beyond Action Residuals: Real-World Robot Policy Steering via Bottleneck Latent Reinforcement Learning
cs.RO 2026-05 unverdicted novelty 6.0

ZPRL adapts frozen flow-matching imitation policies via RL perturbations on a task-relevant bottleneck latent, yielding 33.7% higher average success on four real-world manipulation tasks than action-residual baselines.
COBALT: Crowdsourcing Robot Learning via Cloud-Based Teleoperation with Smartphones
cs.RO 2026-05 conditional novelty 6.0

COBALT enables scalable crowdsourced teleoperation of robots using smartphones, supporting concurrent users with low latency and yielding a 7500+ demonstration dataset validated on imitation learning tasks.
COBALT: Crowdsourcing Robot Learning via Cloud-Based Teleoperation with Smartphones
cs.RO 2026-05 unverdicted novelty 6.0

COBALT provides scalable cloud infrastructure for crowdsourced robot teleoperation via smartphones, supporting concurrent users with low latency and enabling collection of a 7500+ demonstration dataset validated throu...
DexJoCo: A Benchmark and Toolkit for Task-Oriented Dexterous Manipulation on MuJoCo
cs.RO 2026-05 conditional novelty 6.0

DexJoCo is a benchmark and toolkit with 11 functionally grounded tasks, 1.1K trajectories, and empirical benchmarks for task-oriented dexterous manipulation on MuJoCo.
Ada-Diffuser: Latent-Aware Adaptive Diffusion for Decision-Making
cs.LG 2026-05 unverdicted novelty 6.0

Ada-Diffuser is a causal diffusion model that jointly learns observed interaction structure and underlying latent dynamics from minimal observations for adaptive planning and policy learning.
HeteroGenManip: Generalizable Manipulation For Heterogeneous Object Interactions
cs.RO 2026-05 unverdicted novelty 6.0

A task-conditioned two-stage system decouples grasp localization from interaction trajectory planning using specialized foundation models to improve generalization across heterogeneous object types.
HeteroGenManip: Generalizable Manipulation For Heterogeneous Object Interactions
cs.RO 2026-05 unverdicted novelty 6.0

HeteroGenManip decouples grasp localization from interaction planning using task-conditioned foundation models and multi-model diffusion policies, delivering 31% average gains in broad simulation tasks and 36.7% in fo...
Kintsugi: Learning Policies by Repairing Executable Knowledge Bases
cs.LG 2026-05 unverdicted novelty 6.0

Kintsugi learns policies by repairing composable executable knowledge bases through agentic diagnosis, localized typed edits, and deterministic verification gates that admit only improvements.
BEACON: Cross-Domain Co-Training of Generative Robot Policies via Best-Effort Adaptation
cs.RO 2026-05 unverdicted novelty 6.0

BEACON uses discrepancy-aware importance reweighting to co-train generative robot policies from abundant source and limited target demonstrations, yielding better robustness and implicit feature alignment.
BEACON: Cross-Domain Co-Training of Generative Robot Policies via Best-Effort Adaptation
cs.RO 2026-05 unverdicted novelty 6.0

BEACON uses discrepancy-aware importance reweighting to jointly train diffusion-based robot policies and source sample weights, improving performance over target-only and fixed-ratio baselines in cross-domain manipula...
How to Utilize Failure Demo Data?: Effective Data Selection for Imitation Learning Using Distribution Differences in Attention Mechanism
cs.RO 2026-05 unverdicted novelty 6.0

The method uses attention discrepancy metrics on latent success-failure representations to select beneficial failure data for imitation learning, raising task success rates in simulations.
Atomic-Probe Governance for Skill Updates in Compositional Robot Policies
cs.RO 2026-04 unverdicted novelty 6.0

Empirical study on robosuite tasks reveals a dominant-skill effect in compositions and shows that an atomic probe approximates full revalidation for skill updates at much lower cost.
GS-Playground: A High-Throughput Photorealistic Simulator for Vision-Informed Robot Learning
cs.RO 2026-04 unverdicted novelty 6.0

GS-Playground delivers a high-throughput photorealistic simulator for vision-informed robot learning via parallel physics integrated with batch 3D Gaussian Splatting at 10^4 FPS and an automated Real2Sim workflow for ...
Visual-Tactile Peg-in-Hole Assembly Learning from Peg-out-of-Hole Disassembly
cs.RO 2026-04 unverdicted novelty 6.0

A visual-tactile RL method learns peg-in-hole assembly from reversed peg-out-of-hole disassembly trajectories, reaching 87.5% success on seen objects and 77.1% on unseen objects while lowering contact forces.
A Mechanistic Analysis of Sim-and-Real Co-Training in Generative Robot Policies
cs.RO 2026-04 unverdicted novelty 6.0

Sim-and-real co-training for robot policies is driven primarily by balanced cross-domain representation alignment and secondarily by domain-dependent action reweighting.
RoboLab: A High-Fidelity Simulation Benchmark for Analysis of Task Generalist Policies
cs.RO 2026-04 unverdicted novelty 6.0

RoboLab is a photorealistic simulation benchmark with 120 tasks and perturbation analysis to evaluate true generalization and robustness of robotic foundation models.
Learning Without Losing Identity: Capability Evolution for Embodied Agents
cs.RO 2026-04 unverdicted novelty 6.0

Embodied agents maintain a persistent identity while evolving capabilities via modular ECMs, raising simulated task success from 32.4% to 91.3% over 20 iterations with zero policy drift or safety violations.
Learning Without Losing Identity: Capability Evolution for Embodied Agents
cs.RO 2026-04 unverdicted novelty 6.0

Embodied agents maintain persistent identity while evolving modular capabilities through a closed-loop process, raising simulated task success from 32.4% to 91.3% with zero policy drift.
RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies
cs.RO 2026-03 unverdicted novelty 6.0

RoboMME is a new benchmark with 16 tasks and 14 memory-augmented VLA variants that shows memory effectiveness is highly task-dependent.
Unify Robot Actions in Camera Frame
cs.RO 2025-11 conditional novelty 6.0

CalibAll estimates camera extrinsics on existing datasets to convert robot actions into a unified camera-frame representation, enabling stronger cross-embodiment pretraining.
RoboEval: Where Robotic Manipulation Meets Structured and Scalable Evaluation
cs.RO 2025-07 unverdicted novelty 6.0

RoboEval is a new benchmark providing eight bimanual tasks, thousands of expert demonstrations, and standardized metrics for efficiency, coordination, safety, and failure localization in robotic manipulation.
RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation
cs.RO 2025-06 unverdicted novelty 6.0

RoboTwin 2.0 automates diverse synthetic data creation for dual-arm robots via MLLMs and five-axis domain randomization, leading to 228-367% gains in manipulation success.
From Action Labels to Sets: Rethinking Action Supervision for Imitation Learning from Corrective Feedback
cs.RO 2025-02 unverdicted novelty 6.0

CLIC uses set-valued action targets from interactive human corrections instead of pointwise labels to train more robust imitation learning policies.
RoboMD: Uncovering Robot Vulnerabilities through Semantic Potential Fields
cs.RO 2024-12 unverdicted novelty 6.0

A deep RL vulnerability-prediction policy trained in semantic embedding space finds up to 23% more unique robot manipulation failures than vision-language baselines and enables more efficient fine-tuning.
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots
cs.RO 2024-06 unverdicted novelty 6.0

RoboCasa supplies a large-scale kitchen simulator, generative assets, 100 tasks, and automated data pipelines that produce a clear scaling trend in imitation learning for generalist robots.
Evaluating Real-World Robot Manipulation Policies in Simulation
cs.RO 2024-05 conditional novelty 6.0

SIMPLER simulated environments yield policy performance that correlates strongly with real-world robot manipulation results and captures similar sensitivity to distribution shifts.
3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations
cs.RO 2024-03 unverdicted novelty 6.0

DP3 uses compact 3D representations from sparse point clouds inside diffusion policies to learn generalizable visuomotor skills from few demonstrations, reporting 24% gains in simulation and 85% success on real robots.
What Matters in Learning from Offline Human Demonstrations for Robot Manipulation
cs.RO 2021-08 accept novelty 6.0

A comprehensive benchmark study of offline imitation learning methods on multi-stage robot manipulation tasks identifies key sensitivities to algorithm design, data quality, and stopping criteria while releasing all d...
ComPose: When to Trust Hands for Object Pose Tracking
cs.CV 2026-05 unverdicted novelty 5.0

ComPose tracks object poses in hand-occluded RGB videos by adaptively fusing cues from object and hand foundation models, selecting informative joints, and enforcing temporal consistency without external smoothing.
stable-worldmodel: A Platform for Reproducible World Modeling Research and Evaluation
cs.LG 2026-05 unverdicted novelty 5.0

The paper presents stable-worldmodel (swm), a platform with high-performance data layer, modern world model baselines, planning solvers, and extended environments for reproducible research and generalization evaluation.
OrbiSim: World Models as Differentiable Physics Engines for Embodied Intelligence
cs.RO 2026-05 unverdicted novelty 5.0

OrbiSim builds a differentiable physics engine from world models to support gradient-based policy optimization and contact modeling in robotics.
Nautilus: From One Prompt to Plug-and-Play Robot Learning
cs.RO 2026-05 unverdicted novelty 5.0

NAUTILUS is a prompt-driven harness that automates plug-and-play adapters, typed contracts, and validation for policies, benchmarks, and robots in learning research.
How to Utilize Failure Demo Data?: Effective Data Selection for Imitation Learning Using Distribution Differences in Attention Mechanism
cs.RO 2026-05 unverdicted novelty 5.0

A method for imitation learning that learns latent success-failure discrepancy representations in attention and uses an attention-based metric to select beneficial failure demonstrations for improved task performance ...
CoRAL: Contact-Rich Adaptive LLM-based Control for Robotic Manipulation
cs.RO 2026-05 unverdicted novelty 5.0

CoRAL lets LLMs design objective functions for robot motion planners and uses vision-language models plus real-time identification to adapt to unknown physical properties, raising success rates by over 50 percent on n...
E$^2$DT: Efficient and Effective Decision Transformer with Experience-Aware Sampling for Robotic Manipulation
cs.RO 2026-04 unverdicted novelty 5.0

E²DT couples a Decision Transformer with a k-Determinantal Point Process that scores trajectories on return-to-go quantiles, predictive uncertainty, and stage coverage to improve sample efficiency and policy quality i...
AEGIS: Anchor-Enforced Gradient Isolation for Knowledge-Preserving Vision-Language-Action Fine-Tuning
cs.LG 2026-04 unverdicted novelty 5.0

AEGIS uses a pre-computed Gaussian anchor and layer-wise Gram-Schmidt orthogonal projections to isolate destructive gradients during VLA fine-tuning, preserving VQA performance without co-training or replay.
EmbodiedClaw: Conversational Workflow Execution for Embodied AI Development
cs.RO 2026-04 unverdicted novelty 5.0

EmbodiedClaw automates embodied AI development workflows through conversation, reducing manual effort and improving consistency and reproducibility.
From Pixels to Digital Agents: An Empirical Study on the Taxonomy and Technological Trends of Reinforcement Learning Environments
cs.AI 2026-03 unverdicted novelty 5.0

An empirical literature analysis reveals a bifurcation in RL environments into Semantic Prior (LLM-dominated) and Domain-Specific Generalization ecosystems with distinct cognitive fingerprints.
What Drives Success in Physical Planning with Joint-Embedding Predictive World Models?
cs.AI 2025-12 unverdicted novelty 5.0

An empirical study of JEPA world models identifies architecture, training objective, and planning choices that yield a model outperforming DINO-WM and V-JEPA-2-AC on navigation and manipulation tasks.
SlotVLA: Towards Modeling of Object-Relation Representations in Robotic Manipulation
cs.RO 2025-11 unverdicted novelty 5.0

SlotVLA uses slot attention to model object-relation representations for multitask robotic manipulation, reducing visual tokens while achieving competitive generalization on the new LIBERO+ benchmark.
Robust and Resilient Soft Robotic Object Insertion with Compliance-Enabled Contact Formation and Failure Recovery
cs.RO 2025-09 unverdicted novelty 5.0

A passively compliant soft wrist structures insertion as sequential contact formations and uses a VLM to recover from failures, reaching 83% success in simulation across randomized grasp, pose, friction, and shape var...
A Careful Examination of Large Behavior Models for Multitask Dexterous Manipulation
cs.RO 2025-07 accept novelty 5.0

Multi-task pretraining of diffusion policies on diverse robot data produces more successful, robust, and data-efficient policies for dexterous manipulation than single-task baselines, with performance scaling with pre...
Unreal Robotics Lab: A High-Fidelity Robotics Simulator with Advanced Physics and Rendering
cs.RO 2025-04 unverdicted novelty 5.0

Unreal Robotics Lab integrates Unreal Engine rendering with MuJoCo physics to enable high-fidelity simulation for robotics perception, control, and benchmarking under diverse conditions.

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages · cited by 54 Pith papers · 3 internal anchors

[1]

OpenAI Gym

Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. Openai gym. arXiv preprint arXiv:1606.01540, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[2]

CARLA: An Open Urban Driving Simulator

Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. Carla: An open urban driving simulator. arXiv preprint arXiv:1711.03938, 2017

work page Pith review arXiv 2017
[3]

Surreal: Open-source reinforcement learning framework and robot manipulation benchmark

Linxi Fan*, Yuke Zhu*, Jiren Zhu, Zihua Liu, Orien Zeng, Anchit Gupta, Joan Creus-Costa, Silvio Savarese, and Li Fei-Fei. Surreal: Open-source reinforcement learning framework and robot manipulation benchmark. In Conference on Robot Learning, 2018

work page 2018
[4]

Soft Actor-Critic Algorithms and Applications

Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, et al. Soft actor-critic algorithms and applications. arXiv preprint arXiv:1812.05905, 2018

work page internal anchor Pith review arXiv 2018
[5]

Deep reinforcement learning that matters

Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, and David Meger. Deep reinforcement learning that matters. In AAAI, 2018

work page 2018
[6]

Inertial properties in robotic manipulation: An object-level framework

Oussama Khatib. Inertial properties in robotic manipulation: An object-level framework. The International Journal of Robotics Research, 14(1):19–36, 1995

work page 1995
[7]

Reinforcement learning in robotics: A survey

Jens Kober, J Andrew Bagnell, and Jan Peters. Reinforcement learning in robotics: A survey. The International Journal of Robotics Research , 32(11):1238–1274, 2013

work page 2013
[8]

AI2-THOR: An Interactive 3D Environment for Visual AI

Eric Kolve, Roozbeh Mottaghi, Winson Han, Eli VanderBilt, Luca Weihs, Alvaro Herrasti, Daniel Gordon, Yuke Zhu, Abhinav Gupta, and Ali Farhadi. AI2-THOR: An interactive 3d environment for visual AI. arXiv preprint arXiv:1712.05474, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[9]

A review of robot learning for manipulation: Challenges, representations, and algorithms

Oliver Kroemer, Scott Niekum, and George Konidaris. A review of robot learning for manipulation: Challenges, representations, and algorithms. arXiv preprint arXiv:1907.03146 , 2019

work page arXiv 1907
[10]

Roboturk: A crowdsourcing platform for robotic skill learning through imitation

Ajay Mandlekar, Yuke Zhu, Animesh Garg, Jonathan Booher, Max Spero, Albert Tung, Julian Gao, John Emmons, Anchit Gupta, Emre Orbay, et al. Roboturk: A crowdsourcing platform for robotic skill learning through imitation. In Conference on Robot Learning, pages 879–893, 2018

work page 2018
[11]

Variable impedance control in end-effector space: An action space for reinforcement learning in contact-rich tasks

Roberto Martín-Martín, Michelle A Lee, Rachel Gardner, Silvio Savarese, Jeannette Bohg, and Animesh Garg. Variable impedance control in end-effector space: An action space for reinforcement learning in contact-rich tasks. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1010–1017. IEEE, 2019

work page 2019
[12]

Recent advances in robot learning from demonstration

Harish Ravichandar, Athanasios S Polydoros, Sonia Chernova, and Aude Billard. Recent advances in robot learning from demonstration. Annual Review of Control, Robotics, and Autonomous Systems , 3, 2020

work page 2020
[13]

Reinforcement learning: An introduction

Richard S Sutton and Andrew G Barto. Reinforcement learning: An introduction. MIT press, 2018

work page 2018
[14]

dm_control: Software and tasks for continuous control

Yuval Tassa, Saran Tunyasuvunakool, Alistair Muldal, Yotam Doron, Siqi Liu, Steven Bohez, Josh Merel, Tom Erez, Timothy Lillicrap, and Nicolas Heess. dm_control: Software and tasks for continuous control. arXiv preprint arXiv:2006.12983, 2020

[15]

MuJoCo: A physics engine for model-based control

Emanuel Todorov, Tom Erez, and Yuval Tassa. MuJoCo: A physics engine for model-based control. In International Conference on Intelligent Robots and Systems, pages 5026–5033, 2012

[16]

Interactive gibson benchmark: A benchmark for interactive navigation in cluttered environments

Fei Xia, William B Shen, Chengshu Li, Priya Kasimbeg, Micael Edmond Tchapmi, Alexander Toshev, Roberto Martín-Martín, and Silvio Savarese. Interactive gibson benchmark: A benchmark for interactive navigation in cluttered environments. IEEE Robotics and Automation Letters, 5(2):713–720, 2020

[17]

Mink: Python inverse kinematics based on MuJoCo, July 2024

Kevin Zakka. Mink: Python inverse kinematics based on MuJoCo, July 2024

[18]

Reinforcement and imitation learning for diverse visuomotor skills

Yuke Zhu, Ziyu Wang, Josh Merel, Andrei Rusu, Tom Erez, Serkan Cabi, Saran Tunyasuvunakool, János Kramár, Raia Hadsell, Nando de Freitas, et al. Reinforcement and imitation learning for diverse visuomotor skills. Robotics: Science and Systems, 2018

