arxiv: 2511.04831 · v1 · submitted 2025-11-06 · 💻 cs.RO · cs.AI

Isaac Lab: A GPU-Accelerated Simulation Framework for Multi-Modal Robot Learning

NVIDIA: Mayank Mittal, Pascal Roth, James Tigue, Antoine Richard, Octi Zhang, Peter Du, Antonio Serrano-Muñoz, Xinjie Yao

René Zurbrügg, Nikita Rudin, Lukasz Wawrzyniak, Milad Rakhsha, Alain Denzler, Eric Heiden, Ales Borovicka, Ossama Ahmed, Iretiayo Akinola, Abrar Anwar, Mark T. Carlson, Ji Yuan Feng, Animesh Garg, Renato Gasoto, Lionel Gulich, Yijie Guo, M. Gussert, Alex Hansen, Mihir Kulkarni, Chenran Li, Wei Liu, Viktor Makoviychuk, Grzegorz Malczyk, Hammad Mazhar, Masoud Moghani, Adithyavairavan Murali, Michael Noseworthy, Alexander Poddubny, Nathan Ratliff, Welf Rehberg, Clemens Schwarke, Ritvik Singh, James Latham Smith, Bingjie Tang, Ruchik Thaker, Matthew Trepte, Karl Van Wyk, Fangzhou Yu, Alex Millane, Vikram Ramasamy, Remo Steiner, Sangeeta Subramanian, Clemens Volk, CY Chen, Neel Jawale, Ashwin Varghese Kuruttukulam, Michael A. Lin, Ajay Mandlekar, Karsten Patzwaldt, John Welsh, Huihua Zhao, Fatima Anes, Jean-Francois Lafleche, Nicolas Moënne-Loccoz, Soowan Park, Rob Stepinski, Dirk Van Gelder, Chris Amevor, Jan Carius, Jumyung Chang, Anka He Chen, Pablo de Heras Ciechomski, Gilles Daviet, Mohammad Mohajerani, Julia von Muralt, Viktor Reutskyy, Michael Sauter, Simon Schirm, Eric L. Shi, Pierre Terdiman, Kenny Vilella, Tobias Widmer, Gordon Yeoman, Tiffany Chen, Sergey Grizan, Cathy Li, Lotus Li, Connor Smith, Rafael Wiltz, Kostas Alexis, Yan Chang, David Chu, Linxi "Jim" Fan, Farbod Farshidian, Ankur Handa, Spencer Huang, Marco Hutter, Yashraj Narang, Soha Pouya, Shiwei Sheng, Yuke Zhu, Miles Macklin, Adam Moravanszky, Philipp Reist, Yunrong Guo, David Hoeller, Gavriel State

Pith reviewed 2026-05-11 10:17 UTC · model grok-4.3

classification 💻 cs.RO cs.AI

keywords: GPU simulation · robot learning · reinforcement learning · imitation learning · physics engine · photorealistic rendering · domain randomization · multi-modal robotics

The pith

Isaac Lab unifies GPU-accelerated physics, photorealistic rendering, and modular policy training into one extensible robotics platform.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Isaac Lab as the successor to earlier GPU simulation tools, extending them to handle large-scale multi-modal robot learning. It integrates parallel high-fidelity physics with rendering, actuator models, multi-frequency sensors, data pipelines, and domain randomization tools. This modular architecture lets users compose environments and train both reinforcement learning and imitation learning policies for varied robot tasks. The authors apply the system to whole-body control, cross-embodiment mobility, contact-rich manipulation, and learning from human demonstrations. They also outline a path toward differentiable physics integration for gradient-based methods.

Core claim

Isaac Lab combines high-fidelity GPU parallel physics, photorealistic rendering, and a modular, composable architecture for designing environments and training robot policies, unifying best practices for reinforcement and imitation learning at scale within a single extensible platform.

What carries the argument

A modular composable architecture that integrates GPU parallel physics, photorealistic rendering, actuator models, multi-frequency sensor simulation, data collection pipelines, and domain randomization tools into one platform for environment design and policy training.
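
As a rough illustration of what "composable" and "multi-frequency" can mean in practice, the sketch below assembles an environment description from independent parts and refreshes each sensor at its own rate relative to a fixed physics step. Every name in it (SensorCfg, EnvCfg, due_sensors) is hypothetical and chosen for illustration only; none of it is Isaac Lab's actual API, and the rates are made-up values.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SensorCfg:
    """Hypothetical sensor description: a name and an update rate in Hz."""
    name: str
    rate_hz: int


@dataclass
class EnvCfg:
    """Hypothetical environment description composed from independent parts."""
    physics_hz: int
    sensors: list[SensorCfg] = field(default_factory=list)

    def decimation(self, sensor: SensorCfg) -> int:
        # Number of physics steps between two updates of this sensor.
        if self.physics_hz % sensor.rate_hz != 0:
            raise ValueError(f"{sensor.name}: rate must divide physics_hz")
        return self.physics_hz // sensor.rate_hz


def due_sensors(cfg: EnvCfg, step: int) -> list[str]:
    """Names of the sensors that refresh on the given physics step."""
    return [s.name for s in cfg.sensors if step % cfg.decimation(s) == 0]


# Illustrative values: physics at 200 Hz, an IMU at 200 Hz, a camera at 25 Hz.
cfg = EnvCfg(
    physics_hz=200,
    sensors=[SensorCfg("imu", 200), SensorCfg("camera", 25)],
)
schedule = [due_sensors(cfg, step) for step in range(9)]
```

With these values the IMU refreshes on every physics step while the camera refreshes on every eighth step, so adding or swapping a sensor only changes the list passed to EnvCfg.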

Load-bearing premise

The integration of physics, rendering, actuators, sensors, and randomization will deliver practical performance gains and successful policy transfer without major unstated overheads or compatibility problems.

What would settle it

Controlled side-by-side benchmarks on standard robot tasks that measure training speed, final policy success rate, or sim-to-real transfer success when using Isaac Lab versus prior separate simulation and rendering tools.
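
A minimal sketch of the throughput half of such a benchmark follows. It times two batched step functions under the same environment count and step budget and reports aggregate steps per second. The function names and the stub workloads are assumptions made for illustration; a real comparison would replace the stubs with each simulator's own step call.

```python
import time
from typing import Callable


def steps_per_second(step_fn: Callable[[], None], num_envs: int, num_steps: int) -> float:
    """Aggregate environment steps per second for one batched step function."""
    start = time.perf_counter()
    for _ in range(num_steps):
        step_fn()  # one call advances all num_envs environments by one step
    elapsed = time.perf_counter() - start
    return num_envs * num_steps / elapsed


def compare(
    candidates: dict[str, Callable[[], None]], num_envs: int, num_steps: int
) -> dict[str, float]:
    """Measure every candidate under the same environment count and step budget."""
    return {
        name: steps_per_second(fn, num_envs, num_steps)
        for name, fn in candidates.items()
    }


# Stand-in workloads (hypothetical); they only burn CPU time.
def fast_stub() -> None:
    sum(range(100))


def slow_stub() -> None:
    sum(range(10_000))


results = compare(
    {"simulator_a": fast_stub, "simulator_b": slow_stub},
    num_envs=1024,
    num_steps=200,
)
```

For GPU simulators the timed region would also need a device synchronization before the clock is read, since kernel launches are asynchronous. Throughput alone does not settle the question; policy success and sim-to-real transfer would have to be measured on the same tasks.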

Original abstract

We present Isaac Lab, the natural successor to Isaac Gym, which extends the paradigm of GPU-native robotics simulation into the era of large-scale multi-modal learning. Isaac Lab combines high-fidelity GPU parallel physics, photorealistic rendering, and a modular, composable architecture for designing environments and training robot policies. Beyond physics and rendering, the framework integrates actuator models, multi-frequency sensor simulation, data collection pipelines, and domain randomization tools, unifying best practices for reinforcement and imitation learning at scale within a single extensible platform. We highlight its application to a diverse set of challenges, including whole-body control, cross-embodiment mobility, contact-rich and dexterous manipulation, and the integration of human demonstrations for skill acquisition. Finally, we discuss upcoming integration with the differentiable, GPU-accelerated Newton physics engine, which promises new opportunities for scalable, data-efficient, and gradient-based approaches to robot learning. We believe Isaac Lab's combination of advanced simulation capabilities, rich sensing, and data-center scale execution will help unlock the next generation of breakthroughs in robotics research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Isaac Lab is a practical framework update from the Isaac Gym team with added rendering and modularity, but it supplies no benchmarks or comparisons to back its scalability claims.

Isaac Lab is NVIDIA's successor to Isaac Gym, extending GPU-parallel physics with photorealistic rendering, multi-frequency sensors, actuator models, domain randomization, and unified pipelines for reinforcement and imitation learning. The modular architecture and listed applications in whole-body control, dexterous manipulation, cross-embodiment transfer, and human demo integration make the intended scope clear. The forward mention of the Newton engine for differentiable simulation is a useful signal about future directions they are exploring.

Referee Report

2 major / 1 minor

Summary. The manuscript presents Isaac Lab as the successor to Isaac Gym, a GPU-accelerated simulation framework for multi-modal robot learning. It describes the combination of high-fidelity GPU-parallel physics, photorealistic rendering, a modular composable architecture for environments and policies, actuator models, multi-frequency sensor simulation, data collection pipelines, and domain randomization tools. The framework is claimed to unify best practices for large-scale reinforcement and imitation learning, with applications to whole-body control, cross-embodiment mobility, contact-rich and dexterous manipulation, and human demonstration integration; future support for the differentiable Newton physics engine is also discussed.

Significance. If the described integration of physics, rendering, sensing, and randomization components can be realized with the claimed scalability and without major unstated overheads, Isaac Lab would represent a meaningful evolution of GPU-native robotics simulation platforms. It could streamline large-scale policy training and sim-to-real workflows for the robotics community by providing a single extensible environment that incorporates established practices from RL and IL.

major comments (2)

[Abstract] Abstract: The central claim that the framework 'unifies best practices for reinforcement and imitation learning at scale within a single extensible platform' is load-bearing for the contribution, yet the manuscript supplies no quantitative benchmarks on simulation throughput (steps/sec), training wall-clock time, policy performance metrics, or head-to-head comparisons against Isaac Gym or other simulators.
[Applications section] Applications section (whole-body control, dexterous manipulation, cross-embodiment transfer, human demo integration): All described use cases are presented qualitatively with no reported results on policy returns, sim-to-real transfer rates, ablation studies on domain randomization or sensor models, or verification that the modular architecture incurs no compatibility or performance costs.

minor comments (1)

[Framework description] The manuscript would benefit from explicit references to prior sensor modeling and actuator literature when introducing multi-frequency sensor simulation and actuator models.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed review of our manuscript on Isaac Lab. We appreciate the recognition of the framework's potential and the specific points raised regarding the need for quantitative support. We address each major comment below, indicating where revisions will be made to strengthen the paper.

Referee: [Abstract] Abstract: The central claim that the framework 'unifies best practices for reinforcement and imitation learning at scale within a single extensible platform' is load-bearing for the contribution, yet the manuscript supplies no quantitative benchmarks on simulation throughput (steps/sec), training wall-clock time, policy performance metrics, or head-to-head comparisons against Isaac Gym or other simulators.

Authors: We agree that the manuscript does not currently include the requested quantitative benchmarks or head-to-head comparisons. This paper is structured as a system description of the framework's architecture, integration of components, and design philosophy rather than a benchmarking study. Performance data and comparisons with Isaac Gym are available in the project's documentation and in papers that apply Isaac Lab to specific tasks. To directly address the concern and better support the central claim, we will add a dedicated performance evaluation subsection in the revised manuscript, including throughput measurements, wall-clock training times, and basic comparisons. revision: yes
Referee: [Applications section] Applications section (whole-body control, dexterous manipulation, cross-embodiment transfer, human demo integration): All described use cases are presented qualitatively with no reported results on policy returns, sim-to-real transfer rates, ablation studies on domain randomization or sensor models, or verification that the modular architecture incurs no compatibility or performance costs.

Authors: The applications section is intended to illustrate the range of robotics challenges that the unified platform can support, rather than to report new experimental results. Detailed quantitative metrics, policy returns, sim-to-real outcomes, and ablations are provided in the individual research papers that employ Isaac Lab for each use case. We acknowledge that the current presentation lacks these supporting numbers and verification of modular overhead. In revision, we will add citations to the relevant papers with summary metrics where appropriate, include a brief discussion of domain randomization and sensor model impacts drawn from those works, and provide a short analysis of the modular architecture's performance characteristics based on our internal testing. revision: partial

Circularity Check

0 steps flagged

No circularity: framework description paper contains no derivations, equations, or predictions that reduce to inputs

The paper presents Isaac Lab as a software framework extending Isaac Gym with GPU physics, rendering, actuators, sensors, and domain randomization tools. It describes architecture, applications to robotics tasks, and future Newton engine integration through qualitative text only. No equations, fitted parameters, predictions, or derivation chains appear in the abstract or full text. The central claims are descriptive of integration and capabilities rather than derived results, so no steps reduce by construction to prior inputs or self-citations. This is a standard non-circular framework announcement.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The description rests on standard assumptions in GPU computing and robotics simulation; no free parameters, ad-hoc axioms, or invented entities are introduced in the abstract.

axioms (2)

domain assumption: GPU parallelization can deliver high-fidelity physics at data-center scale.
Invoked to support large-scale training claims.
domain assumption: Photorealistic rendering and sensor simulation improve policy transfer.
Underlying the multi-modal learning unification.

pith-pipeline@v0.9.0 · 5978 in / 1307 out tokens · 50754 ms · 2026-05-11T10:17:06.737728+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith.Foundation.DAlembert.Inevitability · bilinear_family_forced

Relation between the paper passage and the cited Recognition theorem: unclear

Paper passage: unifying best practices for reinforcement and imitation learning at scale within a single extensible platform

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 47 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

TAVIS: A Benchmark for Egocentric Active Vision and Anticipatory Gaze in Imitation Learning
cs.RO 2026-05 accept novelty 8.0

TAVIS is a released benchmark showing active vision improves imitation learning in a task-dependent manner, multi-task policies struggle with shifts, and imitation produces human-like anticipatory gaze.
RoboLab: A High-Fidelity Simulation Benchmark for Analysis of Task Generalist Policies
cs.RO 2026-04 unverdicted novelty 8.0

RoboLab is a new simulation benchmark with 120 tasks across visual, procedural, and relational axes that quantifies generalization gaps and perturbation sensitivity in task-generalist robotic policies.
On Surprising Effects of Risk-Aware Domain Randomization for Contact-Rich Sampling-based Predictive Control
cs.RO 2026-05 unverdicted novelty 7.0

Risk-aware domain randomization in contact-rich sampling-based predictive control reshapes the basin of attraction around contact-producing actions in the optimizer's effective cost landscape.
VUDA: Breaking CUDA-Vulkan Isolation for Spatial Sharing of Compute and Graphics on the Same GPU
cs.OS 2026-05 unverdicted novelty 7.0

VUDA enables spatial sharing between CUDA and Vulkan on GPUs via channel redirection and page-table grafting, achieving up to 85% higher throughput than temporal baselines in embodied AI tasks.
HANDFUL: Sequential Grasp-Conditioned Dexterous Manipulation with Resource Awareness
cs.RO 2026-04 unverdicted novelty 7.0

HANDFUL learns resource-aware grasps using finger contact rewards and curriculum learning to improve success on sequential dexterous tasks in simulation and on a real LEAP hand.
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond
cs.AI 2026-04 unverdicted novelty 7.0

Proposes a levels x laws taxonomy for world models in AI agents, defining L1-L3 capabilities across physical, digital, social, and scientific regimes while reviewing over 400 works to outline a roadmap for advanced ag...
FingerEye: Continuous and Unified Vision-Tactile Sensing for Dexterous Manipulation
cs.RO 2026-04 unverdicted novelty 7.0

FingerEye delivers continuous vision-tactile sensing via binocular RGB cameras and marker-tracked compliant ring deformation, supporting imitation learning policies that generalize across object variations for tasks l...
Bounded Ratio Reinforcement Learning
cs.LG 2026-04 conditional novelty 7.0

BRRL derives an analytic optimal policy for regularized constrained RL that guarantees monotonic improvement and yields the BPO algorithm that matches or exceeds PPO.
Point2Pose: Occlusion-Recovering 6D Pose Tracking and 3D Reconstruction for Multiple Unknown Objects Via 2D Point Trackers
cs.CV 2026-04 unverdicted novelty 7.0

A model-free system uses 2D point trackers to achieve causal 6D pose tracking and incremental 3D reconstruction for multiple unseen rigid objects from RGB-D video, with recovery from complete occlusions.
Tune to Learn: How Controller Gains Shape Robot Policy Learning
cs.RO 2026-04 conditional novelty 7.0

Controller gains affect learnability differently for behavior cloning, RL from scratch, and sim-to-real transfer, so optimal gains depend on the learning paradigm rather than desired task behavior.
TMRL: Diffusion Timestep-Modulated Pretraining Enables Exploration for Efficient Policy Finetuning
cs.RO 2026-05 unverdicted novelty 6.0

TMRL bridges behavioral cloning pretraining and RL finetuning via diffusion noise and timestep modulation to enable controlled exploration, improving sample efficiency and enabling real-world robot training in under one hour.
When Does Non-Uniform Replay Matter in Reinforcement Learning?
cs.LG 2026-05 unverdicted novelty 6.0

Non-uniform replay helps off-policy RL mainly at low replay volumes, high-entropy sampling matters even at similar recency, and Truncated Geometric replay offers a low-overhead practical solution.
Explicit Stair Geometry Conditioning for Robust Humanoid Locomotion
cs.RO 2026-05 unverdicted novelty 6.0

Explicit conditioning of a PPO policy on interpretable stair parameters (height, depth, yaw) yields improved generalization to unseen stairs and reliable real-world traversal on the Unitree G1, including 33 consecutiv...
Plan2Cleanse: Test-Time Backdoor Defense via Monte-Carlo Planning in Deep Reinforcement Learning
cs.LG 2026-05 unverdicted novelty 6.0

Plan2Cleanse frames RL backdoor detection as a Monte Carlo planning problem to achieve over 61 percentage point gains in trigger detection and improved win rates in competitive environments.
DexSynRefine: Synthesizing and Refining Human-Object Interaction Motion for Physically Feasible Dexterous Robot Actions
cs.RO 2026-05 unverdicted novelty 6.0

DexSynRefine synthesizes HOI motions with an extended manifold method, refines them via task-space residual RL, and adapts for sim-to-real transfer, outperforming kinematic retargeting by 50-70 percentage points on fi...
LineRides: Line-Guided Reinforcement Learning for Bicycle Robot Stunts
cs.RO 2026-05 unverdicted novelty 6.0

LineRides enables a bicycle robot to learn five commandable stunts from spatial guidelines and key orientations via RL without demonstrations or timing.
LineRides: Line-Guided Reinforcement Learning for Bicycle Robot Stunts
cs.RO 2026-05 unverdicted novelty 6.0

LineRides enables commandable bicycle robot stunts via line-guided RL that uses spatial guidelines, a tracking margin for feasibility, distance-based progress, and sparse key-orientations.
Learning Reactive Dexterous Grasping via Hierarchical Task-Space RL Planning and Joint-Space QP Control
cs.RO 2026-05 conditional novelty 6.0

A multi-agent RL high-level planner outputs task-space velocities that a GPU-parallel QP low-level controller converts to joint velocities while enforcing limits and collisions, yielding robust sim-to-real dexterous g...
Do We Really Need Immediate Resets? Rethinking Collision Handling for Efficient Robot Navigation
cs.RO 2026-05 unverdicted novelty 6.0

A new training approach for robot navigation allows multiple collisions per episode before reset, accelerating early learning and improving success rates over traditional single-collision resets.
Stability of Control Lyapunov Function Guided Reinforcement Learning
eess.SY 2026-05 conditional novelty 6.0

CLF-guided RL yields exponentially stable optimal controllers, with proofs in continuous and discrete time, numerical checks on double integrator and cart-pole, and implementation on a walking humanoid.
Reconstruction by Generation: 3D Multi-Object Scene Reconstruction from Sparse Observations
cs.CV 2026-04 unverdicted novelty 6.0

RecGen achieves state-of-the-art 3D multi-object scene reconstruction from sparse RGB-D views by combining compositional synthetic scene generation with strong 3D shape priors, outperforming SAM3D by 30%+ in shape qua...
GS-Playground: A High-Throughput Photorealistic Simulator for Vision-Informed Robot Learning
cs.RO 2026-04 unverdicted novelty 6.0

GS-Playground delivers a high-throughput photorealistic simulator for vision-informed robot learning via parallel physics integrated with batch 3D Gaussian Splatting at 10^4 FPS and an automated Real2Sim workflow for ...
Unleashing the Agility of Wheeled-Legged Robots for High-Dynamic Reflexive Obstacle Evasion
cs.RO 2026-04 unverdicted novelty 6.0

AWARE is a hierarchical RL framework that enables wheeled-legged robots to perform high-dynamic reflexive obstacle evasion with emergent gaits in simulation and on the real M20 platform.
Learning-augmented robotic automation for real-world manufacturing
cs.RO 2026-04 conditional novelty 6.0

A learning-augmented robotic system automated deformable cable insertion and soldering on a live electric-motor production line for 5 hours 10 minutes, producing 108 motors at 99.4% pass rate with under 20 minutes of ...
A Mechanistic Analysis of Sim-and-Real Co-Training in Generative Robot Policies
cs.RO 2026-04 unverdicted novelty 6.0

Sim-and-real co-training for robot policies is driven primarily by balanced cross-domain representation alignment and secondarily by domain-dependent action reweighting.
ViserDex: Visual Sim-to-Real for Robust Dexterous In-hand Reorientation
cs.RO 2026-04 unverdicted novelty 6.0

A framework using 3D Gaussian Splatting for visual domain randomization enables robust monocular RGB-based dexterous in-hand reorientation on real hardware for multiple objects under varied lighting.
Ψ-Map: Panoptic Surface Integrated Mapping Enables Real2Sim Transfer
cs.RO 2026-04 unverdicted novelty 6.0

Ψ-Map combines plane-constrained Gaussian surfels from LiDAR with end-to-end panoptic lifting to deliver high-precision geometric and semantic reconstruction in large-scale environments at real-time speeds.
DexWorldModel: Causal Latent World Modeling towards Automated Learning of Embodied Tasks
cs.CV 2026-04 unverdicted novelty 6.0

CLWM with DINOv3 targets, O(1) TTT memory, SAI latency masking, and EmbodiChain training achieves SOTA dual-arm simulation performance and zero-shot sim-to-real transfer that beats real-data finetuned baselines.
Trajectory-based actuator identification via differentiable simulation
cs.RO 2026-04 unverdicted novelty 6.0

Differentiable simulation enables torque-sensor-free actuator model identification from trajectory data, achieving 1.88x better position tracking than a stand-trained baseline and 46% longer travel in downstream locom...
RoboLab: A High-Fidelity Simulation Benchmark for Analysis of Task Generalist Policies
cs.RO 2026-04 unverdicted novelty 6.0

RoboLab is a photorealistic simulation benchmark with 120 tasks and perturbation analysis to evaluate true generalization and robustness of robotic foundation models.
Toward Hardware-Agnostic Quadrupedal World Models via Morphology Conditioning
cs.RO 2026-04 unverdicted novelty 6.0

Morphology-conditioned quadrupedal world model enables zero-shot generalization to new robot embodiments for locomotion tasks.
PriPG-RL: Privileged Planner-Guided Reinforcement Learning for Partially Observable Systems with Anytime-Feasible MPC
cs.LG 2026-04 unverdicted novelty 6.0

PriPG-RL trains RL policies for POMDPs by distilling knowledge from a privileged anytime-feasible MPC planner into a P2P-SAC policy, improving sample efficiency and performance in partially observable robotic navigation.
FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control
cs.LG 2026-04 unverdicted novelty 6.0

FlashSAC scales up Soft Actor-Critic with fewer updates, larger models, higher data throughput, and norm bounds to deliver faster, more stable training than PPO on high-dimensional robot control tasks across dozens of...
Veo-Act: How Far Can Frontier Video Models Advance Generalizable Robot Manipulation?
cs.RO 2026-04 unverdicted novelty 6.0

Veo-3 video predictions enable approximate task-level robot trajectories in zero-shot settings but require hierarchical integration with low-level VLA policies for reliable manipulation performance.
Evidence of an Emergent "Self" in Continual Robot Learning
cs.RO 2026-03 unverdicted novelty 6.0

Continual learning robots form a significantly more stable invariant subnetwork than constant-task controls, and preserving it improves adaptation while damaging it hurts performance.
ExpertGen: Scalable Sim-to-Real Expert Policy Learning from Imperfect Behavior Priors
cs.RO 2026-03 conditional novelty 6.0

ExpertGen generates high-success expert policies in simulation from imperfect priors by freezing a diffusion behavior model and optimizing its initial noise via RL, then distills them for real-robot deployment.
Simulation Distillation: Pretraining World Models in Simulation for Rapid Real-World Adaptation
cs.RO 2026-03 unverdicted novelty 6.0

SimDist pretrains world models in simulation and adapts them to real-world robots by updating only the latent dynamics model, enabling rapid improvement on contact-rich tasks where prior methods fail.
Nautilus: From One Prompt to Plug-and-Play Robot Learning
cs.RO 2026-05 unverdicted novelty 5.0

NAUTILUS is a prompt-driven harness that automates plug-and-play adapters, typed contracts, and validation for policies, benchmarks, and robots in learning research.
When Does Non-Uniform Replay Matter in Reinforcement Learning?
cs.LG 2026-05 unverdicted novelty 5.0

Non-uniform replay improves RL sample efficiency mainly in low replay-volume regimes, with high-entropy sampling being key even at comparable recency.
A High-Throughput Compute-Efficient POMDP Hide-And-Seek-Engine (HASE) for Multi-Agent Operations
cs.MA 2026-04 unverdicted novelty 5.0

A C++ Dec-POMDP simulator using data-oriented design and zero-copy PyTorch integration achieves up to 33 million steps per second on a 16-core CPU, enabling multi-agent policy training in minutes with PPO, DQN, and SAC.
Learning Versatile Humanoid Manipulation with Touch Dreaming
cs.RO 2026-04 conditional novelty 5.0

HTD, a multimodal transformer policy trained with behavioral cloning and touch dreaming to predict future tactile latents, achieves a 90.9% relative success rate improvement over baselines on five real-world contact-r...
Reliability-Guided Depth Fusion for Glare-Resilient Navigation Costmaps
cs.RO 2026-04 unverdicted novelty 5.0

Reliability modeling of depth measurements enables glare-resilient occupancy grid costmaps for mobile robots.
CoEnv: Driving Embodied Multi-Agent Collaboration via Compositional Environment
cs.RO 2026-04 unverdicted novelty 5.0

CoEnv introduces a compositional environment that integrates real and simulated spaces for multi-agent robotic collaboration, using real-to-sim reconstruction, VLM action synthesis, and validated sim-to-real transfer ...
Web-Gewu: A Browser-Based Interactive Playground for Robot Reinforcement Learning
cs.RO 2026-04 unverdicted novelty 4.0

Web-Gewu delivers a scalable browser-accessible playground for robot reinforcement learning by offloading simulation and training to edge nodes while using the cloud only as a signaling relay for low-latency P2P streaming.
Fuzzy Logic Theory-based Adaptive Reward Shaping for Robust Reinforcement Learning (FARS)
cs.RO 2026-04 unverdicted novelty 4.0

Fuzzy logic-based adaptive reward shaping improves RL convergence speed, reduces variability, and boosts success rates by up to 5% in drone racing simulations compared to standard rewards.
Vision-Language Navigation for Aerial Robots: Towards the Era of Large Language Models
cs.RO 2026-04 unverdicted novelty 4.0

This survey organizes aerial vision-language navigation methods into five architectural categories, critically reviews evaluation infrastructure, and synthesizes seven open problems for LLM/VLM integration.
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence
cs.RO 2026-03 conditional novelty 4.0

CARLA-Air unifies CARLA urban driving and AirSim drone flight into one high-fidelity simulation with preserved APIs for air-ground embodied AI research.

Reference graph

Works this paper leans on

121 extracted references · 121 canonical work pages · cited by 44 Pith papers · 3 internal anchors

[1]

Legged locomotion in challenging terrains using egocentric vision

Ananye Agarwal, Ashish Kumar, Jitendra Malik, and Deepak Pathak. Legged locomotion in challenging terrains using egocentric vision. In Conference on Robot Learning (CoRL), pages 403–415. PMLR, 2023.
[2]

TacSL: A library for visuotactile sensor simulation and learning

Iretiayo Akinola, Jie Xu, Jan Carius, Dieter Fox, and Yashraj Narang. TacSL: A library for visuotactile sensor simulation and learning. IEEE Transactions on Robotics (T-RO), 41:2645–2661, 2025.
[3]

Arthur Allshire, Mayank Mittal, Varun Lodaya, Viktor Makoviychuk, Denys Makoviichuk, Felix Widmaier, Manuel Wüthrich, Stefan Bauer, Ankur Handa, and Animesh Garg. Transferring dexterous manipulation from gpu simulation to a remote real-world trifinger. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022.
[4]

Sonogym: High performance simulation for challenging surgical tasks with robotic ultrasound

Yunke Ao, Masoud Moghani, Mayank Mittal, Manish Prajapat, Luohong Wu, Frederic Giraud, Fabio Carrillo, Andreas Krause, and Philipp Fürnstahl. Sonogym: High performance simulation for challenging surgical tasks with robotic ultrasound. In Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, 2025.
[5]

Efficient learning-based control of a legged robot in lunar gravity

Philip Arm, Oliver Fischer, Joseph Church, Adrian Fuhrer, Hendrik Kolvenbach, and Marco Hutter. Efficient learning-based control of a legged robot in lunar gravity. arXiv preprint arXiv:2509.10128,
[6]

Leva: A high-mobility logistic vehicle with legged suspension

Marco Arnold, Lukas Hildebrandt, Kaspar Janssen, Efe Ongan, Pascal Bürge, Ádám Gyula Gábriel, James Kennedy, Rishi Lolla, Quanisha Oppliger, Micha Schaaf, et al. Leva: A high-mobility logistic vehicle with legged suspension. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA),
[7]

Homie: Humanoid loco-manipulation with isomorphic exoskeleton cockpit

Qingwei Ben, Feiyu Jia, Jia Zeng, Junting Dong, Dahua Lin, and Jiangmiao Pang. Homie: Humanoid loco-manipulation with isomorphic exoskeleton cockpit. In Robotics: Science and Systems (RSS), 2025.
[8]

Towards bridging the gap: Systematic sim-to-real transfer for diverse legged robots

Filip Bjelonic, Fabian Tischhauser, and Marco Hutter. Towards bridging the gap: Systematic sim-to-real transfer for diverse legged robots. arXiv preprint arXiv:2509.06342, 2025.
[9]

GR00T N1: An Open Foundation Model for Generalist Humanoid Robots

Johan Bjorck, Fernando Castañeda, Nikita Cherniadev, Xingye Da, Runyu Ding, Linxi Fan, Yu Fang, Dieter Fox, Fengyuan Hu, Spencer Huang, et al. Gr00t n1: An open foundation model for generalist humanoid robots. arXiv preprint arXiv:2503.14734, 2025. 37

work page internal anchor Pith review Pith/arXiv arXiv 2025
[10]

Walk, Run, Crawl, RL Fun | Boston Dynamics | Atlas

Boston Dynamics. Walk, Run, Crawl, RL Fun | Boston Dynamics | Atlas. https://www.youtube.com/watch?v=I44_zbEwz_w, 2025. Accessed: 2025-09-04. 32

work page 2025
[11]

Lerobot: State-of-the-art machine learning for real-world robotics in pytorch

Remi Cadene, Simon Alibert, Alexander Soare, Quentin Gallouedec, Adil Zouitine, Steven Palma, Pepijn Kooijmans, Michel Aractingi, Mustafa Shukor, Dana Aubakirova, Martino Russi, Francesco Capuano, Caroline Pascal, Jade Choghari, Jess Moss, and Thomas Wolf. Lerobot: State-of-the-art machine learning for real-world robotics in pytorch. https://github.com/hug...

work page 2024
[12]

GraspDataGen

Mark Carlson. GraspDataGen. https://github.com/NVlabs/GraspDataGen, 2025. Isaac Lab-based Grasp Generation. 34

work page 2025
[13]

Divide, discover, deploy: Factorized skill learning with symmetry and style priors

Rafael Cathomen, Mayank Mittal, Marin Vlastelica, and Marco Hutter. Divide, discover, deploy: Factorized skill learning with symmetry and style priors. Conference on Robot Learning (CoRL), 2025. 30

work page 2025
[14]

Matterport3d: Learning from rgb-d data in indoor environments

Angel Chang, Angela Dai, Thomas Funkhouser, Maciej Halber, Matthias Niessner, Manolis Savva, Shuran Song, Andy Zeng, and Yinda Zhang. Matterport3d: Learning from rgb-d data in indoor environments. International Conference on 3D Vision (3DV), 2017. 16

work page 2017
[15]

Vertex block descent

Anka He Chen, Ziheng Liu, Yin Yang, and Cem Yuksel. Vertex block descent. arXiv preprint, arXiv:2403.06321, 2024. doi: 10.48550/arXiv.2403.06321. URL https://arxiv.org/abs/2403.06321. 37

work page doi:10.48550/arxiv.2403.06321 2024
[16]

Learning by cheating

Dian Chen, Brady Zhou, Vladlen Koltun, and Philipp Krähenbühl. Learning by cheating. In Conference on Robot Learning (CoRL), pages 66–75. PMLR, 2020. 25

work page 2020
[17]

Navila: Legged robot vision-language-action model for navigation

An-Chieh Cheng, Yandong Ji, Zhaojing Yang, Xueyan Zou, Jan Kautz, Erdem Biyik, Hongxu Yin, Sifei Liu, and Xiaolong Wang. Navila: Legged robot vision-language-action model for navigation. In Robotics: Science and Systems (RSS), 2025. 32

work page 2025
[18]

Open-television: Teleoperation with immersive active visual feedback

Xuxin Cheng, Jialong Li, Shiqi Yang, Ge Yang, and Xiaolong Wang. Open-television: Teleoperation with immersive active visual feedback. arXiv preprint arXiv:2407.01512, 2024. 16

work page arXiv 2024
[19]

Pybullet, a python module for physics simulation for games, robotics and machine learning

Erwin Coumans and Yunfei Bai. Pybullet, a python module for physics simulation for games, robotics and machine learning. http://pybullet.org, 2016–2023. 2

work page 2016
[20]

Ioannis Dadiotis, Mayank Mittal, Nikos Tsagarakis, and Marco Hutter. Dynamic object goal pushing with mobile manipulators through model-free constrained reinforcement learning. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2025. 31

work page 2025
[21]

A semi-implicit material point method for the continuum simulation of granular materials

Gilles Daviet and Florence Bertails-Descoubes. A semi-implicit material point method for the continuum simulation of granular materials. ACM Transactions on Graphics, 35(4):102:1–102:13, 2016. doi: 10.1145/2897824.2925877. URL https://dl.acm.org/doi/10.1145/2897824.2925877. 37

work page doi:10.1145/2897824.2925877 2016
[22]

Objaverse: A universe of annotated 3d objects

Matt Deitke, Dustin Schwenk, Jordi Salvador, Luca Weihs, Oscar Michel, Eli VanderBilt, Ludwig Schmidt, Kiana Ehsani, Aniruddha Kembhavi, and Ali Farhadi. Objaverse: A universe of annotated 3d objects. In Computer Vision and Pattern Recognition, 2023. 17

work page 2023
[23]

scene_synthesizer: A python library for procedural scene generation in robot manipulation

Clemens Eppner, Adithyavairavan Murali, Caelan Garrett, Rowland O’Flaherty, Tucker Hermans, Wei Yang, and Dieter Fox. scene_synthesizer: A python library for procedural scene generation in robot manipulation. Journal of Open Source Software, 2024. 16, 17

work page 2024
[24]

Trimesh

Dawson-Haggerty et al. Trimesh. https://trimesh.org/. Version 3.2.0, accessed 2025-09-29. 16

work page 2025
[25]

Robot dynamics algorithms

Roy Featherstone. Robot dynamics algorithms. Annexe thesis digitisation project 2016 block 5, 1984. 39

work page 2016
[26]

Deep whole-body control: learning a unified policy for manipulation and locomotion

Zipeng Fu, Xuxin Cheng, and Deepak Pathak. Deep whole-body control: learning a unified policy for manipulation and locomotion. In Conference on Robot Learning (CoRL), pages 138–149. PMLR, 2023. 2

work page 2023
[27]

Skillmimicgen: Automated demonstration generation for efficient skill learning and deployment

Caelan Garrett, Ajay Mandlekar, Bowen Wen, and Dieter Fox. Skillmimicgen: Automated demonstration generation for efficient skill learning and deployment. In Conference on Robot Learning (CoRL), 2024. 29

work page 2024
[28]

Srsa: Skill retrieval and adaptation for robotic assembly tasks

Yijie Guo, Bingjie Tang, Iretiayo Akinola, Dieter Fox, Abhishek Gupta, and Yashraj Narang. Srsa: Skill retrieval and adaptation for robotic assembly tasks. International Conference on Learning Representations (ICLR), 2025. 33

work page 2025
[29]

DeXtreme: Transfer of agile in-hand manipulation from simulation to reality

Ankur Handa, Arthur Allshire, Viktor Makoviychuk, Aleksei Petrenko, Ritvik Singh, Jingzhou Liu, Denys Makoviichuk, Karl Van Wyk, Alexander Zhurkevich, Balakumar Sundaralingam, et al. DeXtreme: Transfer of agile in-hand manipulation from simulation to reality. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 5977–...

work page 2023
[30]

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778,

work page
[31]

24

work page
[32]

Omnih2o: Universal and dexterous human-to-humanoid whole-body teleoperation and learning

Tairan He, Zhengyi Luo, Xialin He, Wenli Xiao, Chong Zhang, Weinan Zhang, Kris Kitani, Changliu Liu, and Guanya Shi. Omnih2o: Universal and dexterous human-to-humanoid whole-body teleoperation and learning. Conference on Robot Learning (CoRL), 2024. 2

work page 2024
[33]

Hover: Versatile neural whole-body controller for humanoid robots

Tairan He, Wenli Xiao, Toru Lin, Zhengyi Luo, Zhenjia Xu, Zhenyu Jiang, Jan Kautz, Changliu Liu, Guanya Shi, Xiaolong Wang, et al. Hover: Versatile neural whole-body controller for humanoid robots. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2025. 31

work page 2025
[34]

Learning agile and dynamic motor skills for legged robots

Jemin Hwangbo, Joonho Lee, Alexey Dosovitskiy, Dario Bellicoso, Vassilios Tsounis, Vladlen Koltun, and Marco Hutter. Learning agile and dynamic motor skills for legged robots. Science Robotics, 4(26):eaau5872, 2019. 11, 29

work page 2019
[35]

Population based training of neural networks

Max Jaderberg, Valentin Dalibard, Simon Osindero, Wojciech M Czarnecki, Jeff Donahue, Ali Razavi, Oriol Vinyals, Tim Green, Iain Dunning, Karen Simonyan, et al. Population based training of neural networks. arXiv preprint arXiv:1711.09846, 2017. 26

work page arXiv 2017
[36]

Dexmimicgen: Automated data generation for bimanual dexterous manipulation via imitation learning

Zhenyu Jiang, Yuqi Xie, Kevin Lin, Zhenjia Xu, Weikang Wan, Ajay Mandlekar, Linxi Fan, and Yuke Zhu. Dexmimicgen: Automated data generation for bimanual dexterous manipulation via imitation learning. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2025. 28

work page 2025
[37]

3d diffuser actor: Policy diffusion with 3d scene representations

Tsung-Wei Ke, Nikolaos Gkanatsios, and Katerina Fragkiadaki. 3d diffuser actor: Policy diffusion with 3d scene representations. arXiv preprint arXiv:2402.10885, 2024. 35

work page arXiv 2024
[38]

3d gaussian splatting for real-time radiance field rendering

Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics, 42(4), July 2023. URL https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/. 8

work page 2023
[39]

Benchmarking protocols for evaluating small parts robotic assembly systems

Kenneth Kimble, Karl Van Wyk, Joe Falco, Elena Messina, Yu Sun, Mizuho Shibata, Wataru Uemura, and Yasuyoshi Yokokohji. Benchmarking protocols for evaluating small parts robotic assembly systems. IEEE Robotics and Automation Letters (RA-L), 5(2):883–889, 2020. 41

work page 2020
[40]

Design and use paradigms for gazebo, an open-source multi-robot simulator

Nathan Koenig and Andrew Howard. Design and use paradigms for gazebo, an open-source multi-robot simulator. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), volume 3, pages 2149–2154. IEEE, 2004. 3

work page 2004
[41]

Aerial gym simulator: A framework for highly parallelized simulation of aerial robots

Mihir Kulkarni, Welf Rehberg, and Kostas Alexis. Aerial gym simulator: A framework for highly parallelized simulation of aerial robots. IEEE Robotics and Automation Letters (RA-L), 2025. 10

work page 2025
[42]

DART: Dynamic animation and robotics toolkit

Jeongseok Lee, Michael X. Grey, Sehoon Ha, Tobias Kunz, Sumit Jain, Yuting Ye, Siddhartha S. Srinivasa, Mike Stilman, and C. Karen Liu. DART: Dynamic animation and robotics toolkit. The Journal of Open Source Software, 3(22):500, Feb 2018. doi: 10.21105/joss.00500. URL https://doi.org/10.21105/joss.00500. 2

work page doi:10.21105/joss.00500 2018
[43]

Learning quadrupedal locomotion over challenging terrain

Joonho Lee, Jemin Hwangbo, Lorenz Wellhausen, Vladlen Koltun, and Marco Hutter. Learning quadrupedal locomotion over challenging terrain. Science Robotics, 5(47):eabc5986, 2020. 25, 29

work page 2020
[44]

Control of complex maneuvers for a quadrotor uav using geometric methods on SE(3)

Taeyoung Lee, Melvin Leok, and N Harris McClamroch. Control of complex maneuvers for a quadrotor uav using geometric methods on SE(3). arXiv preprint arXiv:1003.2005, 2010. 10

work page arXiv 2010
[45]

Minchen Li, Zachary Ferguson, Teseo Schneider, Timothy Langlois, Denis Zorin, Daniele Panozzo, Chenfanfu Jiang, and Danny M. Kaufman. Incremental potential contact: Intersection- and inversion-free large deformation dynamics. ACM Trans. Graph. (SIGGRAPH), 39(4), 2020. 39

work page 2020
[46]

Taccel: Scaling up vision-based tactile robotics via high-performance gpu simulation

Yuyang Li, Wenxin Du, Chang Yu, Puhao Li, Zihang Zhao, Tengyu Liu, Chenfanfu Jiang, Yixin Zhu, and Siyuan Huang. Taccel: Scaling up vision-based tactile robotics via high-performance gpu simulation,

work page
[47]

39

work page
[48]

Marladona-towards cooperative team play using multi-agent reinforcement learning

Zichong Li, Filip Bjelonic, Victor Klemm, and Marco Hutter. Marladona-towards cooperative team play using multi-agent reinforcement learning. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 15014–15020. IEEE, 2025. 30

work page 2025
[49]

Compass: Cross-embodiment mobility policy via residual rl and skill synthesis

Wei Liu, Huihua Zhao, Chenran Li, Joydeep Biswas, Soha Pouya, and Yan Chang. Compass: Cross-embodiment mobility policy via residual rl and skill synthesis. arXiv preprint arXiv:2502.16372, 2025. 8, 32, 33

work page arXiv 2025
[50]

Get a grip: Multi-finger grasp evaluation at scale enables robust sim-to-real transfer

Tyler Ga Wei Lum, Albert H. Li, Preston Culbertson, Krishnan Srinivasan, Aaron Ames, Mac Schwager, and Jeannette Bohg. Get a grip: Multi-finger grasp evaluation at scale enables robust sim-to-real transfer. In Conference on Robot Learning (CoRL), 2024. 2

work page 2024
[51]

Dextrah-g: Pixels-to-action dexterous arm-hand grasping with geometric fabrics

Tyler Ga Wei Lum, Martin Matak, Viktor Makoviychuk, Ankur Handa, Arthur Allshire, Tucker Hermans, Nathan D Ratliff, and Karl Van Wyk. Dextrah-g: Pixels-to-action dexterous arm-hand grasping with geometric fabrics. arXiv preprint arXiv:2407.02274, 2024. 20, 25, 26, 35, 41

work page arXiv 2024
[52]

Serl: A software suite for sample-efficient robotic reinforcement learning

Jianlan Luo, Zheyuan Hu, Charles Xu, You Liang Tan, Jacob Berg, Archit Sharma, Stefan Schaal, Chelsea Finn, Abhishek Gupta, and Sergey Levine. Serl: A software suite for sample-efficient robotic reinforcement learning. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 16961–16969. IEEE, 2024. 37

work page 2024
[53]

Universal humanoid motion representations for physics-based control

Zhengyi Luo, Jinkun Cao, Josh Merel, Alexander Winkler, Jing Huang, Kris Kitani, and Weipeng Xu. Universal humanoid motion representations for physics-based control. In International Conference on Learning Representations (ICLR), 2024. 35

work page 2024
[54]

Emergent active perception and dexterity of simulated humanoids from visual reinforcement learning

Zhengyi Luo, Chen Tessler, Toru Lin, Ye Yuan, Tairan He, Wenli Xiao, Yunrong Guo, Gal Chechik, Kris Kitani, Linxi Fan, et al. Emergent active perception and dexterity of simulated humanoids from visual reinforcement learning. arXiv preprint arXiv:2505.12278, 2025. 17, 26, 35

work page arXiv 2025
[55]

Warp: A high-performance python framework for gpu simulation and graphics

Miles Macklin. Warp: A high-performance python framework for gpu simulation and graphics. https://github.com/nvidia/warp, March 2022. NVIDIA GPU Technology Conference (GTC). 38

work page 2022
[56]

Xpbd: position-based simulation of compliant constrained dynamics

Miles Macklin, Matthias Müller, and Nuttapong Chentanez. Xpbd: position-based simulation of compliant constrained dynamics. In Proceedings of the International Conference on Motion in Games, pages 49–54,

work page
[57]

URL https://doi.org/10.1145/2994258.2994272. 39

work page doi:10.1145/2994258.2994272
[58]

Sim-and-real co-training: A simple recipe for vision-based robotic manipulation

Abhiram Maddukuri, Zhenyu Jiang, Lawrence Yunliang Chen, Soroush Nasiriany, Yuqi Xie, Yu Fang, Wenqi Huang, Zu Wang, Zhenjia Xu, Nikita Chernyadev, Scott Reed, Ken Goldberg, Ajay Mandlekar, Linxi Fan, and Yuke Zhu. Sim-and-real co-training: A simple recipe for vision-based robotic manipulation. In Robotics: Science and Systems (RSS), 2025. 37

work page 2025
[59]

rl-games: A high-performance framework for reinforcement learning

Denys Makoviichuk and Viktor Makoviychuk. rl-games: A high-performance framework for reinforcement learning. https://github.com/Denys88/rl_games, May 2022. 24

work page 2022
[60]

Isaac gym: High performance gpu based physics simulation for robot learning

Viktor Makoviychuk, Lukasz Wawrzyniak, Yunrong Guo, Michelle Lu, Kier Storey, Miles Macklin, David Hoeller, Nikita Rudin, Arthur Allshire, Ankur Handa, and Gavriel State. Isaac gym: High performance gpu based physics simulation for robot learning. In Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, volume 1, 2021. 2, 6, 18

work page 2021
[61]

What matters in learning from offline human demonstrations for robot manipulation

Ajay Mandlekar, Danfei Xu, Josiah Wong, Soroush Nasiriany, Chen Wang, Rohun Kulkarni, Li Fei-Fei, Silvio Savarese, Yuke Zhu, and Roberto Martin-Martin. What matters in learning from offline human demonstrations for robot manipulation. In Conference on Robot Learning (CoRL), pages 1678–1690. PMLR, 2022. 27

work page 2022
[62]

Mimicgen: A data generation system for scalable robot learning using human demonstrations

Ajay Mandlekar, Soroush Nasiriany, Bowen Wen, Iretiayo Akinola, Yashraj Narang, Linxi Fan, Yuke Zhu, and Dieter Fox. Mimicgen: A data generation system for scalable robot learning using human demonstrations. In Conference on Robot Learning (CoRL), 2023. 28

work page 2023
[63]

Learning robust perceptive locomotion for quadrupedal robots in the wild

Takahiro Miki, Joonho Lee, Jemin Hwangbo, Lorenz Wellhausen, Vladlen Koltun, and Marco Hutter. Learning robust perceptive locomotion for quadrupedal robots in the wild. Science Robotics, 7(62), 2022. 13, 29

work page 2022
[64]

nvblox: Gpu-accelerated incremental signed distance field mapping

Alexander Millane, Helen Oleynikova, Emilie Wirbel, Remo Steiner, Vikram Ramasamy, David Tingdahl, and Roland Siegwart. nvblox: Gpu-accelerated incremental signed distance field mapping. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 2698–2705, 2024. 35

work page 2024
[65]

High-performance reinforcement learning on spot: Optimizing simulation parameters with distributional measures

AJ Miller, Fangzhou Yu, Michael Brauckmann, and Farbod Farshidian. High-performance reinforcement learning on spot: Optimizing simulation parameters with distributional measures. arXiv preprint arXiv:2504.17857, 2025. 29, 30

work page arXiv 2025
[66]

Orbit: A unified simulation framework for interactive robot learning environments

Mayank Mittal, Calvin Yu, Qinxi Yu, Jingzhou Liu, Nikita Rudin, David Hoeller, Jia Lin Yuan, Ritvik Singh, Yunrong Guo, Hammad Mazhar, et al. Orbit: A unified simulation framework for interactive robot learning environments. IEEE Robotics and Automation Letters (RA-L), 8(6):3740–3747, 2023. 2, 43

work page 2023
[67]

Sufia-bc: Generating high quality demonstration data for visuomotor policy learning in surgical subtasks

Masoud Moghani, Nigel Nelson, Mohamed Ghanem, Andres Diaz-Pinto, Kush Hari, Mahdi Azizian, Ken Goldberg, Sean Huver, and Animesh Garg. Sufia-bc: Generating high quality demonstration data for visuomotor policy learning in surgical subtasks. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2025. 36

work page 2025
[68]

Ray: A distributed framework for emerging ai applications

Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I Jordan, et al. Ray: A distributed framework for emerging ai applications. In 13th USENIX symposium on operating systems design and implementation (OSDI 18), pages 561–577, 2018. 24

work page 2018
[69]

Factory: Fast contact for robotic assembly

Yashraj Narang, Kier Storey, Iretiayo Akinola, Miles Macklin, Philipp Reist, Lukasz Wawrzyniak, Yunrong Guo, Adam Moravanszky, Gavriel State, Michelle Lu, et al. Factory: Fast contact for robotic assembly. Robotics: Science and Systems (RSS), 2022. 2, 19, 33

work page 2022
[70]

Newton: GPU-accelerated physics simulation for robotics and simulation research

Newton Contributors. Newton: GPU-accelerated physics simulation for robotics and simulation research. https://github.com/newton-physics/newton. Newton, a Series of LF Projects, LLC, Apache-2.0 License, accessed 2025-09-29. 3

work page 2025
[71]

Forge: Force-guided exploration for robust contact-rich manipulation under uncertainty

Michael Noseworthy, Bingjie Tang, Bowen Wen, Ankur Handa, Chad Kessens, Nicholas Roy, Dieter Fox, Fabio Ramos, Yashraj Narang, and Iretiayo Akinola. Forge: Force-guided exploration for robust contact-rich manipulation under uncertainty. IEEE Robotics and Automation Letters (RA-L), 2025. 19, 33, 41

work page 2025
[72]

NVIDIA Isaac for Healthcare

NVIDIA. NVIDIA Isaac for Healthcare. https://github.com/isaac-for-healthcare. Accessed 2025-09-29. 36

work page 2025
[73]

Solving Rubik's Cube with a Robot Hand

OpenAI, Ilge Akkaya, Marcin Andrychowicz, Maciek Chociej, Mateusz Litwin, Bob McGrew, Arthur Petron, Alex Paino, Matthias Plappert, Glenn Powell, Raphael Ribas, Jonas Schneider, Nikolas Tezak, Jerry Tworek, Peter Welinder, Lilian Weng, Qiming Yuan, Wojciech Zaremba, and Lei Zhang. Solving rubik’s cube with a robot hand, 2019. URL https://arxiv.org/abs/1910...

work page internal anchor Pith review arXiv 2019
[74]

Sim-to-Real Transfer of Robotic Control with Dynamics Randomization

Xue Bin Peng, Marcin Andrychowicz, Wojciech Zaremba, and Pieter Abbeel. Sim-to-real transfer of robotic control with dynamics randomization. CoRR, abs/1710.06537, 2017. URL http://arxiv.org/abs/1710.06537. 26

work page Pith review arXiv 2017
[75]

DexPBT: Scaling up dexterous manipulation for hand-arm systems with population based training

Aleksei Petrenko, Arthur Allshire, Gavriel State, Ankur Handa, and Viktor Makoviychuk. DexPBT: Scaling up dexterous manipulation for hand-arm systems with population based training. In RSS, 2023. URL https://arxiv.org/abs/2305.12127. 19, 26, 41

work page arXiv 2023
[76]

Universal scene description (openusd)

Pixar Animation Studios. Universal scene description (openusd), 2016. URL https://openusd.org. Accessed: 2025-09-16. 4

work page 2016
[77]

Whole-body end-effector pose tracking

Tifanny Portela, Andrei Cramariuc, Mayank Mittal, and Marco Hutter. Whole-body end-effector pose tracking. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2025. 31

work page 2025
[78]

Stable-baselines3: Reliable reinforcement learning implementations

Antonin Raffin, Ashley Hill, Adam Gleave, Anssi Kanervisto, Maximilian Ernestus, and Noah Dormann. Stable-baselines3: Reliable reinforcement learning implementations. Journal of Machine Learning Research (JMLR), 2021. 24

work page 2021
[79]

Reinforcement Learning Accelerates Humanoid Behavior Production

RAI Institute. Reinforcement Learning Accelerates Humanoid Behavior Production. https://www.youtube.com/watch?v=LQizdUn5Z1k, 2025. Accessed: 2025-09-22. 31, 32

work page 2025
[80]

Ultra Mobility Vehicle: Combining Wheeled Efficiency with Legged Agility

RAI Institute. Ultra Mobility Vehicle: Combining Wheeled Efficiency with Legged Agility. https://rai-inst.com/resources/blog/designing-wheeled-robotic-systems/, 2025. Accessed: 2025-09-03. 30

work page 2025
