Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning

Viktor Makoviychuk, Lukasz Wawrzyniak, Yunrong Guo, Michelle Lu, Kier Storey, Miles Macklin · 2021 · cs.RO · arXiv 2108.10470

Canonical reference. 76% of citing Pith papers cite this work as background.

58 Pith papers citing it

abstract

Isaac Gym offers a high-performance learning platform to train policies for a wide variety of robotics tasks directly on the GPU. Both the physics simulation and the neural network policy training reside on the GPU and communicate by passing data directly from physics buffers to PyTorch tensors, without going through any CPU bottlenecks. This leads to blazing fast training times for complex robotics tasks on a single GPU, with improvements of 2-3 orders of magnitude over conventional RL training that uses a CPU-based simulator and a GPU for neural networks. We host the results and videos at https://sites.google.com/view/isaacgym-nvidia, and Isaac Gym can be downloaded at https://developer.nvidia.com/isaac-gym.

citation-role summary

background 13 · dataset 4

citation-polarity summary

background 13 · use dataset 3 · unclear 1

representative citing papers

Self-Supervised On-Policy Reinforcement Learning via Contrastive Proximal Policy Optimisation

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

CPPO is an on-policy contrastive RL method that derives advantages from contrastive Q-values for PPO optimization, outperforming prior CRL baselines in 14/18 tasks and matching or exceeding reward-based PPO in 12/18 tasks.

Coordinated Diffusion: Generating Multi-Agent Behavior Without Multi-Agent Demonstrations

cs.RO · 2026-05-12 · unverdicted · novelty 7.0

CoDi decomposes the multi-agent diffusion score into pre-trained single-agent policies plus a gradient-free cost guidance term to generate coordinated behavior from single-agent data alone.

Dynamic Full-body Motion Agent with Object Interaction via Blending Pre-trained Modular Controllers

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

A two-stage framework augments HOI data with dynamic priors and blends pre-trained dynamic motion and static interaction agents via a composer network to enable long-term dynamic human-object interactions with higher success rates and reduced training time.

HiPAN: Hierarchical Posture-Adaptive Navigation for Quadruped Robots in Unstructured 3D Environments

cs.RO · 2026-04-29 · unverdicted · novelty 7.0

HiPAN enables quadruped robots to navigate unstructured 3D environments more successfully by combining a high-level posture-adaptive policy with a low-level controller and curriculum learning on depth images.

HANDFUL: Sequential Grasp-Conditioned Dexterous Manipulation with Resource Awareness

cs.RO · 2026-04-28 · unverdicted · novelty 7.0

HANDFUL learns resource-aware grasps using finger contact rewards and curriculum learning to improve success on sequential dexterous tasks in simulation and on a real LEAP hand.

Imagine2Real: Towards Zero-shot Humanoid-Object Interaction via Video Generative Priors

cs.RO · 2026-05-21 · unverdicted · novelty 6.0

Imagine2Real enables zero-shot humanoid-object interaction by unifying motions as 4D point trajectories, tracking only base/hands/object keypoints inside a BFM latent space, and training with progressive simple rewards for mocap deployment.

ARC-RL: A Reinforcement Learning Playground Inspired by ARC Raiders

cs.RO · 2026-05-19 · accept · novelty 6.0 · 2 refs

ARC-RL is a new suite of four MuJoCo continuous-control environments featuring game-inspired hexapod and quadruped morphologies, a single closed-form multi-component reward function, CPG demonstrators, and empirical comparisons of online and offline-to-online RL algorithms.

Global Convergence of Sampling-Based Nonconvex Optimization through Diffusion-Style Smoothing

cs.LG · 2026-05-15 · unverdicted · novelty 6.0

Recasts sampling-based nonconvex optimization as smoothed gradient descent to obtain non-asymptotic convergence guarantees and introduces the DIDA annealed algorithm that converges to the global optimum.

SECOND-Grasp: Semantic Contact-guided Dexterous Grasping

cs.RO · 2026-05-13 · conditional · novelty 6.0

SECOND-Grasp integrates semantic contact proposals from vision-language reasoning with geometric refinement to achieve 98%+ lifting success and improved intent-aware grasping on seen and unseen objects.

NavOL: Navigation Policy with Online Imitation Learning

cs.RO · 2026-05-12 · unverdicted · novelty 6.0

NavOL collects expert trajectory labels online from a global planner during policy rollouts in simulation to train a diffusion navigation policy, mitigating distribution shift and improving performance on visual navigation tasks.

Explicit Stair Geometry Conditioning for Robust Humanoid Locomotion

cs.RO · 2026-05-11 · unverdicted · novelty 6.0

Explicit conditioning of a PPO policy on interpretable stair parameters (height, depth, yaw) yields improved generalization to unseen stairs and reliable real-world traversal on the Unitree G1, including 33 consecutive outdoor steps.

Zero-Shot Sim-to-Real Robot Learning: A Dexterous Manipulation Study on Reactive Catching

cs.RO · 2026-05-10 · unverdicted · novelty 6.0

DRIS improves zero-shot sim-to-real transfer for reactive catching by maintaining and acting on sets of randomized dynamics instances instead of single instances per episode.

RigidFormer: Learning Rigid Dynamics using Transformers

cs.CV · 2026-05-09 · unverdicted · novelty 6.0

RigidFormer learns mesh-free rigid dynamics from point clouds using object-centric anchors, Anchor-Vertex Pooling, Anchor-based RoPE, and differentiable Kabsch alignment to enforce rigidity.

ANO: A Principled Approach to Robust Policy Optimization

cs.AI · 2026-05-04 · unverdicted · novelty 6.0

ANO derives a robust policy optimizer from geometric principles that replaces clipping with a smooth redescending gradient, showing better performance and stability than PPO, SPO, and GRPO in MuJoCo, Atari, and RLHF experiments.

GS-Playground: A High-Throughput Photorealistic Simulator for Vision-Informed Robot Learning

cs.RO · 2026-04-28 · unverdicted · novelty 6.0

GS-Playground delivers a high-throughput photorealistic simulator for vision-informed robot learning via parallel physics integrated with batch 3D Gaussian Splatting at 10^4 FPS and an automated Real2Sim workflow for consistent environments.

dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model

cs.RO · 2026-04-24 · unverdicted · novelty 6.0

A discrete diffusion model tokenizes multimodal robotic data and uses a progress token to predict future states and task completion for scalable policy evaluation.

Learn Weightlessness: Imitate Non-Self-Stabilizing Motions on Humanoid Robot

cs.RO · 2026-04-23 · unverdicted · novelty 6.0

The Weightlessness Mechanism lets humanoid robots imitate non-self-stabilizing motions by dynamically relaxing specific joints to exploit passive environmental contacts, generalizing from single demonstrations to varied setups.

ETac: A Lightweight and Efficient Tactile Simulation Framework for Learning Dexterous Manipulation

cs.RO · 2026-04-22 · unverdicted · novelty 6.0

ETac is a data-driven tactile simulation framework that matches FEM deformation accuracy at high speed, supporting 4096 parallel environments at 869 FPS and yielding 84.45% success in blind grasping across four object types.

FLASH: Fast Learning via GPU-Accelerated Simulation for High-Fidelity Deformable Manipulation in Minutes

cs.RO · 2026-04-19 · unverdicted · novelty 6.0

A new GPU-accelerated deformable simulation framework trains manipulation policies in minutes using only synthetic data, achieving robust zero-shot transfer to physical robots.

Chain of Uncertain Rewards with Large Language Models for Reinforcement Learning

cs.LG · 2026-04-15 · unverdicted · novelty 6.0

CoUR uses LLMs for efficient RL reward design through uncertainty quantification and similarity selection, achieving better performance and lower evaluation costs on IsaacGym and Bidexterous Manipulation benchmarks.

Trajectory-based actuator identification via differentiable simulation

cs.RO · 2026-04-11 · unverdicted · novelty 6.0

Differentiable simulation enables torque-sensor-free actuator model identification from trajectory data, achieving 1.88x better position tracking than a stand-trained baseline and 46% longer travel in downstream locomotion policies.

FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control

cs.LG · 2026-04-06 · unverdicted · novelty 6.0 · 2 refs

FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.

Veo-Act: How Far Can Frontier Video Models Advance Generalizable Robot Manipulation?

cs.RO · 2026-04-06 · unverdicted · novelty 6.0

Veo-3 video predictions enable approximate task-level robot trajectories in zero-shot settings but require hierarchical integration with low-level VLA policies for reliable manipulation performance.

Physically Accurate Rigid-Body Dynamics in Particle-Based Simulation

cs.RO · 2026-03-15 · unverdicted · novelty 6.0

PBD-R adds a momentum-conservation constraint to position-based dynamics to deliver physically accurate rigid-body dynamics while remaining computationally lighter than MuJoCo.

citing papers explorer

Showing 50 of 58 citing papers.

Self-Supervised On-Policy Reinforcement Learning via Contrastive Proximal Policy Optimisation cs.LG · 2026-05-13 · unverdicted · none · ref 13 · internal anchor
CPPO is an on-policy contrastive RL method that derives advantages from contrastive Q-values for PPO optimization, outperforming prior CRL baselines in 14/18 tasks and matching or exceeding reward-based PPO in 12/18 tasks.
Coordinated Diffusion: Generating Multi-Agent Behavior Without Multi-Agent Demonstrations cs.RO · 2026-05-12 · unverdicted · none · ref 42 · internal anchor
CoDi decomposes the multi-agent diffusion score into pre-trained single-agent policies plus a gradient-free cost guidance term to generate coordinated behavior from single-agent data alone.
Dynamic Full-body Motion Agent with Object Interaction via Blending Pre-trained Modular Controllers cs.CV · 2026-05-12 · unverdicted · none · ref 31 · internal anchor
A two-stage framework augments HOI data with dynamic priors and blends pre-trained dynamic motion and static interaction agents via a composer network to enable long-term dynamic human-object interactions with higher success rates and reduced training time.
HiPAN: Hierarchical Posture-Adaptive Navigation for Quadruped Robots in Unstructured 3D Environments cs.RO · 2026-04-29 · unverdicted · none · ref 40 · internal anchor
HiPAN enables quadruped robots to navigate unstructured 3D environments more successfully by combining a high-level posture-adaptive policy with a low-level controller and curriculum learning on depth images.
HANDFUL: Sequential Grasp-Conditioned Dexterous Manipulation with Resource Awareness cs.RO · 2026-04-28 · unverdicted · none · ref 17 · internal anchor
HANDFUL learns resource-aware grasps using finger contact rewards and curriculum learning to improve success on sequential dexterous tasks in simulation and on a real LEAP hand.
Imagine2Real: Towards Zero-shot Humanoid-Object Interaction via Video Generative Priors cs.RO · 2026-05-21 · unverdicted · none · ref 70 · internal anchor
Imagine2Real enables zero-shot humanoid-object interaction by unifying motions as 4D point trajectories, tracking only base/hands/object keypoints inside a BFM latent space, and training with progressive simple rewards for mocap deployment.
ARC-RL: A Reinforcement Learning Playground Inspired by ARC Raiders cs.RO · 2026-05-19 · accept · none · ref 16 · 2 links · internal anchor
ARC-RL is a new suite of four MuJoCo continuous-control environments featuring game-inspired hexapod and quadruped morphologies, a single closed-form multi-component reward function, CPG demonstrators, and empirical comparisons of online and offline-to-online RL algorithms.
Global Convergence of Sampling-Based Nonconvex Optimization through Diffusion-Style Smoothing cs.LG · 2026-05-15 · unverdicted · none · ref 145 · internal anchor
Recasts sampling-based nonconvex optimization as smoothed gradient descent to obtain non-asymptotic convergence guarantees and introduces the DIDA annealed algorithm that converges to the global optimum.
SECOND-Grasp: Semantic Contact-guided Dexterous Grasping cs.RO · 2026-05-13 · conditional · none · ref 23 · internal anchor
SECOND-Grasp integrates semantic contact proposals from vision-language reasoning with geometric refinement to achieve 98%+ lifting success and improved intent-aware grasping on seen and unseen objects.
NavOL: Navigation Policy with Online Imitation Learning cs.RO · 2026-05-12 · unverdicted · none · ref 8 · internal anchor
NavOL collects expert trajectory labels online from a global planner during policy rollouts in simulation to train a diffusion navigation policy, mitigating distribution shift and improving performance on visual navigation tasks.
Explicit Stair Geometry Conditioning for Robust Humanoid Locomotion cs.RO · 2026-05-11 · unverdicted · none · ref 14 · internal anchor
Explicit conditioning of a PPO policy on interpretable stair parameters (height, depth, yaw) yields improved generalization to unseen stairs and reliable real-world traversal on the Unitree G1, including 33 consecutive outdoor steps.
Zero-Shot Sim-to-Real Robot Learning: A Dexterous Manipulation Study on Reactive Catching cs.RO · 2026-05-10 · unverdicted · none · ref 27 · internal anchor
DRIS improves zero-shot sim-to-real transfer for reactive catching by maintaining and acting on sets of randomized dynamics instances instead of single instances per episode.
RigidFormer: Learning Rigid Dynamics using Transformers cs.CV · 2026-05-09 · unverdicted · none · ref 24 · internal anchor
RigidFormer learns mesh-free rigid dynamics from point clouds using object-centric anchors, Anchor-Vertex Pooling, Anchor-based RoPE, and differentiable Kabsch alignment to enforce rigidity.
ANO: A Principled Approach to Robust Policy Optimization cs.AI · 2026-05-04 · unverdicted · none · ref 17 · internal anchor
ANO derives a robust policy optimizer from geometric principles that replaces clipping with a smooth redescending gradient, showing better performance and stability than PPO, SPO, and GRPO in MuJoCo, Atari, and RLHF experiments.
GS-Playground: A High-Throughput Photorealistic Simulator for Vision-Informed Robot Learning cs.RO · 2026-04-28 · unverdicted · none · ref 34 · internal anchor
GS-Playground delivers a high-throughput photorealistic simulator for vision-informed robot learning via parallel physics integrated with batch 3D Gaussian Splatting at 10^4 FPS and an automated Real2Sim workflow for consistent environments.
dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model cs.RO · 2026-04-24 · unverdicted · none · ref 27 · internal anchor
A discrete diffusion model tokenizes multimodal robotic data and uses a progress token to predict future states and task completion for scalable policy evaluation.
Learn Weightlessness: Imitate Non-Self-Stabilizing Motions on Humanoid Robot cs.RO · 2026-04-23 · unverdicted · none · ref 20 · internal anchor
The Weightlessness Mechanism lets humanoid robots imitate non-self-stabilizing motions by dynamically relaxing specific joints to exploit passive environmental contacts, generalizing from single demonstrations to varied setups.
ETac: A Lightweight and Efficient Tactile Simulation Framework for Learning Dexterous Manipulation cs.RO · 2026-04-22 · unverdicted · none · ref 20 · internal anchor
ETac is a data-driven tactile simulation framework that matches FEM deformation accuracy at high speed, supporting 4096 parallel environments at 869 FPS and yielding 84.45% success in blind grasping across four object types.
FLASH: Fast Learning via GPU-Accelerated Simulation for High-Fidelity Deformable Manipulation in Minutes cs.RO · 2026-04-19 · unverdicted · none · ref 24 · internal anchor
A new GPU-accelerated deformable simulation framework trains manipulation policies in minutes using only synthetic data, achieving robust zero-shot transfer to physical robots.
Chain of Uncertain Rewards with Large Language Models for Reinforcement Learning cs.LG · 2026-04-15 · unverdicted · none · ref 11 · internal anchor
CoUR uses LLMs for efficient RL reward design through uncertainty quantification and similarity selection, achieving better performance and lower evaluation costs on IsaacGym and Bidexterous Manipulation benchmarks.
Trajectory-based actuator identification via differentiable simulation cs.RO · 2026-04-11 · unverdicted · none · ref 14 · internal anchor
Differentiable simulation enables torque-sensor-free actuator model identification from trajectory data, achieving 1.88x better position tracking than a stand-trained baseline and 46% longer travel in downstream locomotion policies.
FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control cs.LG · 2026-04-06 · unverdicted · none · ref 51 · 2 links · internal anchor
FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.
Veo-Act: How Far Can Frontier Video Models Advance Generalizable Robot Manipulation? cs.RO · 2026-04-06 · unverdicted · none · ref 32 · internal anchor
Veo-3 video predictions enable approximate task-level robot trajectories in zero-shot settings but require hierarchical integration with low-level VLA policies for reliable manipulation performance.
Physically Accurate Rigid-Body Dynamics in Particle-Based Simulation cs.RO · 2026-03-15 · unverdicted · none · ref 16 · internal anchor
PBD-R adds a momentum-conservation constraint to position-based dynamics to deliver physically accurate rigid-body dynamics while remaining computationally lighter than MuJoCo.
PTLD: Sim-to-real Privileged Tactile Latent Distillation for Dexterous Manipulation cs.RO · 2026-03-04 · unverdicted · none · ref 59 · internal anchor
PTLD distills real privileged tactile data into a state estimator to boost sim-to-real performance of proprioceptive dexterous manipulation policies, yielding 182% improvement on in-hand rotation and 57% on reorientation tasks.
One Hand to Rule Them All: Canonical Representations for Unified Dexterous Manipulation cs.RO · 2026-02-18 · unverdicted · none · ref 16 · internal anchor
A unified parameter space and canonical URDF enable cross-embodiment dexterous grasping policies with 81.9% zero-shot success on unseen hands like the 3-finger LEAP Hand.
Semantic-Contact Fields for Category-Level Generalizable Tactile Tool Manipulation cs.RO · 2026-02-14 · unverdicted · none · ref 20 · internal anchor
SCFields fuses semantics and contact data in a sim-to-real pipeline to enable category-level generalization for tactile tool manipulation with diffusion policies.
Phase-Aware Policy Learning for Skateboard Riding of Quadruped Robots via Feature-wise Linear Modulation cs.RO · 2026-02-10 · unverdicted · none · ref 33 · internal anchor
PAPL uses phase-conditioned FiLM layers in RL networks to create a unified policy for quadruped robots to ride skateboards by capturing phase-dependent behaviors while sharing knowledge across phases.
HUSKY: Humanoid Skateboarding System via Physics-Aware Whole-Body Control cs.RO · 2026-02-03 · conditional · none · ref 22 · internal anchor
HUSKY combines humanoid-skateboard dynamics modeling with adversarial motion priors and physics-guided lean-to-steer strategies to achieve real-world stable skateboarding on a humanoid robot.
Learning to Plan, Planning to Learn: Adaptive Hierarchical RL-MPC for Sample-Efficient Decision Making cs.LG · 2025-12-18 · unverdicted · none · ref 7 · internal anchor
An adaptive RL-MPC framework uses RL to inform MPPI sampling and aggregates MPPI samples for value estimation, delivering up to 72% higher success rates and 2.1x faster convergence on tasks like race driving and Lunar Lander with obstacles.
Unify Robot Actions in Camera Frame cs.RO · 2025-11-21 · conditional · none · ref 27 · internal anchor
CalibAll estimates camera extrinsics on existing datasets to convert robot actions into a unified camera-frame representation, enabling stronger cross-embodiment pretraining.
Humanoid Whole-Body Badminton via Multi-Stage Reinforcement Learning cs.RO · 2025-11-14 · unverdicted · none · ref 28 · internal anchor
A multi-stage RL curriculum produces a unified whole-body controller that enables humanoid robots to sustain badminton rallies in simulation and to return shuttles at up to 19.1 m/s on real hardware, with both EKF-based and prediction-free variants.
Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation cs.RO · 2025-08-07 · unverdicted · none · ref 22 · internal anchor
Genie Envisioner unifies robotic policy learning, simulation, and evaluation inside one instruction-conditioned video diffusion framework using GE-Base, GE-Act, and GE-Sim.
A Reconfigured Wheel-Legged Robot for Enhanced Steering and Adaptability cs.RO · 2025-07-30 · conditional · none · ref 30 · internal anchor
FLORES is a wheel-legged robot with front-leg hip-yaw DoFs replacing hip-roll, paired with a custom RL controller using adapted HIM and tailored rewards for smooth wheeled-to-legged transitions and efficient gaits.
AnyPos: Automated Task-Agnostic Actions for Bimanual Manipulation cs.CV · 2025-07-17 · unverdicted · none · ref 27 · internal anchor
AnyPos automates task-agnostic action collection and inverse-dynamics modeling with arm/end-effector decoupling plus a direction-aware decoder, delivering 51% higher test accuracy and 30-40% better success rates on bimanual tasks.
DreamPolicy: A Unified World-model Policy for Scalable Humanoid Locomotion cs.RO · 2025-05-24 · unverdicted · none · ref 74 · internal anchor
DreamPolicy integrates an autoregressive diffusion world model with policy learning to produce a single scalable policy that generalizes to unseen composite terrains for humanoid locomotion.
The Hive Mind is a Single Reinforcement Learning Agent cs.MA · 2024-10-23 · unverdicted · none · ref 34 · internal anchor
A bee hive mind arising from weighted voter imitation is equivalent to a single RL agent that uses a new multi-armed bandit rule called Maynard-Cross Learning.
Diffusion Policy Policy Optimization cs.RO · 2024-09-01 · unverdicted · none · ref 59 · internal anchor
DPPO fine-tunes diffusion policies via policy gradients and outperforms prior RL approaches for diffusion policies and PG-tuned alternatives on robot benchmarks while enabling stable training and hardware deployment.
SimART: A Unified and Open Real-world Multimodal Simulation Platform for 6G Integrated Sensing and Communication eess.SP · 2026-05-13 · unverdicted · none · ref 8 · internal anchor
SimART is an open platform that unifies robotics, ray tracing, and wireless tools via ROS for reproducible multimodal simulation in 6G integrated sensing and communication.
OrbiSim: World Models as Differentiable Physics Engines for Embodied Intelligence cs.RO · 2026-05-12 · unverdicted · none · ref 26 · internal anchor
OrbiSim builds a differentiable physics engine from world models to support gradient-based policy optimization and contact modeling in robotics.
REAP: Reinforcement-Learning End-to-End Autonomous Parking with Gaussian Splatting Simulator for Real2Sim2Real Transfer cs.RO · 2026-05-09 · unverdicted · none · ref 16 · internal anchor
REAP trains an end-to-end SAC policy with behavior cloning and collision penalties inside a 3DGS Real2Sim simulator and transfers it to physical vehicles, succeeding in narrow mechanical parking slots.
Finite-Step Invariant Sets for Hybrid Systems with Probabilistic Guarantees eess.SY · 2026-04-06 · unverdicted · none · ref 33 · internal anchor
A sampling-based optimization framework computes finite-step invariant ellipsoids for hybrid system return maps with user-specified probabilistic guarantees on invariance.
From Pixels to Digital Agents: An Empirical Study on the Taxonomy and Technological Trends of Reinforcement Learning Environments cs.AI · 2026-03-25 · unverdicted · none · ref 86 · internal anchor
An empirical literature analysis reveals a bifurcation in RL environments into Semantic Prior (LLM-dominated) and Domain-Specific Generalization ecosystems with distinct cognitive fingerprints.
UniCon: A Unified System for Efficient Robot Learning Transfers cs.RO · 2026-01-21 · unverdicted · none · ref 16 · internal anchor
UniCon standardizes states and control logic into modular execution graphs for efficient transfer of learning controllers across heterogeneous robots, with lower latency than ROS.
Towards Adaptive Humanoid Control via Multi-Behavior Distillation and Reinforced Fine-Tuning cs.RO · 2025-11-09 · unverdicted · none · ref 23 · internal anchor
A two-stage distillation plus reinforced fine-tuning approach produces a single humanoid locomotion controller that adapts across skills and irregular terrains.
Learning Geometry-Aware Nonprehensile Pushing and Pulling with Dexterous Hands cs.RO · 2025-09-22 · unverdicted · none · ref 25 · internal anchor
GD2P generates and learns dexterous hand poses for nonprehensile pushing and pulling by combining contact-guided sampling, physics-based filtering, and a geometry-conditioned diffusion model, demonstrated on Allegro and LEAP hands in real-world tests.
Relative Entropy Pathwise Policy Optimization cs.LG · 2025-07-15 · unverdicted · none · ref 9 · internal anchor
REPPO is an on-policy RL method that combines pathwise policy gradients with relative entropy constraints to achieve stable training and high sample efficiency without replay buffers.
Unreal Robotics Lab: A High-Fidelity Robotics Simulator with Advanced Physics and Rendering cs.RO · 2025-04-19 · unverdicted · none · ref 13 · internal anchor
Unreal Robotics Lab integrates Unreal Engine rendering with MuJoCo physics to enable high-fidelity simulation for robotics perception, control, and benchmarking under diverse conditions.
MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations cs.RO · 2023-10-26 · unverdicted · none · ref 52 · internal anchor
MimicGen creates over 50K robot demonstrations from roughly 200 human ones, allowing imitation learning to achieve strong performance on complex long-horizon tasks like assembly and coffee preparation.
Energy-Efficient Quadruped Locomotion with Compliant Feet cs.RO · 2026-05-14 · unverdicted · none · ref 53 · internal anchor
Tuned foot compliance in quadruped robots lowers locomotion energy consumption by roughly 17 percent relative to rigid or overly soft designs.
