Introduces a benchmark suite of over 18 MBRL environments, evaluates multiple algorithms under consistent settings, and identifies three core challenges: dynamics bottleneck, planning horizon dilemma, and early-termination dilemma.
hub
Learning Dexterous In-Hand Manipulation
14 Pith papers cite this work. Polarity classification is still indexing.
abstract
We use reinforcement learning (RL) to learn dexterous in-hand manipulation policies which can perform vision-based object reorientation on a physical Shadow Dexterous Hand. The training is performed in a simulated environment in which we randomize many of the physical properties of the system like friction coefficients and an object's appearance. Our policies transfer to the physical robot despite being trained entirely in simulation. Our method does not rely on any human demonstrations, but many behaviors found in human manipulation emerge naturally, including finger gaiting, multi-finger coordination, and the controlled use of gravity. Our results were obtained using the same distributed RL system that was used to train OpenAI Five. We also include a video of our results: https://youtu.be/jwSbzNHGflM
hub tools
citation-role summary
citation-polarity summary
roles
background 3polarities
background 3representative citing papers
SPN is a CNN that detects a spacecraft bounding box, classifies then regresses attitude, and optimizes position via Gauss-Newton, achieving degree-level attitude and cm-level position errors on real images after training only on synthetic data.
HiPolicy is a new hierarchical multi-frequency action chunking method for imitation learning that jointly generates coarse and fine action sequences with entropy-guided execution to improve performance and efficiency in robotic manipulation.
Introduces a Stein variational inference-based deterministic formulation for distributionally robust control in contact-rich robotic manipulation, reporting up to 3x improved robustness under parametric uncertainty.
A deep RL vulnerability-prediction policy trained in semantic embedding space finds up to 23% more unique robot manipulation failures than vision-language baselines and enables more efficient fine-tuning.
This is the first survey on vision-language-action models, providing a taxonomy across three lines, plus summaries of datasets, simulators, benchmarks, challenges, and future directions in embodied AI.
RoboNet is a multi-robot video dataset that enables pre-training of vision-based manipulation models which, after fine-tuning on a new robot, outperform robot-specific training that uses 4-20 times more data.
Sequential VAE embeds simulated trajectories into latent paths for Bayesian optimization with dynamic compression to enable data-efficient high-dimensional controller tuning on robots.
RL agents fail dangerously on unseen environments; ensembles reduce catastrophes in gridworld but not CoinRun, with uncertainty enabling intervention prediction.
Isaac Gym achieves 2-3 orders of magnitude faster robot policy training by keeping physics simulation and PyTorch-based RL entirely on GPU with direct buffer sharing.
Hierarchical RL combines a model-based cube solver with a model-free hand controller to solve Rubik's cubes in simulation, achieving 90.3% success on 1400 random scrambles.
ORRB is an open-source remote rendering backend that pairs Unity3d with MuJoCo for high-throughput, customizable visual domain randomization in robotics environments.
Describes a hierarchical RL method for multi-agent learning in team sports games aiming for human-like agents, reporting preliminary results that show promise.
citing papers explorer
-
Benchmarking Model-Based Reinforcement Learning
Introduces a benchmark suite of over 18 MBRL environments, evaluates multiple algorithms under consistent settings, and identifies three core challenges: dynamics bottleneck, planning horizon dilemma, and early-termination dilemma.
-
Pose Estimation for Non-Cooperative Rendezvous Using Neural Networks
SPN is a CNN that detects a spacecraft bounding box, classifies then regresses attitude, and optimizes position via Gauss-Newton, achieving degree-level attitude and cm-level position errors on real images after training only on synthetic data.
-
HiPolicy: Hierarchical Multi-Frequency Action Chunking for Policy Learning
HiPolicy is a new hierarchical multi-frequency action chunking method for imitation learning that jointly generates coarse and fine action sequences with entropy-guided execution to improve performance and efficiency in robotic manipulation.
-
Distributionally Robust Control via Stein Variational Inference for Contact-Rich Manipulation
Introduces a Stein variational inference-based deterministic formulation for distributionally robust control in contact-rich robotic manipulation, reporting up to 3x improved robustness under parametric uncertainty.
-
RoboMD: Uncovering Robot Vulnerabilities through Semantic Potential Fields
A deep RL vulnerability-prediction policy trained in semantic embedding space finds up to 23% more unique robot manipulation failures than vision-language baselines and enables more efficient fine-tuning.
-
A Survey on Vision-Language-Action Models for Embodied AI
This is the first survey on vision-language-action models, providing a taxonomy across three lines, plus summaries of datasets, simulators, benchmarks, challenges, and future directions in embodied AI.
-
RoboNet: Large-Scale Multi-Robot Learning
RoboNet is a multi-robot video dataset that enables pre-training of vision-based manipulation models which, after fine-tuning on a new robot, outperform robot-specific training that uses 4-20 times more data.
-
Bayesian Optimization in Variational Latent Spaces with Dynamic Compression
Sequential VAE embeds simulated trajectories into latent paths for Bayesian optimization with dynamic compression to enable data-efficient high-dimensional controller tuning on robots.
-
Generalizing from a few environments in safety-critical reinforcement learning
RL agents fail dangerously on unseen environments; ensembles reduce catastrophes in gridworld but not CoinRun, with uncertainty enabling intervention prediction.
-
Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning
Isaac Gym achieves 2-3 orders of magnitude faster robot policy training by keeping physics simulation and PyTorch-based RL entirely on GPU with direct buffer sharing.
-
Learning to Solve a Rubik's Cube with a Dexterous Hand
Hierarchical RL combines a model-based cube solver with a model-free hand controller to solve Rubik's cubes in simulation, achieving 90.3% success on 1400 random scrambles.
-
ORRB -- OpenAI Remote Rendering Backend
ORRB is an open-source remote rendering backend that pairs Unity3d with MuJoCo for high-throughput, customizable visual domain randomization in robotics environments.
-
On Multi-Agent Learning in Team Sports Games
Describes a hierarchical RL method for multi-agent learning in team sports games aiming for human-like agents, reporting preliminary results that show promise.
- DiscreteRTC: Discrete Diffusion Policies are Natural Asynchronous Executors