Learning Dexterous In-Hand Manipulation

OpenAI, Marcin Andrychowicz, Bowen Baker, Maciek Chociej, Rafal Józefowicz, Bob McGrew, Jakub W · 2018 · cs.LG · arXiv 1808.00177

14 Pith papers cite this work. Polarity classification is still indexing.

abstract

We use reinforcement learning (RL) to learn dexterous in-hand manipulation policies which can perform vision-based object reorientation on a physical Shadow Dexterous Hand. The training is performed in a simulated environment in which we randomize many of the physical properties of the system like friction coefficients and an object's appearance. Our policies transfer to the physical robot despite being trained entirely in simulation. Our method does not rely on any human demonstrations, but many behaviors found in human manipulation emerge naturally, including finger gaiting, multi-finger coordination, and the controlled use of gravity. Our results were obtained using the same distributed RL system that was used to train OpenAI Five. We also include a video of our results: https://youtu.be/jwSbzNHGflM
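The randomization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the parameter names, the uniform sampling, and the 0.7 to 1.3 friction scale are assumptions chosen only to show the idea of drawing fresh physical and visual properties for every simulated episode.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class EpisodeParams:
    """Hypothetical per-episode simulator settings (names are illustrative)."""
    friction: float     # friction coefficient applied for this episode
    object_rgb: tuple   # object colour, each channel in [0, 1)


def sample_episode_params(rng, nominal_friction=1.0, friction_scale=(0.7, 1.3)):
    """Draw one randomized parameter set for a new training episode.

    The scale range is an assumed example value, not taken from the paper.
    """
    low, high = friction_scale
    friction = nominal_friction * rng.uniform(low, high)
    object_rgb = tuple(rng.random() for _ in range(3))
    return EpisodeParams(friction=friction, object_rgb=object_rgb)


# Seeded generator so the sketch is reproducible.
rng = random.Random(0)
episodes = [sample_episode_params(rng) for _ in range(5)]
for params in episodes:
    print(params)
```

A policy trained across many such draws never sees one fixed simulator, which is the intuition behind transferring to the physical robot without real-world training data.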

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

Benchmarking Model-Based Reinforcement Learning

cs.LG · 2019-07-03 · accept · novelty 7.0

Introduces a benchmark suite of over 18 MBRL environments, evaluates multiple algorithms under consistent settings, and identifies three core challenges: dynamics bottleneck, planning horizon dilemma, and early-termination dilemma.

Pose Estimation for Non-Cooperative Rendezvous Using Neural Networks

cs.CV · 2019-06-24 · unverdicted · novelty 7.0

SPN is a CNN that detects a spacecraft bounding box, classifies then regresses attitude, and optimizes position via Gauss-Newton, achieving degree-level attitude and cm-level position errors on real images after training only on synthetic data.

HiPolicy: Hierarchical Multi-Frequency Action Chunking for Policy Learning

cs.RO · 2026-04-07 · unverdicted · novelty 7.0

HiPolicy is a new hierarchical multi-frequency action chunking method for imitation learning that jointly generates coarse and fine action sequences with entropy-guided execution to improve performance and efficiency in robotic manipulation.

Distributionally Robust Control via Stein Variational Inference for Contact-Rich Manipulation

cs.RO · 2026-05-18 · unverdicted · novelty 6.0

Introduces a Stein variational inference-based deterministic formulation for distributionally robust control in contact-rich robotic manipulation, reporting up to 3x improved robustness under parametric uncertainty.

RoboMD: Uncovering Robot Vulnerabilities through Semantic Potential Fields

cs.RO · 2024-12-03 · unverdicted · novelty 6.0

A deep RL vulnerability-prediction policy trained in semantic embedding space finds up to 23% more unique robot manipulation failures than vision-language baselines and enables more efficient fine-tuning.

A Survey on Vision-Language-Action Models for Embodied AI

cs.RO · 2024-05-23 · unverdicted · novelty 6.0

This is the first survey on vision-language-action models, providing a taxonomy across three lines, plus summaries of datasets, simulators, benchmarks, challenges, and future directions in embodied AI.

RoboNet: Large-Scale Multi-Robot Learning

cs.RO · 2019-10-24 · conditional · novelty 6.0

RoboNet is a multi-robot video dataset that enables pre-training of vision-based manipulation models which, after fine-tuning on a new robot, outperform robot-specific training that uses 4-20 times more data.

Bayesian Optimization in Variational Latent Spaces with Dynamic Compression

cs.RO · 2019-07-10 · unverdicted · novelty 6.0

Sequential VAE embeds simulated trajectories into latent paths for Bayesian optimization with dynamic compression to enable data-efficient high-dimensional controller tuning on robots.

Generalizing from a few environments in safety-critical reinforcement learning

cs.LG · 2019-07-02 · unverdicted · novelty 6.0

RL agents fail dangerously on unseen environments; ensembles reduce catastrophes in gridworld but not CoinRun, with uncertainty enabling intervention prediction.

Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning

cs.RO · 2021-08-24 · conditional · novelty 6.0

Isaac Gym achieves 2-3 orders of magnitude faster robot policy training by keeping physics simulation and PyTorch-based RL entirely on GPU with direct buffer sharing.

Learning to Solve a Rubik's Cube with a Dexterous Hand

cs.RO · 2019-07-26 · unverdicted · novelty 5.0

Hierarchical RL combines a model-based cube solver with a model-free hand controller to solve Rubik's cubes in simulation, achieving 90.3% success on 1400 random scrambles.

ORRB -- OpenAI Remote Rendering Backend

cs.GR · 2019-06-26 · unverdicted · novelty 4.0

ORRB is an open-source remote rendering backend that pairs Unity3d with MuJoCo for high-throughput, customizable visual domain randomization in robotics environments.

On Multi-Agent Learning in Team Sports Games

cs.MA · 2019-06-25 · unverdicted · novelty 3.0

Describes a hierarchical RL method for multi-agent learning in team sports games aiming for human-like agents, reporting preliminary results that show promise.

DiscreteRTC: Discrete Diffusion Policies are Natural Asynchronous Executors

cs.RO · 2026-04-27

citing papers explorer

Showing 14 of 14 citing papers.

Benchmarking Model-Based Reinforcement Learning · cs.LG · 2019-07-03 · accept · none · ref 1 · internal anchor
Pose Estimation for Non-Cooperative Rendezvous Using Neural Networks · cs.CV · 2019-06-24 · unverdicted · none · ref 43 · internal anchor
HiPolicy: Hierarchical Multi-Frequency Action Chunking for Policy Learning · cs.RO · 2026-04-07 · unverdicted · none · ref 1
Distributionally Robust Control via Stein Variational Inference for Contact-Rich Manipulation · cs.RO · 2026-05-18 · unverdicted · none · ref 34 · internal anchor
RoboMD: Uncovering Robot Vulnerabilities through Semantic Potential Fields · cs.RO · 2024-12-03 · unverdicted · none · ref 36 · internal anchor
A Survey on Vision-Language-Action Models for Embodied AI · cs.RO · 2024-05-23 · unverdicted · none · ref 230 · internal anchor
RoboNet: Large-Scale Multi-Robot Learning · cs.RO · 2019-10-24 · conditional · none · ref 2 · internal anchor
Bayesian Optimization in Variational Latent Spaces with Dynamic Compression · cs.RO · 2019-07-10 · unverdicted · none · ref 1 · internal anchor
Generalizing from a few environments in safety-critical reinforcement learning · cs.LG · 2019-07-02 · unverdicted · none · ref 2 · internal anchor
Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning · cs.RO · 2021-08-24 · conditional · none · ref 5
Learning to Solve a Rubik's Cube with a Dexterous Hand · cs.RO · 2019-07-26 · unverdicted · none · ref 3 · internal anchor
ORRB -- OpenAI Remote Rendering Backend · cs.GR · 2019-06-26 · unverdicted · none · ref 10 · internal anchor
On Multi-Agent Learning in Team Sports Games · cs.MA · 2019-06-25 · unverdicted · none · ref 2 · internal anchor
DiscreteRTC: Discrete Diffusion Policies are Natural Asynchronous Executors · cs.RO · 2026-04-27 · unreviewed · ref 8
