Closing the Sim-to-Real Loop: Adapting Simulation Randomization with Real World Experience

2018 · cs.RO · arXiv 1810.05687

2 Pith papers cite this work. Polarity classification is still indexing.


abstract

We consider the problem of transferring policies to the real world by training on a distribution of simulated scenarios. Rather than manually tuning the randomization of simulations, we adapt the simulation parameter distribution using a few real world roll-outs interleaved with policy training. In doing so, we are able to change the distribution of simulations to improve the policy transfer by matching the policy behavior in simulation and the real world. We show that policies trained with our method are able to reliably transfer to different robots in two real world tasks: swing-peg-in-hole and opening a cabinet drawer. The video of our experiments can be found at https://sites.google.com/view/simopt
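The adaptation loop described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the 1-D point-mass simulator, the single `damping` parameter, the Gaussian parameter distribution, and the cross-entropy-style update are stand-ins, since the abstract does not specify the update rule. The policy is a fixed constant force, so the policy-training step that the method interleaves with adaptation is omitted here.

```python
import numpy as np


def rollout(damping, steps=50, dt=0.05, force=1.0):
    """Simulate a 1-D point mass under a constant force; return the velocity trace."""
    v = 0.0
    vs = np.empty(steps)
    for t in range(steps):
        v += dt * (force - damping * v)
        vs[t] = v
    return vs


def adapt_distribution(real_obs, mean, std, iters=20, samples=64,
                       elite_frac=0.25, seed=0):
    """Shift a Gaussian over the simulator parameter toward values whose
    simulated roll-outs match the observed real roll-out.

    Uses a cross-entropy-style update (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    n_elite = max(2, int(samples * elite_frac))
    for _ in range(iters):
        # Sample candidate simulator parameters from the current distribution.
        candidates = np.clip(rng.normal(mean, std, size=samples), 0.0, None)
        # Discrepancy between simulated and real behavior for each candidate.
        costs = np.array([np.sum((rollout(c) - real_obs) ** 2) for c in candidates])
        # Refit the distribution to the best-matching candidates.
        elite = candidates[np.argsort(costs)[:n_elite]]
        mean, std = elite.mean(), elite.std() + 1e-3  # small floor keeps exploring
    return mean, std


# Hypothetical "real world": a rollout generated with a damping value
# that the adaptation loop never sees directly.
TRUE_DAMPING = 2.0
real_obs = rollout(TRUE_DAMPING)

mean, std = adapt_distribution(real_obs, mean=0.5, std=1.0)
print(f"adapted damping distribution: mean={mean:.3f}, std={std:.4f}")
```

In this toy setting the distribution starts far from the hidden value and narrows around it as simulated and observed trajectories are matched. In a fuller version, each iteration would also retrain the policy on the updated distribution and collect fresh real roll-outs with it.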

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Solving Rubik's Cube with a Robot Hand

cs.LG · 2019-10-16 · accept · novelty 7.0

Reinforcement learning models trained only in simulation using automatic domain randomization solve Rubik's cube with a real robot hand.

Bayesian Optimization in Variational Latent Spaces with Dynamic Compression

cs.RO · 2019-07-10 · unverdicted · novelty 6.0

Sequential VAE embeds simulated trajectories into latent paths for Bayesian optimization with dynamic compression to enable data-efficient high-dimensional controller tuning on robots.

citing papers explorer

Showing 2 of 2 citing papers.

Solving Rubik's Cube with a Robot Hand

cs.LG · 2019-10-16 · accept · none · ref 14 · internal anchor

Reinforcement learning models trained only in simulation using automatic domain randomization solve Rubik's cube with a real robot hand.

Bayesian Optimization in Variational Latent Spaces with Dynamic Compression

cs.RO · 2019-07-10 · unverdicted · none · ref 23 · internal anchor

Sequential VAE embeds simulated trajectories into latent paths for Bayesian optimization with dynamic compression to enable data-efficient high-dimensional controller tuning on robots.
