An empirical investigation of the challenges of real-world reinforcement learning

Cosmin Paduraru; Daniel J. Mankowitz; Gabriel Dulac-Arnold; Jerry Li; Nir Levine; Sven Gowal; Todd Hester

arxiv: 2003.11881 · v2 · pith:QH3BTW33new · submitted 2020-03-24 · 💻 cs.LG · cs.AI

keywords challenges real-world learning series challenge proposed reinforcement some

Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is beginning to show some successes in real-world scenarios. However, many of the research advances in RL are hard to leverage in real-world systems because they rest on a series of assumptions that are rarely satisfied in practice. In this work, we identify and formalize a series of independent challenges that embody the difficulties that must be addressed for RL to be commonly deployed in real-world systems. For each challenge, we define it formally in the context of a Markov Decision Process, analyze its effects on state-of-the-art learning algorithms, and present some existing attempts at tackling it. We believe that an approach that addresses our set of proposed challenges would be readily deployable in a large number of real-world problems. Our proposed challenges are implemented in a suite of continuous control environments called the realworldrl-suite, which we propose as an open-source benchmark.

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

AtomComposer: Discovering Chemical Space from First Principles with Reinforcement Learning
cs.LG 2026-05 unverdicted novelty 8.0

AtomComposer uses online RL with multi-composition training to discover up to 10x more valid 3D isomers on unseen chemical formulas than single-composition baselines.

D4RL: Datasets for Deep Data-Driven Reinforcement Learning
cs.LG 2020-04 accept novelty 8.0

D4RL supplies new offline RL benchmarks and datasets from expert and mixed sources to expose weaknesses in existing algorithms and standardize evaluation.

Distributionally Robust Control via Stein Variational Inference for Contact-Rich Manipulation
cs.RO 2026-05 unverdicted novelty 6.0

Introduces a Stein variational inference-based deterministic formulation for distributionally robust control in contact-rich robotic manipulation, reporting up to 3x improved robustness under parametric uncertainty.

Learning to Adapt: Representation-Based Reinforcement Learning for Multi-Task Skill Transfer
cs.RO 2026-06 unverdicted novelty 4.0

RepMT-SAC uses spectral MDP decomposition to build a task-agnostic value-function core plus minimal task adjustment, yielding up to 30% better performance than baselines on quadcopter trajectory tasks with zero-shot i...