Assessing Generalization in Deep Reinforcement Learning

Charles Packer; Dawn Song; Jernej Kos; Katelyn Gao; Philipp Krähenbühl; Vladlen Koltun

arXiv: 1810.12282 · v2 · pith:QKIHKSQ5new · submitted 2018-10-29 · 💻 cs.LG · cs.AI · stat.ML

Charles Packer, Katelyn Gao, Jernej Kos, Philipp Krähenbühl, Vladlen Koltun, Dawn Song

classification 💻 cs.LG cs.AI stat.ML

keywords generalization, deep, algorithms, evaluation, experimental, generalize, learning, reinforcement

Deep reinforcement learning (RL) has achieved breakthrough results on many tasks, but agents often fail to generalize beyond the environment they were trained in. As a result, deep RL algorithms that promote generalization are receiving increasing attention. However, works in this area use a wide variety of tasks and experimental setups for evaluation. The literature lacks a controlled assessment of the merits of different generalization schemes. Our aim is to catalyze community-wide progress on generalization in deep RL. To this end, we present a benchmark and experimental protocol, and conduct a systematic empirical study. Our framework contains a diverse set of environments, our methodology covers both in-distribution and out-of-distribution generalization, and our evaluation includes deep RL algorithms that specifically tackle generalization. Our key finding is that "vanilla" deep RL algorithms generalize better than specialized schemes that were proposed specifically to tackle generalization.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Why Does Agentic Safety Fail to Generalize Across Tasks?
cs.LG 2026-05 conditional novelty 6.0

Agentic safety fails to generalize across tasks because the task-to-safe-controller mapping has a higher Lipschitz constant than the task-to-controller mapping alone, as proven in linear-quadratic control and demonstr...

Visualizing Critic Match Loss Landscapes for Interpretation of Online Reinforcement Learning Control Algorithms
cs.LG 2026-03 unverdicted novelty 6.0

A projection-based visualization of critic match loss landscapes that reveals optimization paths and stability characteristics in online actor-critic reinforcement learning.

Reasoning and Generalization in RL: A Tool Use Perspective
cs.NE 2019-07 unverdicted novelty 5.0

Proposes a tool-use inspired framework with multiple test sets to measure specified types of generalization in RL.

Enhancing RL Generalizability in Robotics through SHAP Analysis of Algorithms and Hyperparameters
cs.LG 2026-05 unverdicted novelty 4.0

SHAP analysis of RL algorithms and hyperparameters reveals consistent impact patterns that enable guided configuration selection for improved generalization in robotic environments.

Mission-Aligned Learning-Informed Control of Autonomous Systems: Formulation and Foundations
math.OC 2025-07 unverdicted novelty 4.0

The paper formulates a two-level optimization scheme integrating control, classical planning, and reinforcement learning to improve safety and interpretability in autonomous systems.