
Let's Play Again: Variability of Deep Reinforcement Learning Agents in Atari Environments

3 Pith papers cite this work. Polarity classification is still indexing.

abstract

Reproducibility in reinforcement learning is challenging: uncontrolled stochasticity from many sources, such as the learning algorithm, the learned policy, and the environment itself, has led researchers to report the performance of learned agents using aggregate metrics of performance over multiple random seeds for a single environment. Unfortunately, there are still pernicious sources of variability in reinforcement learning agents that make common summary statistics an unsound metric for performance. Our experiments demonstrate the variability of common agents used in the popular OpenAI Baselines repository. We make the case for reporting post-training agent performance as a distribution, rather than a point estimate.
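The contrast between a point estimate and a distribution can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' evaluation code: the function name `summarize_returns`, the chosen percentiles, and the synthetic bimodal returns are all assumptions made for the example.

```python
import numpy as np


def summarize_returns(returns):
    """Describe post-training episode returns by percentiles, not just a mean.

    `returns` is a hypothetical 1-D sequence of episode returns collected
    from a trained agent (for example, across seeds or evaluation episodes).
    """
    r = np.asarray(returns, dtype=float)
    p5, p25, p50, p75, p95 = np.percentile(r, [5, 25, 50, 75, 95])
    return {
        "mean": float(r.mean()),        # the usual point estimate
        "std": float(r.std(ddof=1)),    # sample standard deviation
        "min": float(r.min()),
        "p5": float(p5),
        "p25": float(p25),
        "median": float(p50),
        "p75": float(p75),
        "p95": float(p95),
        "max": float(r.max()),
    }


# Synthetic, assumed data: half the episodes score near 20, half near 400.
rng = np.random.default_rng(0)
returns = np.concatenate([rng.normal(20, 2, 50), rng.normal(400, 30, 50)])

summary = summarize_returns(returns)
for name, value in summary.items():
    print(f"{name:>6}: {value:8.1f}")
```

With returns split between two modes like this, the mean falls in a region where almost no episode actually lands, while the quartiles expose both clusters. That is the kind of situation in which a single summary number can mislead.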

fields

cs.LG (2) · cs.RO (1)

years

2026 (1) · 2025 (2)

verdicts

UNVERDICTED 3

representative citing papers

RAPTOR: A Foundation Policy for Quadrotor Control

cs.RO · 2025-09-15 · unverdicted · novelty 6.0

A 2084-parameter recurrent policy trained by distilling 1000 RL teacher policies enables zero-shot control across 10 real quadrotors differing in mass, motors, frames, propellers, and flight controllers.

Performance Variation in Deep Reinforcement Learning

cs.LG · 2026-06-04 · unverdicted · novelty 4.0

Proposes min-max IPR and percentile highlighting to evaluate run-to-run performance variation in deep RL, with case studies on normalizations in PPO/SAC, algorithm comparisons, and DQN/Rainbow on Atari.

citing papers explorer

  • Performance Variation in Deep Reinforcement Learning cs.LG · 2026-06-04 · unverdicted · none · ref 2 · internal anchor