arXiv: 1712.09381 · v4 · pith:DZW7J67Unew · submitted 2017-12-26 · 💻 cs.AI · cs.DC · cs.LG

RLlib: Abstractions for Distributed Reinforcement Learning

classification 💻 cs.AI cs.DC cs.LG
keywords rllib algorithms computation distributed learning primitives reinforcement abstractions
Reinforcement learning (RL) algorithms involve the deep nesting of highly irregular computation patterns, each of which typically exhibits opportunities for distributed computation. We argue for distributing RL components in a composable way by adapting algorithms for top-down hierarchical control, thereby encapsulating parallelism and resource requirements within short-running compute tasks. We demonstrate the benefits of this principle through RLlib: a library that provides scalable software primitives for RL. These primitives enable a broad range of algorithms to be implemented with high performance, scalability, and substantial code reuse. RLlib is available at https://rllib.io/.

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Scalable Option Learning in High-Throughput Environments

    cs.LG 2025-08 unverdicted novelty 6.0

    SOL is a new hierarchical RL algorithm that reaches 35x higher throughput and outperforms flat agents when trained on 30 billion frames in NetHack while showing positive scaling.

  2. RAMP: Hybrid DRL for Online Learning of Numeric Action Models

    cs.AI 2026-04 unverdicted novelty 5.0

    RAMP learns numeric action models online via a DRL-planning feedback loop and outperforms PPO on IPC numeric domains in solvability and plan quality.

  3. Gym-V: A Unified Vision Environment System for Agentic Vision Research

    cs.CV 2026-03 unverdicted novelty 5.0

    Gym-V supplies 179 visual environments showing that observation scaffolding like captions and rules matters more for training success than the choice of RL algorithm.

  4. Reinforcement learning for adaptive interior point methods in convex quadratic programming

    math.OC 2025-09 unverdicted novelty 5.0

    Reinforcement learning learns a policy that adapts control parameters of a regularized interior-point method, accelerating high-accuracy solutions for convex quadratic programs and generalizing across problem classes ...