pith. sign in

arxiv: 2507.13833 · v4 · pith:4RVOZWSSnew · submitted 2025-07-18 · 💻 cs.DC

DistFlow: A Fully Distributed RL Framework for Scalable and Efficient LLM Post-Training

classification 💻 cs.DC
keywords datadistflowcontroldistributedarchitecturecommunicationefficientexecution
0
0 comments X
read the original abstract

Effectively scaling Reinforcement Learning (RL) is crucial for enhancing the reasoning and alignment of Large Language Models. The massive data and complex execution flows inherent in these tasks require a distributed architecture capable of efficient scaling. However, to simplify programming and dependency management, mainstream frameworks often rely on a centralized architecture where a single node dispatches both control and data. This inherent coupling creates significant communication bottlenecks, severely limiting system scalability and efficiency. We present DISTFLOW, a novel, fully distributed RL framework that adopts a multi-controller paradigm. By decoupling data transmission from control dispatch, DISTFLOW establishes a parallelism-aware, decentralized Data Coordinator that leverages local caching, load balancing, and asynchronous double buffer to minimize communication overhead and mitigate straggler effects. For control logic, it introduces a task scheduler built upon Directed Acyclic Graph (DAG) that facilitates fine-grained, independent execution. Experimental results demonstrate that DISTFLOW achieves near-linear scalability up to 512 GPUs and delivers up to a 2.63x throughput improvement over state-of-the-art (SOTA) frameworks. The source code is available at: https://github.com/sii-research/siiRL.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.