{"paper":{"title":"ROSE: Rollout On Serving GPUs via Cooperative Elasticity for Agentic RL","license":"http://creativecommons.org/licenses/by/4.0/","headline":"ROSE accelerates agentic RL by repurposing idle serving GPUs for rollouts while preserving SLOs.","cross_cats":[],"primary_cat":"cs.DC","authors_text":"Bo Zheng, Dakai An, Dilxat Muhtar, Jiamang Wang, Ju Huang, Lin Qu, Lunxi Cao, Shaopan Xiong, Siran Yang, Teng Ma, Tianyuan Wu, Wei Gao, Wei Wang, Weixun Wang, Xuchun Shang, Yuheng Zhao","submitted_at":"2026-05-07T16:33:40Z","abstract_excerpt":"Agentic reinforcement learning (RL) is reshaping LLM post-training, but end-to-end training time is dominated by compute-intensive, multi-turn rollouts whose resource demand varies significantly across training steps. Resource-fixed systems cannot adapt to this variation, while resource-elastic approaches that provision external GPUs on demand suffer from high allocation overhead and limited availability.\n  We observe that serving clusters leave substantial GPU compute and memory idle, and propose cooperative elasticity: sharing already-deployed serving GPUs with rollout workloads to provide o"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Experiments across multiple model sizes and cluster scales show that ROSE improves average end-to-end throughput by 1.20-3.31 x compared with state-of-the-art resource-fixed and elastic baselines.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"Production serving clusters routinely leave substantial GPU compute and memory headroom that can be safely repurposed for rollouts without violating serving SLOs under bursty traffic.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"ROSE delivers 1.2-3.3x higher end-to-end throughput for agentic RL by safely co-using underutilized serving GPUs for rollouts while meeting serving SLOs.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"ROSE accelerates agentic RL by repurposing idle serving GPUs for rollouts while preserving SLOs.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"a3f3fe3a81fff164a97db6c2bfd87352c9e6508b90e1a3a0b8a65976d125bb9b"},"source":{"id":"2605.06534","kind":"arxiv","version":2},"verdict":{"id":"738bef35-f2d5-4bf4-9548-f6eafcc817fd","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-08T05:09:19.878666Z","strongest_claim":"Experiments across multiple model sizes and cluster scales show that ROSE improves average end-to-end throughput by 1.20-3.31 x compared with state-of-the-art resource-fixed and elastic baselines.","one_line_summary":"ROSE delivers 1.2-3.3x higher end-to-end throughput for agentic RL by safely co-using underutilized serving GPUs for rollouts while meeting serving SLOs.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"Production serving clusters routinely leave substantial GPU compute and memory headroom that can be safely repurposed for rollouts without violating serving SLOs under bursty traffic.","pith_extraction_headline":"ROSE accelerates agentic RL by repurposing idle serving GPUs for rollouts while preserving SLOs."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2605.06534/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"claim_evidence","ran_at":"2026-05-20T12:22:03.856463Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"ai_meta_artifact","ran_at":"2026-05-20T07:39:51.131127Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_title_agreement","ran_at":"2026-05-19T18:01:19.608495Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-19T12:35:47.347012Z","status":"completed","version":"1.0.0","findings_count":0}],"snapshot_sha256":"d9ede38d9021c1a30d09a40c86fd6ce4e6d6dc3baac1519cc8d6f74d686121d0"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}