HybridFlow: A flexible and efficient RLHF framework. EuroSys 2025 (30/03/2025-03/04/2025, Rotterdam)

Wu, C · 2025

1 Pith paper cites this work. Polarity classification is still being indexed.


representative citing papers

Valve: Production Online-Offline Inference Colocation with Jointly-Bounded Preemption Latency and Rate

cs.OS · 2026-04-09 · unverdicted · novelty 6.0

Valve jointly bounds preemption latency and rate for online-offline LLM colocation on GPUs, delivering 34.6% higher cluster utilization and a 2,170-GPU saving in a production deployment of 8,054 GPUs with under 5% TTFT and 2% TPOT impact.

citing papers explorer

Showing 1 of 1 citing paper.

Valve: Production Online-Offline Inference Colocation with Jointly-Bounded Preemption Latency and Rate

cs.OS · 2026-04-09 · unverdicted · none · ref 10
