Negative drift ensures queue stability, preventing unbounded growth that would lead to eviction

Expected Queue Length · 1962

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints

cs.LG · 2025-04-15 · unverdicted · novelty 6.0 · ref 10

The paper develops fluid-guided online scheduling algorithms (WAIT and Nested WAIT) for LLM inference that handle endogenous KV-cache memory growth and improve stability and latency over baselines in simulations.