SCOOT: SLO-Oriented Performance Tuning for LLM Inference Engines

Ke Cheng, Zhi Wang, Wen Hu, Tiannuo Yang, Jianguo Li, Sheng Zhang · 2025

1 Pith paper cites this work. Polarity classification is still indexing.

Representative citing papers

cs.DC · 2026-05-06 · unverdicted · novelty 6.0

Nitsum dynamically adapts tensor parallelism and GPU splits in LLM serving to raise SLO-compliant goodput by up to 5.3 times over prior systems.

Nitsum: Serving Tiered LLM Requests with Adaptive Tensor Parallelism · cs.DC · 2026-05-06 · unverdicted · none · ref 17