SuperInfer improves time-to-first-token (TTFT) SLO attainment by up to 74.7% on GH200 Superchips via SLO-aware rotary scheduling (RotaSched) and full-duplex KV cache rotation (DuplexKV) over NVLink-C2C, while preserving time-between-tokens (TBT) and throughput.
[Online; accessed 2025-10-27]
2 Pith papers cite this work.
representative citing papers
AdaFair-MARL enforces workload fairness as an explicit second-order cone constraint in cooperative MARL via adaptive primal-dual optimization, achieving near-perfect constraint satisfaction while preserving team performance.
citing papers explorer
- SuperInfer: SLO-Aware Rotary Scheduling and Memory Management for LLM Inference on Superchips
- AdaFair-MARL: Enforcing Adaptive Fairness Constraints in Multi-Agent Reinforcement Learning