pith. sign in

Fu, Zhiqiang Xie, Beidi Chen, Clark W

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

fields

cs.AI 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

PALS: Power-Aware LLM Serving for Mixture-of-Experts Models

cs.AI · 2026-05-20 · unverdicted · novelty 6.0

PALS adds dynamic GPU power capping to LLM serving frameworks like vLLM, jointly tuning it with batch size via offline models and feedback control to improve energy efficiency up to 26.3% and cut QoS violations 4-7x on dense and MoE models.

citing papers explorer

Showing 1 of 1 citing paper.

  • PALS: Power-Aware LLM Serving for Mixture-of-Experts Models cs.AI · 2026-05-20 · unverdicted · none · ref 36

    PALS adds dynamic GPU power capping to LLM serving frameworks like vLLM, jointly tuning it with batch size via offline models and feedback control to improve energy efficiency up to 26.3% and cut QoS violations 4-7x on dense and MoE models.