
Beehive: Sub-second elasticity for web services with semi-FaaS execution

4 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

years: 2026 (3), 2025 (1)

roles: background (1)

polarities: background (1)

representative citing papers

PALS: Power-Aware LLM Serving for Mixture-of-Experts Models

cs.AI · 2026-05-20 · unverdicted · novelty 6.0

PALS adds dynamic GPU power capping to LLM serving frameworks such as vLLM, tuning the power cap jointly with batch size through offline models and feedback control. It improves energy efficiency by up to 26.3% and cuts QoS violations by 4-7x on dense and MoE models.
