ConfigSpec shows that optimal configurations for speculative LLM inference conflict across goodput (favoring smallest drafters at device-specific K=2-10), cost (favoring largest drafters at K=2), and energy (favoring smallest drafters at K=2), requiring profiling-based selection instead of fixed or
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.DC (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
ConfigSpec: Profiling-Based Configuration Selection for Distributed Edge–Cloud Speculative LLM Serving