GELATO combines drift-plus-penalty Lyapunov control with generative entropy early exiting to adaptively offload tokens in device-edge speculative decoding, delivering higher throughput and lower energy use than prior distributed SD systems while preserving output quality.
Edge and terminal cooperation enabled LLM deployment optimization in wireless network
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.NI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
GELATO: Generative Entropy- and Lyapunov-based Adaptive Token Offloading for Device-Edge Speculative LLM Inference