Fast inference from transformers via speculative decoding
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.IT (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
CR^2: Cost-Aware Risk-Controlled Routing for Wireless Device-Edge LLM Inference
CR^2 matches full-information routing performance for device-edge LLM inference using only device-side signals and cuts normalized deployment cost by up to 16.9% at matched accuracy.