Neurosurgeon: Collaborative intelligence between the cloud and mobile edge
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Citing papers
- WISV: Wireless-Informed Semantic Verification for Distributed Speculative Decoding in Device-Edge LLM Inference
  WISV uses a channel-aware semantic acceptance policy on hidden representations to boost accepted sequence length by up to 60.8% and cut interaction rounds by 37.3% in distributed speculative decoding, with under 1% accuracy loss.
- Design Insights into Partition Placement and Routing for DNN Inference in Multi-Hop Edge Networks
  Joint placement and routing for fixed-partition DNN inference over multi-hop edge networks is addressed with an alternating optimization framework that shows split flexibility matters most in IoT-edge-cloud settings and congestion awareness helps as load increases.
- Cloud Is Closer Than It Appears: Revisiting the Tradeoffs of Distributed Real-Time Inference
  Cloud inference can match or exceed on-device performance for latency-sensitive control in distributed CPS when high-throughput resources amortize network and queueing delays.