Neurosurgeon: Collaborative intelligence between the cloud and mobile edge

2017

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

WISV: Wireless-Informed Semantic Verification for Distributed Speculative Decoding in Device-Edge LLM Inference

cs.IT · 2026-04-20 · unverdicted · novelty 7.0

WISV uses a channel-aware semantic acceptance policy on hidden representations to boost accepted sequence length by up to 60.8% and cut interaction rounds by 37.3% in distributed speculative decoding, with under 1% accuracy loss.

Design Insights into Partition Placement and Routing for DNN Inference in Multi-Hop Edge Networks

cs.NI · 2026-04-28 · unverdicted · novelty 5.0

Joint placement and routing for fixed-partition DNN inference over multi-hop edge networks is addressed with an alternating optimization framework that shows split flexibility matters most in IoT-edge-cloud settings and congestion awareness helps as load increases.

Cloud Is Closer Than It Appears: Revisiting the Tradeoffs of Distributed Real-Time Inference

cs.LG · 2026-02-17 · unverdicted · novelty 5.0

Cloud inference can match or exceed on-device performance for latency-sensitive control in distributed CPS when high-throughput resources amortize network and queueing delays.

citing papers explorer

Showing 3 of 3 citing papers.

WISV: Wireless-Informed Semantic Verification for Distributed Speculative Decoding in Device-Edge LLM Inference
cs.IT · 2026-04-20 · unverdicted · none · ref 14
Design Insights into Partition Placement and Routing for DNN Inference in Multi-Hop Edge Networks
cs.NI · 2026-04-28 · unverdicted · none · ref 2
Cloud Is Closer Than It Appears: Revisiting the Tradeoffs of Distributed Real-Time Inference
cs.LG · 2026-02-17 · unverdicted · none · ref 19
