RRTO: A High-Performance Transparent Offloading System for Model Inference in Mobile Edge Computing

Fangming Liu; Haoze Song; Heming Cui; Jun Luo; Wei Ni; Xiuxian Guan; Yuhao Qing; Zekai Sun; Zhe Chen; Zheng Lin

arxiv: 2507.21739 · v1 · pith:74ZZTH3Qnew · submitted 2025-07-29 · 💻 cs.NI

Zekai Sun, Xiuxian Guan, Zheng Lin, Yuhao Qing, Haoze Song, Zihan Fang, Zhe Chen, Fangming Liu, Heming Cui, Wei Ni, Jun Luo

keywords: rrto, transparent, offloading, code, compatibility, inference, mobile, non-transparent

Deploying Machine Learning (ML) applications on resource-constrained mobile devices remains challenging due to limited computational resources and poor platform compatibility. While Mobile Edge Computing (MEC) offers an offloading-based inference paradigm using GPU servers, existing approaches are divided into non-transparent and transparent methods, with the former necessitating modifications to the source code. Non-transparent offloading achieves high performance but requires intrusive code modification, limiting compatibility with diverse applications. Transparent offloading, in contrast, offers wide compatibility but introduces significant transmission delays due to per-operator remote procedure calls (RPCs). To overcome this limitation, we propose RRTO, the first high-performance transparent offloading system tailored for MEC inference. RRTO introduces a record/replay mechanism that leverages the static operator sequence in ML models to eliminate repetitive RPCs. To reliably identify this sequence, RRTO integrates a novel Operator Sequence Search algorithm that detects repeated patterns, filters initialization noise, and accelerates matching via a two-level strategy. Evaluation demonstrates that RRTO achieves substantial reductions of up to 98% in both per-inference latency and energy consumption compared to state-of-the-art transparent methods and yields results comparable to non-transparent approaches, all without necessitating any source code modification.
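The idea of finding a static operator sequence in a recorded trace can be illustrated with a small sketch. This is not the paper's Operator Sequence Search algorithm (the abstract gives no details of its two-level strategy); it is a simple tail-periodicity check, written under the assumption that a trace is a list of operator names and that initialization noise appears only as a prefix. The function name `find_repeated_sequence` and the parameter `min_repeats` are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def find_repeated_sequence(
    trace: Sequence[str], min_repeats: int = 2
) -> Optional[Tuple[int, int]]:
    """Return (start, period) of the shortest pattern that repeats to the end
    of `trace` at least `min_repeats` times, or None if there is none.

    Illustrative assumption: any non-repeating initialization operators sit
    at the front of the trace, so the periodic part is a suffix.
    """
    n = len(trace)
    for period in range(1, n // min_repeats + 1):
        # Walk backwards while each operator equals the one `period` steps earlier.
        i = n - 1
        while i >= period and trace[i] == trace[i - period]:
            i -= 1
        start = i - period + 1  # first index of the periodic suffix
        if n - start >= period * min_repeats:
            return start, period
    return None


# Hypothetical recorded trace: two initialization calls, then three inferences.
trace = ["init_a", "init_b"] + ["conv", "relu", "fc"] * 3
result = find_repeated_sequence(trace)
```

Once such a sequence is known, a record/replay system could in principle send one request per inference and replay the recorded operators on the server, rather than issuing one RPC per operator; how RRTO actually does this is described in the paper, not here.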

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Physically-Induced Atmospheric Adversarial Perturbations: Enhancing Transferability and Robustness in Remote Sensing Image Classification
cs.CV 2026-04 unverdicted novelty 7.0

FogFool creates fog-based adversarial perturbations using Perlin noise optimization to achieve high black-box transferability (83.74% TASR) and robustness to defenses in remote sensing classification.

Dual-Envelope Constrained Nonlinear MPC for Distributed Drive Electric Vehicles Drifting Under Bounded Steering and Direct Yaw-Moment Control
eess.SY 2026-04 unverdicted novelty 6.0

The extended dual-envelope NMPC enables smoother drifting convergence and cuts steady-state tracking errors in speed, sideslip angle, and yaw rate by 33%, 71%, and 31% respectively in hardware tests.

SL-FAC: A Communication-Efficient Split Learning Framework with Frequency-Aware Compression
cs.LG 2026-04 unverdicted novelty 6.0

SL-FAC reduces communication in split learning via frequency-aware compression of activations and gradients while aiming to preserve training-critical information.

SwarmSense-DNN: A Trustworthy and Decentralized Neural Framework for Proactive Anomaly Defense in Consumer IoT
cs.CR 2026-06 unverdicted novelty 3.0

SwarmSense-DNN is a proposed decentralized neural framework that integrates swarm intelligence with hierarchical federated learning and graph neural networks to achieve 95.44% anomaly detection accuracy and 67% reduce...