Hybrid Edge-HPC Systems for Low-Latency Data-Driven Inference

Alan Subedi; Andre Merzky; Avhishek Biswas; Benjamin Carter; Chandra Krintz; Douglas Thain; Liubov Kurafeeva; Memet Can Vuran; Michael Fay; Rich Wolski; Ryan Hartung; Shantenu Jha

arxiv: 2605.20532 · v2 · pith:PITKQT6Dnew · submitted 2026-05-19 · 💻 cs.DC

Liubov Kurafeeva, Ryan Hartung, Benjamin Carter, Alan Subedi, Avhishek Biswas, Michael Fay, Shantenu Jha, Chandra Krintz, Andre Merzky, Douglas Thain, Memet Can Vuran, Rich Wolski

Pith reviewed 2026-05-21 06:11 UTC · model grok-4.3

classification 💻 cs.DC

keywords: hybrid edge-HPC · low-latency inference · surrogate models · asynchronous updates · computational fluid dynamics · digital agriculture · model fidelity · reverse backfill

The pith

Reverse Backfill (RBF) enables continuous low-latency inference at the edge while model accuracy improves asynchronously from HPC simulations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a hybrid system for applications that need fast responses from live sensor data but depend on slow high-performance simulations to build accurate models. RBF deploys simple surrogate models on edge devices to handle immediate inference and brings in better versions from remote HPC as they become available. The authors demonstrate this in a digital agriculture deployment that couples edge sensing with CFD simulations to infer airflow in a large screenhouse. The approach keeps the system responsive and lets accuracy rise over time even when updates arrive irregularly because of HPC scheduling. A sympathetic reader would care because many cyber-physical systems face this same tension between speed and model quality.

Core claim

RBF (Reverse Backfill) decouples low-latency inference from simulation-driven training by running lightweight surrogate models at the edge and asynchronously incorporating improved models from HPC backfill computations, allowing continuous operation and progressive fidelity gains despite delayed and irregular model updates in simulation-bounded settings.

What carries the argument

The Reverse Backfill (RBF) architecture carries the argument. It reinterprets opportunistic HPC backfilling as a way to improve model accuracy rather than system utilization, and it relies on pluggable surrogate models and asynchronous updates across edge-to-HPC infrastructure.

If this is right

Continuous low-latency inference persists even with HPC scheduling delays and irregular model updates.
Model fidelity improves over time as higher-accuracy versions from simulations become available.
The system orchestrates computation across edge devices, private 5G, cloud, and HPC resources.
Pluggable surrogate models support adaptation to different simulation-driven inference tasks.
Evaluation quantifies the impact of delayed updates on prediction accuracy in a CFD-based airflow inference application.
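
The first two points above describe an edge node that never waits for HPC: it serves whichever model is active and swaps in a better one whenever it arrives. A minimal sketch of that pattern follows. `ModelSlot`, `coarse`, and `refined` are hypothetical names invented for illustration, and the integer version check is an assumption, not a mechanism described in the paper.

```python
import threading
from typing import Callable, Sequence

# Hypothetical type: a surrogate maps a sensor reading to a prediction.
Surrogate = Callable[[Sequence[float]], float]


class ModelSlot:
    """Holds the surrogate currently used for edge inference."""

    def __init__(self, model: Surrogate, version: int = 0) -> None:
        self._lock = threading.Lock()
        self._model = model
        self._version = version

    def publish(self, model: Surrogate, version: int) -> bool:
        """Install a model only if it is newer than the active one."""
        with self._lock:
            if version <= self._version:
                return False  # stale or duplicate update, ignore it
            self._model, self._version = model, version
            return True

    def infer(self, reading: Sequence[float]) -> tuple[float, int]:
        """Run inference with whichever model is active right now."""
        with self._lock:
            model, version = self._model, self._version
        # The model call happens outside the lock so updates are not blocked.
        return model(reading), version


def coarse(reading: Sequence[float]) -> float:
    """Placeholder low-fidelity surrogate (assumption)."""
    return sum(reading) / len(reading)


def refined(reading: Sequence[float]) -> float:
    """Placeholder higher-fidelity surrogate (assumption)."""
    return 0.9 * sum(reading) / len(reading)


slot = ModelSlot(coarse, version=1)
first = slot.infer([1.0, 3.0])        # served by version 1
accepted = slot.publish(refined, 2)   # an improved model arrives
late = slot.publish(coarse, 1)        # an older model arrives late
second = slot.infer([1.0, 3.0])       # served by version 2
```

Copying the model reference under the lock and calling it afterwards keeps inference latency independent of how long a publish takes, which is the property the list above depends on.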

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This method could extend to other domains requiring real-time physical process modeling with slow update cycles, such as environmental monitoring or industrial control.
It offers a practical way to leverage backfill opportunities on HPC systems specifically for enhancing model quality instead of just filling idle time.
Testable extensions include measuring the optimal frequency of surrogate refreshes based on application tolerance for accuracy drift.
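
The last extension can be made concrete with a back-of-the-envelope calculation. The sketch below assumes error grows linearly with time since the last update, which is an illustrative simplification and not a result from the paper; the function name and all numbers are hypothetical.

```python
def max_staleness_s(base_error: float, drift_per_s: float, tolerance: float) -> float:
    """Longest time a surrogate may go without a refresh under linear drift.

    Assumes error(t) = base_error + drift_per_s * t, with errors in the
    same unit as the tolerance (for example, m/s for airflow speed).
    """
    if tolerance < base_error:
        return 0.0  # already outside tolerance at deployment time
    if drift_per_s <= 0:
        return float("inf")  # no drift: a refresh is never forced
    return (tolerance - base_error) / drift_per_s


# Illustrative numbers only, not taken from the paper.
budget = max_staleness_s(base_error=0.10, drift_per_s=0.0001, tolerance=0.25)
```

With these made-up inputs the budget comes to about 1500 seconds. A real study would replace the linear model with the measured decay curve for each surrogate.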

Load-bearing premise

Lightweight surrogate models at the edge maintain usable accuracy for the application while waiting for asynchronous updates from HPC simulations.

What would settle it

The central claim would be falsified if prediction accuracy for airflow patterns in the agricultural screenhouse fell below acceptable levels during the longest delays between model updates.

Figures

Figures reproduced from arXiv: 2605.20532 by Alan Subedi, Andre Merzky, Avhishek Biswas, Benjamin Carter, Chandra Krintz, Douglas Thain, Liubov Kurafeeva, Memet Can Vuran, Michael Fay, Rich Wolski, Ryan Hartung, Shantenu Jha.

**Figure 1.** RBF System Architecture. The Reverse Backfill architecture has three tiers. (Left) At the remote facility, sensor data is collected and conveyed over a private wireless network to a fault-resilient distributed log using a pub/sub protocol. When available, published models are downloaded and used for rapid inference in place of simulation. (Middle) A dedicated, but resource-limited, cluster pulls the publish…

**Figure 2.** Timeline of the RBF instantiation illustrating asynchronous, simulation-driven model updates. Passive data collection (pdc), simulation (sim), and training (train) stages overlap across multiple pipeline instances, while model updates are published opportunistically upon completion. This design enables continuous inference despite irregular and delayed HPC execution.
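
The overlap described in the Figure 2 caption can be imitated with a few concurrent tasks. This is a toy sketch under stated assumptions: `asyncio.sleep` stands in for the real pdc, sim, and train stages, and the stage durations are invented so that one instance has a slow simulation.

```python
import asyncio


async def pipeline(instance: int, durations: tuple[float, float, float],
                   published: list[int]) -> None:
    """One pipeline instance: pdc -> sim -> train, then publish."""
    for _stage, seconds in zip(("pdc", "sim", "train"), durations):
        await asyncio.sleep(seconds)  # stand-in for the real stage
    published.append(instance)        # publish as soon as training ends


async def main() -> list[int]:
    published: list[int] = []
    # Hypothetical stage durations in seconds; instance 0 has a slow simulation.
    jobs = [
        pipeline(0, (0.02, 0.18, 0.02), published),
        pipeline(1, (0.02, 0.04, 0.02), published),
        pipeline(2, (0.02, 0.08, 0.02), published),
    ]
    await asyncio.gather(*jobs)  # all instances overlap in time
    return published


order = asyncio.run(main())
```

Because the instances run concurrently, models are published in completion order rather than submission order, so the consumer has to tolerate updates that arrive late or out of sequence.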

**Figure 3.** Model accuracy decay over time using different history windows for all three models (PINN, FNO, PCR). The x-axis shows elapsed time ranging from …

**Figure 4.** Timeline of model publish events for all three model types (PINN, FNO, PCR) during a simultaneous live experiment on two resources: the …

**Figure 5.** P-95 Model Transfer Time of 100 runs. Transfer time is in seconds.
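
For readers unfamiliar with the statistic in Figure 5, here is one common way to compute a 95th percentile over a set of runs. The nearest-rank method is an assumption; the caption does not say which percentile definition the authors used, and the sample data is synthetic.

```python
def p95(samples: list[float]) -> float:
    """95th percentile by the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Integer form of ceil(0.95 * n), which avoids floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100  # 1-based rank
    return ordered[rank - 1]


# Synthetic transfer times in seconds, not measurements from the paper.
transfer_times = [float(i) for i in range(1, 101)]  # 1.0 .. 100.0
value = p95(transfer_times)
```

Under nearest rank with 100 runs, the P-95 is the 95th smallest observation, so the five slowest transfers do not affect it.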

Original abstract

Emerging cyber-physical systems increasingly require low-latency inference from streaming sensor data while maintaining models that reflect complex and evolving physical processes. In many domains, however, model updates depend on high-fidelity simulations and training executed on remote high-performance computing (HPC) systems under batch scheduling. This creates a fundamental mismatch between the responsiveness required at the edge and the cost, throughput, and availability of simulation-driven model updates. We present RBF (Reverse Backfill), a hybrid edge-HPC learning and inference architecture that integrates low-latency edge inference with asynchronous, simulation-driven model improvement. RBF targets simulation-bounded settings in which model updates are constrained by simulation throughput and HPC scheduling delays, and reinterprets HPC backfilling by using opportunistic computation to improve model accuracy rather than system utilization. RBF decouples inference from simulation and training by deploying lightweight surrogate models at the edge while incorporating improved models asynchronously as they become available. The architecture supports pluggable surrogate models and orchestrates computation across heterogeneous infrastructure spanning edge devices, private 5G, cloud, and HPC resources. We instantiate RBF using a real-world digital agriculture deployment that couples edge sensing with computational fluid dynamics (CFD) simulations to infer airflow patterns in a large agricultural screenhouse. Our evaluation characterizes end-to-end system behavior under realistic constraints, quantifying simulation latency, training cost, inference throughput, and the impact of delayed model updates on prediction accuracy. Results demonstrate that RBF enables continuous, low-latency inference while improving model fidelity over time despite delayed and irregular model updates.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

RBF gives a workable pattern for low-latency edge inference with async HPC model updates in sim-heavy settings like digital ag CFD, but the evaluation stops short of showing how long surrogates stay usable under irregular delays.

The main thing to know is that this paper describes RBF, a hybrid architecture that runs fast surrogate inference at the edge while pulling in improved models from HPC simulations on an irregular schedule. They apply it to a screenhouse airflow inference task that couples sensors with CFD simulations, and the deployment gives the claims some concrete grounding rather than staying purely architectural.

Referee Report

1 major / 2 minor

Summary. The paper introduces RBF (Reverse Backfill), a hybrid edge-HPC architecture that decouples low-latency inference from simulation-driven model updates by deploying lightweight surrogate models at the edge while asynchronously incorporating improved models from HPC resources. It targets simulation-bounded cyber-physical systems and is evaluated in a real-world digital agriculture deployment that uses edge sensing and CFD simulations to infer airflow patterns in a screenhouse, with measurements of simulation latency, training cost, inference throughput, and accuracy effects from delayed updates.

Significance. If the evaluation demonstrates that surrogate accuracy remains within application-specific limits under irregular HPC update delays, the result would be significant for practical hybrid systems in domains requiring continuous inference with evolving physical models. The real-world deployment and quantification of end-to-end metrics (latency, throughput, accuracy impact) are strengths that ground the architecture beyond abstract claims.

major comments (1)

The central claim that RBF enables continuous low-latency inference while improving model fidelity over time despite delayed and irregular updates is load-bearing on the assumption that lightweight edge surrogates maintain usable accuracy for the CFD airflow task. The abstract states that the evaluation quantifies the impact of delayed model updates on prediction accuracy, yet the manuscript provides no explicit tolerance threshold (e.g., maximum allowable RMSE or percentage error) for the screenhouse application and no analysis or plot of prediction error versus wall-clock delay. Without this, it is not possible to verify that degradation stays within acceptable limits rather than merely improving over time.

minor comments (2)

The architecture section would benefit from a clear diagram illustrating the data and model flow across edge, private 5G, cloud, and HPC components to aid reader understanding of the orchestration.
Clarify how surrogate model pluggability is implemented and whether any specific constraints apply when swapping models in the deployed CFD inference pipeline.
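
To illustrate what an answer to the second minor comment might look like, the sketch below shows one way pluggability could be expressed: a small structural interface that any surrogate satisfies. The `Surrogate` protocol, both baseline classes, and `run_inference` are hypothetical; the paper's actual interface for its PINN, FNO, and PCR models is not described here.

```python
from typing import Protocol, Sequence


class Surrogate(Protocol):
    """Hypothetical contract every pluggable surrogate would satisfy."""

    name: str

    def predict(self, reading: Sequence[float]) -> float: ...


class MeanBaseline:
    """Placeholder model: predicts the mean of the sensor reading."""

    name = "mean-baseline"

    def predict(self, reading: Sequence[float]) -> float:
        return sum(reading) / len(reading)


class ScaledBaseline:
    """Placeholder model: predicts a scaled mean of the sensor reading."""

    def __init__(self, scale: float) -> None:
        self.name = f"scaled-{scale}"
        self.scale = scale

    def predict(self, reading: Sequence[float]) -> float:
        return self.scale * sum(reading) / len(reading)


def run_inference(model: Surrogate, reading: Sequence[float]) -> tuple[str, float]:
    """The inference path depends only on the protocol, not on the model class."""
    return model.name, model.predict(reading)


a = run_inference(MeanBaseline(), [2.0, 4.0])
b = run_inference(ScaledBaseline(2.0), [2.0, 4.0])
```

Any constraint on swapping models, such as a fixed input shape or output unit, would belong in this contract, which is the detail the comment asks the authors to spell out.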

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive review and for identifying an important gap in how we present the accuracy results. We address the major comment below and will revise the manuscript accordingly.

Referee: The central claim that RBF enables continuous low-latency inference while improving model fidelity over time despite delayed and irregular updates is load-bearing on the assumption that lightweight edge surrogates maintain usable accuracy for the CFD airflow task. The abstract states that the evaluation quantifies the impact of delayed model updates on prediction accuracy, yet the manuscript provides no explicit tolerance threshold (e.g., maximum allowable RMSE or percentage error) for the screenhouse application and no analysis or plot of prediction error versus wall-clock delay. Without this, it is not possible to verify that degradation stays within acceptable limits rather than merely improving over time.

Authors: We agree that the manuscript would be strengthened by an explicit, application-derived tolerance threshold and by a direct plot of prediction error versus wall-clock delay. While the evaluation already quantifies accuracy changes under delayed updates and shows progressive improvement, it does not define a screenhouse-specific bound (e.g., maximum RMSE tolerable for practical airflow-based climate control) nor isolate error as a function of update latency. In the revised manuscript we will add (1) a short subsection stating the tolerance threshold justified by the digital-agriculture use case and (2) a new figure plotting RMSE against wall-clock time since the last model update. These additions will make the claim that accuracy remains usable under irregular HPC delays directly verifiable. revision: yes
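
The analysis promised in the rebuttal, RMSE as a function of wall-clock time since the last model update checked against a tolerance, could be computed along these lines. This is a sketch under stated assumptions: the record layout, the 60-second bins, the tolerance of 0.4, and the data are all invented for illustration.

```python
import math
from collections import defaultdict
from typing import Iterable


def rmse_by_staleness(records: Iterable[tuple[float, float, float]],
                      bin_s: float) -> dict[float, float]:
    """Group (seconds_since_update, predicted, observed) records into
    staleness bins and return the RMSE of each bin, keyed by bin start."""
    squared: dict[float, list[float]] = defaultdict(list)
    for staleness, predicted, observed in records:
        bin_start = int(staleness // bin_s) * bin_s
        squared[bin_start].append((predicted - observed) ** 2)
    return {b: math.sqrt(sum(v) / len(v)) for b, v in sorted(squared.items())}


def bins_over_tolerance(curve: dict[float, float], tolerance: float) -> list[float]:
    """Return the staleness bins whose RMSE exceeds the tolerance."""
    return [b for b, rmse in curve.items() if rmse > tolerance]


# Synthetic records, not data from the paper.
records = [
    (10.0, 1.0, 1.0), (50.0, 1.0, 1.1),    # 0 to 60 s since update
    (70.0, 1.0, 1.3), (110.0, 1.0, 0.7),   # 60 to 120 s
    (130.0, 1.0, 1.5), (170.0, 1.0, 0.5),  # 120 to 180 s
]
curve = rmse_by_staleness(records, bin_s=60.0)
violations = bins_over_tolerance(curve, tolerance=0.4)
```

An empty `violations` list over the longest observed update gap would support the usability claim; a non-empty one would identify the delay at which the surrogate stops being acceptable.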

Circularity Check

0 steps flagged

No circularity: architecture and evaluation grounded in described deployment

The manuscript describes the RBF hybrid edge-HPC architecture, its decoupling of edge inference from asynchronous HPC updates, and an instantiation on a real digital-agriculture CFD screenhouse deployment. Evaluation reports measured quantities (simulation latency, training cost, inference throughput, impact of delayed updates) obtained from that deployment. No equations, fitted parameters, or self-citations are used to derive the central claims; the results are empirical characterizations of the implemented system rather than reductions to inputs by construction. The paper is therefore self-contained against external benchmarks and receives score 0.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The paper rests on the domain assumption that model updates are constrained by simulation throughput and HPC scheduling delays, and introduces the RBF system design without explicit free parameters or new physical entities in the abstract.

axioms (1)

Domain assumption: Model updates depend on high-fidelity simulations and training executed on remote HPC systems under batch scheduling, creating a mismatch with edge responsiveness requirements.
Stated directly in the abstract as the fundamental problem RBF addresses.

invented entities (1)

RBF (Reverse Backfill) architecture (no independent evidence)
purpose: Integrates low-latency edge inference with asynchronous simulation-driven model improvement using pluggable surrogate models.
New system design presented in the paper to solve the stated mismatch.

pith-pipeline@v0.9.0 · 5852 in / 1396 out tokens · 50336 ms · 2026-05-21T06:11:27.263203+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "RBF decouples inference from simulation and training by deploying lightweight surrogate models at the edge while incorporating improved models asynchronously as they become available."

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · J_uniquely_calibrated_via_higher_derivative · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "Figure 3 displays the decay of model accuracy over time... mean absolute error (MAE) in meters per second"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.