Active Inference-Based Adaptive Routing for Heterogeneous Edge AI Services

Boris Sedlak; Schahram Dustdar; Zihang Wang

arxiv: 2604.17373 · v1 · submitted 2026-04-19 · 💻 cs.DC · cs.ET· cs.PF

Active Inference-Based Adaptive Routing for Heterogeneous Edge AI Services

Zihang Wang , Boris Sedlak , Schahram Dustdar This is my paper

Pith reviewed 2026-05-10 05:56 UTC · model grok-4.3

classification 💻 cs.DC cs.ETcs.PF

keywords active inferenceedge computingadaptive routingAI service orchestrationBayesian inferenceexpected free energyonline learningheterogeneous services

0 comments

The pith

Active inference guides routing for edge AI services by minimizing expected free energy from real-time metrics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces AIF-Router, a framework that applies active inference to adaptively route AI services across heterogeneous edge and cloud environments. It shows that by performing real-time Bayesian state inference and minimizing expected free energy, the system can balance latency, throughput, and resource use without any offline training phase. A sympathetic reader would care because traditional orchestration methods often require pre-training or struggle with the variability and unreliability of edge devices, potentially enabling more robust self-managing systems for distributed AI inference.

Core claim

AIF-Router performs Bayesian state inference and expected free energy minimization to guide routing decisions based on observability-driven real-time metrics, resulting in stable online learning behavior for adaptive AI service orchestration in unreliable edge environments.

What carries the argument

The expected free energy minimization process within the active inference framework, which selects routing actions by evaluating the trade-off between information gain and preferred outcomes based on inferred states.

Load-bearing premise

That real-time Bayesian state inference and expected free energy minimization can be performed reliably on unstable edge nodes using only observability-driven metrics without offline training or additional safeguards.

What would settle it

Running AIF-Router on physical edge hardware with induced instability, such as random node failures or metric noise, and measuring whether the learning curve remains stable and outperforms non-adaptive routing in terms of service quality metrics.

Figures

Figures reproduced from arXiv: 2604.17373 by Boris Sedlak, Schahram Dustdar, Zihang Wang.

**Figure 1.** Figure 1: AIF-Router control flow with Bayesian state inference, action selection, and multi-tier request dispatching. As shown in [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗

**Figure 2.** Figure 2: P50 latency comparison. AIF-Router achieves 34.7% lower median latency (2003 ms vs. 3067 ms, p < 0.0001) [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗

**Figure 3.** Figure 3: Tier allocation comparison. AIF-Router learns to allocate more requests to the heavy tier (46% vs 38%) after observing performance feedback, while experiencing higher failure rates on unstable edge devices. Key Findings: – Latency-reliability tradeoff. As illustrated in [PITH_FULL_IMAGE:figures/full_fig_p009_3.png] view at source ↗

read the original abstract

Edge computing enables AI inference closer to data sources, reducing latency and bandwidth costs. However, orchestrating AI services across the cloud-edge continuum remains challenging due to dynamic workloads and infrastructure variability. We present AIF-Router, an Active Inference--based routing framework that autonomously learns to balance latency, throughput, and resource utilization across multi-tier AI services without offline training. AIF-Router performs Bayesian state inference and expected free energy minimization to guide routing decisions based on observability-driven real-time metrics. Despite device instability on edge nodes, AIF-Router exhibits stable online learning behavior and demonstrates the feasibility of applying Active Inference for adaptive AI service orchestration in unreliable edge environments. Our findings highlight both the promise and practical challenges of deploying self-adaptive decision-making frameworks for real-world edge AI systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

AIF-Router applies active inference to edge routing decisions but the stability and feasibility claims lack any supporting measurements or comparisons.

read the letter

The core contribution is a routing framework called AIF-Router that uses Bayesian state inference and expected free energy minimization to adaptively place AI services across cloud-edge tiers. It avoids offline training and instead updates beliefs from live observability metrics, which is a direct application of active inference to a concrete orchestration problem. That combination has not appeared in the routing literature before, and the paper does a clean job laying out how the free-energy objective can trade off latency, throughput, and resource use under uncertainty. The framing of edge instability as a source of partial observability is also reasonable and matches real deployment conditions. What the work does well is keep the method parameter-light and online; it does not rely on pre-collected traces or supervised labels. That is a genuine difference from most reinforcement-learning or heuristic routers. The soft spots are exactly where the stress-test note points. The abstract and the central claims assert stable online behavior and practical feasibility, yet no latency numbers for the inference step, no state-space sizes, no message-passing schedules, and no head-to-head results against baselines appear. Without those, it is impossible to judge whether belief updates stay inside inter-arrival times or survive node failures. The assumption that real-time EFE minimization runs reliably on unstable hardware therefore remains untested in the write-up. This paper is aimed at researchers already comfortable with active inference who want to see it tried on distributed systems problems. A reader looking for validated performance gains or reproducible code will find little to take away. It is coherent on its own terms and engages the literature honestly, so it clears the bar for serious refereeing. I would send it out so the authors can supply the missing timing and comparison data; the idea itself is worth checking.

Referee Report

3 major / 1 minor

Summary. The paper introduces AIF-Router, an Active Inference-based routing framework for orchestrating heterogeneous AI services across the cloud-edge continuum. It claims to enable autonomous adaptation by performing real-time Bayesian state inference and expected free energy minimization on observability-driven metrics, balancing latency, throughput, and resource utilization without offline training. The central result is that the framework exhibits stable online learning behavior and demonstrates the feasibility of active inference for adaptive AI service orchestration despite device instability on edge nodes.

Significance. If substantiated, the work would provide a principled, self-adaptive alternative to heuristic or supervised routing methods in dynamic edge environments, potentially influencing designs for autonomous orchestration in unreliable distributed AI systems. It directly addresses practical challenges of workload variability and infrastructure instability while highlighting deployment challenges for active inference frameworks.

major comments (3)

Abstract: The claim that AIF-Router 'exhibits stable online learning behavior' and 'demonstrates the feasibility' is unsupported by any quantitative results, error bars, baseline comparisons, performance traces, or implementation details, making verification of the central claim impossible.
The manuscript provides no complexity bounds, state-space dimensionality, variational inference approximation details, message-passing schedule, or per-decision latency measurements for the real-time Bayesian state inference and expected free energy minimization steps. These omissions leave the computational tractability and stability under device instability and variable workloads unverified.
No description of experiments, workloads, failure models, or robustness metrics is supplied to test behavior under partial observability from node failures, which is required to substantiate the 'stable online learning' result given the stress-test concern about inference reliability on unstable nodes.

minor comments (1)

The abstract would benefit from explicit definition of the observability-driven metrics and the precise form of the expected free energy used for policy selection.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thorough and constructive review. The comments highlight important gaps in empirical support, technical specifications, and experimental details that we agree must be addressed to substantiate the central claims. We will perform a major revision incorporating quantitative results, implementation specifics, and experimental descriptions. Point-by-point responses follow.

read point-by-point responses

Referee: Abstract: The claim that AIF-Router 'exhibits stable online learning behavior' and 'demonstrates the feasibility' is unsupported by any quantitative results, error bars, baseline comparisons, performance traces, or implementation details, making verification of the central claim impossible.

Authors: We agree that the abstract's claims require explicit empirical backing. In the revised manuscript, we will modify the abstract to reference specific quantitative outcomes from our evaluations, such as convergence rates of expected free energy with standard deviations, latency/throughput improvements over baselines (e.g., round-robin and load-balancing heuristics), and traces of online adaptation under varying workloads. Implementation details, including the observability metric collection pipeline, will be summarized with pointers to the full experimental section. revision: yes
Referee: The manuscript provides no complexity bounds, state-space dimensionality, variational inference approximation details, message-passing schedule, or per-decision latency measurements for the real-time Bayesian state inference and expected free energy minimization steps. These omissions leave the computational tractability and stability under device instability and variable workloads unverified.

Authors: We acknowledge these omissions limit assessment of real-time feasibility. The revision will add a dedicated subsection on the active inference implementation: state-space dimensionality (defined over latency, throughput, CPU/memory utilization, and node availability variables), variational approximations (mean-field variational inference with factorized posteriors), message-passing schedule (loopy belief propagation with fixed iteration limits), asymptotic complexity (linear in the number of edge nodes per decision), and empirical per-decision latencies measured on representative hardware. These additions will directly address tractability under instability. revision: yes
Referee: No description of experiments, workloads, failure models, or robustness metrics is supplied to test behavior under partial observability from node failures, which is required to substantiate the 'stable online learning' result given the stress-test concern about inference reliability on unstable nodes.

Authors: We concur that the lack of experimental methodology prevents verification of stability claims. We will expand the manuscript with a full experimental evaluation section detailing: workloads (heterogeneous AI services with Poisson arrivals and varying model sizes), failure models (random node crashes inducing partial observability with configurable failure rates), robustness metrics (free energy variance, routing decision stability, SLA violation rates, and online learning convergence under stress), and results from simulated and emulated edge environments. This will provide the required evidence for stable behavior despite device instability. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper applies established active inference (Bayesian inference + expected free energy minimization) drawn from prior literature to an edge routing task and reports empirical stability results. No equations, self-citations, or parameter-fitting steps are shown that reduce the claimed 'stable online learning' outcome to a tautology or to the same fitted data used for validation. The derivation chain remains independent of its inputs and relies on external active-inference foundations plus new experimental observations.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available, so the ledger is necessarily incomplete. The framework rests on standard active-inference assumptions (Bayesian updating and free-energy minimization) plus domain assumptions about observability of edge metrics; no explicit free parameters or invented entities are named.

pith-pipeline@v0.9.0 · 5431 in / 1084 out tokens · 43257 ms · 2026-05-10T05:56:06.110931+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

28 extracted references · 28 canonical work pages

[1]

Machine Learning47(2–3), 235–256 (2002)

Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Machine Learning47(2–3), 235–256 (2002)

work page 2002
[2]

ACM Queue14(1), 70–93 (2016)

Burns, B., Grant, B., Oppenheimer, D., Brewer, E., Wilkes, J.: Borg, omega, and kubernetes: Lessons learned from three container-management systems over a decade. ACM Queue14(1), 70–93 (2016)

work page 2016
[3]

Chen, Y., Alspaugh, S., Katz, R.: Interactive analytical processing in big data systems:Across-industry studyofmapreduceworkloads.ProceedingsoftheVLDB Endowment5(12), 1802–1813 (2012)

work page 2012
[4]

In: Proceedings of the ACM Symposium on Cloud Computing (SoCC) (2020)

Crankshaw,D.,Sela,G.I.,Mo,X.,Zumar,C.,Stoica,I.,Gonzalez,J.,Tumanov,A.: Inferline: Latency-aware provisioning and scaling for prediction serving pipelines. In: Proceedings of the ACM Symposium on Cloud Computing (SoCC) (2020)

work page 2020
[5]

In: USENIX NSDI (2017)

Crankshaw, D., Wang, X., Zhou, G., Franklin, M.J., Gonzalez, J.E., Stoica, I.: Clipper: A low-latency online prediction serving system. In: USENIX NSDI (2017)

work page 2017
[6]

IEEE Internet of Things Journal7(8), 7457–7469 (2020)

Deng, S., Zhao, H., Fang, W., Yin, J., Dustdar, S., Zomaya, A.Y.: Edge intelli- gence: The confluence of edge computing and artificial intelligence. IEEE Internet of Things Journal7(8), 7457–7469 (2020)

work page 2020
[7]

Frazier, P.I.: A tutorial on bayesian optimization (2018)

work page 2018
[8]

Neural Computation29(1), 1–49 (2017)

Friston, K., FitzGerald, T., Rigoli, F., Schwartenbeck, P., Pezzulo, G.: Active in- ference: A process theory. Neural Computation29(1), 1–49 (2017)

work page 2017
[9]

In: USENIX OSDI (2020) 12 Wang et al

Gujarati, A., Karimi, R., Alzayat, S., Hao, W., Kaufmann, A., Vigfusson, Y., Mace, J.: Serving dnns like clockwork: Performance predictability from the bottom up. In: USENIX OSDI (2020) 12 Wang et al

work page 2020
[10]

Wiley-IEEE Press (2004)

Hellerstein, J.L., Diao, Y., Parekh, S., Tilbury, D.M.: Feedback Control of Com- puting Systems. Wiley-IEEE Press (2004)

work page 2004
[11]

In: Proceedings of the ASPLOS

Kang, Y., Hauswald, J., Gao, C., Rovinski, A., Mudge, T., Mars, J., Tang, L.: Neurosurgeon: Collaborative intelligence between the cloud and mobile edge. In: Proceedings of the ASPLOS. pp. 615–629 (2017)

work page 2017
[12]

In: Proceedings of the European Con- ference on Artificial Intelligence (ECAI)

Lanillos, P., Pages, J., Cheng, G.: Robot self/other distinction: Active inference meets neural networks learning in a mirror. In: Proceedings of the European Con- ference on Artificial Intelligence (ECAI). pp. 2410–2418 (2020)

work page 2020
[13]

In: Proceedings of the ACM Workshop on Hot Topics in Networks (HotNets)

Mao, H., Alizadeh, M., Menache, I., Kandula, S.: Resource management with deep reinforcement learning. In: Proceedings of the ACM Workshop on Hot Topics in Networks (HotNets). pp. 50–56 (2016)

work page 2016
[14]

In: Proceedings of the ACM SIGCOMM Conference

Mao, H., Schwarzkopf, M., Venkatakrishnan, S.B., Meng, Z., Alizadeh, M.: Learn- ing scheduling algorithms for data processing clusters. In: Proceedings of the ACM SIGCOMM Conference. pp. 270–288 (2019)

work page 2019
[15]

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS) pp

McMahan, B., Moore, E., Ramage, D., Hampson, S., Arcas, B.A.: Communication- efficient learning of deep networks from decentralized data. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS) pp. 1273–1282 (2017)

work page 2017
[16]

Journal of Systems and Software137, 491–507 (2018)

Pahl, C., Jamshidi, P., Zimmermann, O.: Microservices: A systematic mapping study. Journal of Systems and Software137, 491–507 (2018)

work page 2018
[17]

MIT Press (2022)

Parr, T., Pezzulo, G., Friston, K.J.: Active Inference: The Free Energy Principle in Mind, Brain, and Behavior. MIT Press (2022)

work page 2022
[18]

In: USENIX ATC

Romero, F., Li, Q., Yadwadkar, N.J., Kozyrakis, C.: Infaas: Automated model-less inference serving. In: USENIX ATC. pp. 397–411 (2021)

work page 2021
[19]

Foundations and Trends in Machine Learning11(1), 1–96 (2018)

Russo, D.J., Van Roy, B., Kazerouni, A., Osband, I., Wen, Z.: A tutorial on thomp- son sampling. Foundations and Trends in Machine Learning11(1), 1–96 (2018)

work page 2018
[20]

Sedlak, B., Furutanpey, A., Wang, Z., Pujol, V.C., Dustdar, S.: Multi-dimensional Autoscaling of Processing Services: A Comparison of Agent-based Methods (2025)

work page 2025
[21]

Sedlak, B., Pujol, V.C., Morichetta, A., Donta, P.K., Dustdar, S.: Adaptive Stream Processing on Edge Devices through Active Inference (Sep 2024)

work page 2024
[22]

In: Proceedings of the Advances in Neural Information Processing Systems (NIPS)

Snoek, J., Larochelle, H., Adams, R.P.: Practical bayesian optimization of ma- chine learning algorithms. In: Proceedings of the Advances in Neural Information Processing Systems (NIPS). pp. 2951–2959 (2012)

work page 2012
[23]

In: Proceedings of the IEEE International Conference on Distributed Computing Systems (ICDCS)

Teerapittayanon, S., McDanel, B., Kung, H.T.: Distributed deep neural networks over the cloud, the edge and end devices. In: Proceedings of the IEEE International Conference on Distributed Computing Systems (ICDCS). pp. 328–339 (2017)

work page 2017
[24]

In: Proceedings of the IEEE Interna- tional Conference on Autonomic Computing (ICAC)

Tesauro, G., Jong, N.K., Das, R., Bennani, M.N.: A hybrid reinforcement learning approach to autonomic resource allocation. In: Proceedings of the IEEE Interna- tional Conference on Autonomic Computing (ICAC). pp. 65–73 (2006)

work page 2006
[25]

In: USENIX OSDI

Xiao, W., Bhardwaj, R., Ramjee, R., Sivathanu, M., Kwatra, N., Han, Z., Patel, P., Peng, X., Zhao, H., Zhang, Q., Yang, F., Zhou, L.: Gandiva: Introspective cluster scheduling for deep learning. In: USENIX OSDI. pp. 595–610 (2018)

work page 2018
[26]

In: IEEE/ACM International Symposium on Quality of Service (IWQoS)

Zhao, N., Liang, J., Dovrolis, C., Liu, M.: Self-adaptive microservice chains with deep reinforcement learning. In: IEEE/ACM International Symposium on Quality of Service (IWQoS). pp. 1–10 (2019)

work page 2019
[27]

In: Proceedings of the IEEE International Conference on Communications (ICC)

Zhao, T., Zhou, S., Guo, X., Niu, Z.: Tasks scheduling and resource allocation in heterogeneous cloud for delay-bounded mobile edge computing. In: Proceedings of the IEEE International Conference on Communications (ICC). pp. 1–7 (2017)

work page 2017
[28]

Proceedings of the IEEE 107(8), 1738–1762 (2019)

Zhou, Z., Chen, X., Li, E., Zeng, L., Luo, K., Zhang, J.: Edge intelligence: Paving the last mile of artificial intelligence with edge computing. Proceedings of the IEEE 107(8), 1738–1762 (2019)

work page 2019

[1] [1]

Machine Learning47(2–3), 235–256 (2002)

Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Machine Learning47(2–3), 235–256 (2002)

work page 2002

[2] [2]

ACM Queue14(1), 70–93 (2016)

Burns, B., Grant, B., Oppenheimer, D., Brewer, E., Wilkes, J.: Borg, omega, and kubernetes: Lessons learned from three container-management systems over a decade. ACM Queue14(1), 70–93 (2016)

work page 2016

[3] [3]

Chen, Y., Alspaugh, S., Katz, R.: Interactive analytical processing in big data systems:Across-industry studyofmapreduceworkloads.ProceedingsoftheVLDB Endowment5(12), 1802–1813 (2012)

work page 2012

[4] [4]

In: Proceedings of the ACM Symposium on Cloud Computing (SoCC) (2020)

Crankshaw,D.,Sela,G.I.,Mo,X.,Zumar,C.,Stoica,I.,Gonzalez,J.,Tumanov,A.: Inferline: Latency-aware provisioning and scaling for prediction serving pipelines. In: Proceedings of the ACM Symposium on Cloud Computing (SoCC) (2020)

work page 2020

[5] [5]

In: USENIX NSDI (2017)

Crankshaw, D., Wang, X., Zhou, G., Franklin, M.J., Gonzalez, J.E., Stoica, I.: Clipper: A low-latency online prediction serving system. In: USENIX NSDI (2017)

work page 2017

[6] [6]

IEEE Internet of Things Journal7(8), 7457–7469 (2020)

Deng, S., Zhao, H., Fang, W., Yin, J., Dustdar, S., Zomaya, A.Y.: Edge intelli- gence: The confluence of edge computing and artificial intelligence. IEEE Internet of Things Journal7(8), 7457–7469 (2020)

work page 2020

[7] [7]

Frazier, P.I.: A tutorial on bayesian optimization (2018)

work page 2018

[8] [8]

Neural Computation29(1), 1–49 (2017)

Friston, K., FitzGerald, T., Rigoli, F., Schwartenbeck, P., Pezzulo, G.: Active in- ference: A process theory. Neural Computation29(1), 1–49 (2017)

work page 2017

[9] [9]

In: USENIX OSDI (2020) 12 Wang et al

Gujarati, A., Karimi, R., Alzayat, S., Hao, W., Kaufmann, A., Vigfusson, Y., Mace, J.: Serving dnns like clockwork: Performance predictability from the bottom up. In: USENIX OSDI (2020) 12 Wang et al

work page 2020

[10] [10]

Wiley-IEEE Press (2004)

Hellerstein, J.L., Diao, Y., Parekh, S., Tilbury, D.M.: Feedback Control of Com- puting Systems. Wiley-IEEE Press (2004)

work page 2004

[11] [11]

In: Proceedings of the ASPLOS

Kang, Y., Hauswald, J., Gao, C., Rovinski, A., Mudge, T., Mars, J., Tang, L.: Neurosurgeon: Collaborative intelligence between the cloud and mobile edge. In: Proceedings of the ASPLOS. pp. 615–629 (2017)

work page 2017

[12] [12]

In: Proceedings of the European Con- ference on Artificial Intelligence (ECAI)

Lanillos, P., Pages, J., Cheng, G.: Robot self/other distinction: Active inference meets neural networks learning in a mirror. In: Proceedings of the European Con- ference on Artificial Intelligence (ECAI). pp. 2410–2418 (2020)

work page 2020

[13] [13]

In: Proceedings of the ACM Workshop on Hot Topics in Networks (HotNets)

Mao, H., Alizadeh, M., Menache, I., Kandula, S.: Resource management with deep reinforcement learning. In: Proceedings of the ACM Workshop on Hot Topics in Networks (HotNets). pp. 50–56 (2016)

work page 2016

[14] [14]

In: Proceedings of the ACM SIGCOMM Conference

Mao, H., Schwarzkopf, M., Venkatakrishnan, S.B., Meng, Z., Alizadeh, M.: Learn- ing scheduling algorithms for data processing clusters. In: Proceedings of the ACM SIGCOMM Conference. pp. 270–288 (2019)

work page 2019

[15] [15]

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS) pp

McMahan, B., Moore, E., Ramage, D., Hampson, S., Arcas, B.A.: Communication- efficient learning of deep networks from decentralized data. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS) pp. 1273–1282 (2017)

work page 2017

[16] [16]

Journal of Systems and Software137, 491–507 (2018)

Pahl, C., Jamshidi, P., Zimmermann, O.: Microservices: A systematic mapping study. Journal of Systems and Software137, 491–507 (2018)

work page 2018

[17] [17]

MIT Press (2022)

Parr, T., Pezzulo, G., Friston, K.J.: Active Inference: The Free Energy Principle in Mind, Brain, and Behavior. MIT Press (2022)

work page 2022

[18] [18]

In: USENIX ATC

Romero, F., Li, Q., Yadwadkar, N.J., Kozyrakis, C.: Infaas: Automated model-less inference serving. In: USENIX ATC. pp. 397–411 (2021)

work page 2021

[19] [19]

Foundations and Trends in Machine Learning11(1), 1–96 (2018)

Russo, D.J., Van Roy, B., Kazerouni, A., Osband, I., Wen, Z.: A tutorial on thomp- son sampling. Foundations and Trends in Machine Learning11(1), 1–96 (2018)

work page 2018

[20] [20]

Sedlak, B., Furutanpey, A., Wang, Z., Pujol, V.C., Dustdar, S.: Multi-dimensional Autoscaling of Processing Services: A Comparison of Agent-based Methods (2025)

work page 2025

[21] [21]

Sedlak, B., Pujol, V.C., Morichetta, A., Donta, P.K., Dustdar, S.: Adaptive Stream Processing on Edge Devices through Active Inference (Sep 2024)

work page 2024

[22] [22]

In: Proceedings of the Advances in Neural Information Processing Systems (NIPS)

Snoek, J., Larochelle, H., Adams, R.P.: Practical bayesian optimization of ma- chine learning algorithms. In: Proceedings of the Advances in Neural Information Processing Systems (NIPS). pp. 2951–2959 (2012)

work page 2012

[23] [23]

In: Proceedings of the IEEE International Conference on Distributed Computing Systems (ICDCS)

Teerapittayanon, S., McDanel, B., Kung, H.T.: Distributed deep neural networks over the cloud, the edge and end devices. In: Proceedings of the IEEE International Conference on Distributed Computing Systems (ICDCS). pp. 328–339 (2017)

work page 2017

[24] [24]

In: Proceedings of the IEEE Interna- tional Conference on Autonomic Computing (ICAC)

Tesauro, G., Jong, N.K., Das, R., Bennani, M.N.: A hybrid reinforcement learning approach to autonomic resource allocation. In: Proceedings of the IEEE Interna- tional Conference on Autonomic Computing (ICAC). pp. 65–73 (2006)

work page 2006

[25] [25]

In: USENIX OSDI

Xiao, W., Bhardwaj, R., Ramjee, R., Sivathanu, M., Kwatra, N., Han, Z., Patel, P., Peng, X., Zhao, H., Zhang, Q., Yang, F., Zhou, L.: Gandiva: Introspective cluster scheduling for deep learning. In: USENIX OSDI. pp. 595–610 (2018)

work page 2018

[26] [26]

In: IEEE/ACM International Symposium on Quality of Service (IWQoS)

Zhao, N., Liang, J., Dovrolis, C., Liu, M.: Self-adaptive microservice chains with deep reinforcement learning. In: IEEE/ACM International Symposium on Quality of Service (IWQoS). pp. 1–10 (2019)

work page 2019

[27] [27]

In: Proceedings of the IEEE International Conference on Communications (ICC)

Zhao, T., Zhou, S., Guo, X., Niu, Z.: Tasks scheduling and resource allocation in heterogeneous cloud for delay-bounded mobile edge computing. In: Proceedings of the IEEE International Conference on Communications (ICC). pp. 1–7 (2017)

work page 2017

[28] [28]

Proceedings of the IEEE 107(8), 1738–1762 (2019)

Zhou, Z., Chen, X., Li, E., Zeng, L., Luo, K., Zhang, J.: Edge intelligence: Paving the last mile of artificial intelligence with edge computing. Proceedings of the IEEE 107(8), 1738–1762 (2019)

work page 2019