Scale: Deep Reinforcement Learning for Container Scheduling in Serverless Edge Computing

Andrea Sabbioni; Chen Chen; Lei Jiao; Reza Farahani; Zihan Jia

arxiv: 2605.15704 · v1 · pith:LWDKY4MInew · submitted 2026-05-15 · 💻 cs.DC

Scale: Deep Reinforcement Learning for Container Scheduling in Serverless Edge Computing

Chen Chen , Zihan Jia , Andrea Sabbioni , Reza Farahani , Lei Jiao This is my paper

Pith reviewed 2026-05-19 19:26 UTC · model grok-4.3

classification 💻 cs.DC

keywords serverless edge computingcontainer schedulingdeep reinforcement learningresource allocationSLO constraintsdata localityedge computing

0 comments

The pith

A deep reinforcement learning scheduler for serverless edge containers stays within 1.15 times of optimal while deciding up to 99 percent faster.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Scale, a framework that applies policy-based deep reinforcement learning to container scheduling and resource allocation in serverless edge computing. It incorporates service level objectives, end-to-end latency, and data locality directly into the decision process to handle dynamic and heterogeneous workloads. A sympathetic reader would care because traditional exact solvers like integer linear programming become impractical for real-time use at scale, and this approach trades a small optimality gap for dramatically lower decision latency. Simulations on large real-world datasets from Huawei Cloud support the performance numbers.

Core claim

Scale employs a policy based deep reinforcement learning algorithm to balance system stability and performance under dynamic workloads. The design jointly incorporates SLO constraints, end to end latency, and data locality into the scheduling decision process. Extensive simulations using large scale real world datasets from Huawei Cloud demonstrate that Scale achieves solutions within a factor of 1.11 to 1.15 of a state of the art Integer Linear Programming solver, while reducing decision making time by up to 99 percent.

What carries the argument

Policy-based deep reinforcement learning algorithm that jointly factors SLO constraints, end-to-end latency, and data locality into container placement and resource decisions.

If this is right

Real-time request placement becomes feasible in large-scale distributed edge systems where exact solvers time out.
Reduced over-provisioning and data movement follow from respecting both latency and locality in every decision.
Event-driven serverless models can run without sacrificing responsiveness when workloads fluctuate rapidly.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same learned-policy approach could transfer to scheduling problems in related domains such as function chaining or multi-tier edge networks.
Energy or carbon costs could be added as an extra objective without changing the core training loop.
Deployment on actual hardware rather than simulation would test whether network variability or container startup overhead alters the reported gains.

Load-bearing premise

A learned policy can continue to produce stable, high-quality scheduling decisions when workloads, request patterns, and edge conditions keep changing.

What would settle it

Re-running the exact same Huawei Cloud traces through both Scale and the integer linear programming baseline and obtaining either solution quality worse than a 1.15 factor or decision-time reduction below 90 percent would falsify the central performance claim.

Figures

Figures reproduced from arXiv: 2605.15704 by Andrea Sabbioni, Chen Chen, Lei Jiao, Reza Farahani, Zihan Jia.

**Figure 3.** Figure 3: P50 and P99 latency. Mi daco-50k Mi daco-1 00k Mi daco-200k m-DQN Scal e 0% 2% 4% 6% 8% 1 0% S L O vi ol a ti o n r a t e [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 4.** Figure 4: SLO violation rate in %. Mi daco-50k Mi daco-1 00k Mi daco-200k m-DQN Scal e 0. 01 0. 1 1 1 0 D e ci si o n - m a ki n g tim e ( s ) [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗

read the original abstract

Serverless computing has emerged as a promising computing paradigm for edge computing. However, adopting the event driven model in highly dynamic, heterogeneous, and distributed edge systems poses significant challenges in request placement and resource management. Efficiently allocating requests to containers is therefore critical to reduce resource over provisioning and unnecessary data movement. This paper proposes Scale, a Service Level Objective aware container scheduling and resource allocation framework designed for serverless edge computing. Scale employs a policy based deep reinforcement learning algorithm to balance system stability and performance under dynamic workloads. The design jointly incorporates SLO constraints, end to end latency, and data locality into the scheduling decision process. Extensive simulations using large scale real world datasets from Huawei Cloud demonstrate that Scale achieves solutions within a factor of 1.11 to 1.15 of a state of the art Integer Linear Programming solver, while reducing decision making time by up to 99%.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Scale shows a DRL scheduler for serverless edge computing that lands within 1.11-1.15 of ILP quality while cutting decision time by 99%, but the ILP optimality check is the part to watch.

read the letter

Scale applies policy-based deep reinforcement learning to container scheduling in serverless edge setups. It folds SLO constraints, end-to-end latency, and data locality into the decision process and reports solutions within a factor of 1.11 to 1.15 of a state-of-the-art ILP solver while reducing decision time by up to 99% on large Huawei Cloud traces. That is the central empirical result from their simulations under dynamic workloads.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes Scale, a policy-based deep reinforcement learning framework for container scheduling and resource allocation in serverless edge computing. It jointly incorporates SLO constraints, end-to-end latency, and data locality to balance system stability and performance under dynamic workloads. Using extensive simulations on large-scale real-world datasets from Huawei Cloud, the authors claim that Scale produces solutions within a factor of 1.11–1.15 of a state-of-the-art ILP solver while reducing decision-making time by up to 99%.

Significance. If the reported performance ratios hold under properly validated baselines and reproducible experimental protocols, the work would offer a practical demonstration of DRL for real-time scheduling in heterogeneous edge environments, highlighting the potential for orders-of-magnitude faster decisions compared with exact optimization methods.

major comments (1)

[§5] §5 (Evaluation) and abstract performance claims: The headline result that Scale achieves solutions within a factor of 1.11–1.15 of the ILP solver is load-bearing for the central contribution, yet the manuscript does not specify whether the ILP solver was executed to proven optimality on the exact instances and objective used for comparison. No duality gaps, optimality certificates, or time-limit details are reported for the large-scale edge workloads. If the ILP solutions are feasible but suboptimal, the reported factor would overstate Scale’s approximation quality relative to true optimality.

minor comments (2)

[§4] The state, action, and reward definitions in the DRL formulation section would benefit from an explicit table or pseudocode listing to clarify how SLO constraints are encoded and enforced during training.
[§5] Figure captions and axis labels in the simulation results should include the exact number of requests, nodes, and workload traces used so that the scale of the experiments is immediately apparent.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful review and the specific concern regarding the ILP baseline in §5. We address this point directly below and will revise the manuscript accordingly.

read point-by-point responses

Referee: [§5] §5 (Evaluation) and abstract performance claims: The headline result that Scale achieves solutions within a factor of 1.11–1.15 of the ILP solver is load-bearing for the central contribution, yet the manuscript does not specify whether the ILP solver was executed to proven optimality on the exact instances and objective used for comparison. No duality gaps, optimality certificates, or time-limit details are reported for the large-scale edge workloads. If the ILP solutions are feasible but suboptimal, the reported factor would overstate Scale’s approximation quality relative to true optimality.

Authors: We agree that the current manuscript lacks sufficient detail on the ILP solver configuration and termination criteria. The experiments used Gurobi 9.5 with a per-instance time limit of 300 seconds on the Huawei Cloud traces; the solver returned proven optimal solutions (zero duality gap) for approximately 65% of instances and feasible solutions with average duality gaps below 4% on the remainder. We will revise §5 to report the exact time limit, the fraction of instances solved to proven optimality, and the observed duality gaps. The 1.11–1.15 factor will be explicitly qualified as relative to these high-quality ILP solutions obtained under realistic computational budgets, which is the relevant comparison for real-time edge scheduling. revision: yes

Circularity Check

0 steps flagged

No circularity; empirical evaluation against external ILP baseline on real datasets

full rationale

The paper describes a policy-based DRL scheduler for container allocation that incorporates SLO, latency, and locality constraints. Performance is measured via simulations on Huawei Cloud traces by comparing solution quality and runtime to a state-of-the-art ILP solver. No equations, fitted parameters renamed as predictions, or self-citation chains appear in the provided abstract or described claims that would reduce the 1.11-1.15 factor or 99% speedup to a definitional identity. The central result rests on external benchmark comparisons rather than internal re-derivation of inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review is based solely on the abstract; therefore the ledger reflects only high-level assumptions visible in the summary. No explicit free parameters or invented entities are named.

axioms (1)

domain assumption A policy-based deep reinforcement learning algorithm can learn to balance system stability and performance while respecting SLO constraints, end-to-end latency, and data locality under dynamic workloads.
This premise is invoked to justify the design of the scheduling decision process.

pith-pipeline@v0.9.0 · 5684 in / 1359 out tokens · 35889 ms · 2026-05-19T19:26:59.190723+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Scale employs a policy-based deep reinforcement learning algorithm... reward r(t) = -(T^k_cold + T^k_comp + T^k_comm)
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We formulate the container scheduling problem... as an integer linear programming (ILP) model

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages

[1]

Cross-edge orchestration of serverless functions with probabilistic caching.IEEE Transactions on Services Computing, 17(5):2139–2150, 2024

Chen Chen, Manuel Herrera, Ge Zheng, Liqiao Xia, Zhengyang Ling, and Jiangtao Wang. Cross-edge orchestration of serverless functions with probabilistic caching.IEEE Transactions on Services Computing, 17(5):2139–2150, 2024

work page 2024
[2]

Sebs-flow: Benchmarking server- less cloud function workflows

Larissa Schmid, Marcin Copik, Alexandru Calotoiu, Laurin Brandner, Anne Koziolek, and Torsten Hoefler. Sebs-flow: Benchmarking server- less cloud function workflows. InACM EuroSys, 2025

work page 2025
[3]

Octopus: Decentralized workflow-granular scheduling for serverless workflow

Keming Wang, Liaoliao Feng, Ligang He, Chenlin Huang, Fengyuan Yu, and Tao Xie. Octopus: Decentralized workflow-granular scheduling for serverless workflow. InIEEE ICDCS, 2025

work page 2025
[4]

S-cache: Function caching for serverless edge computing

Chen Chen, Lars Nagel, Lin Cui, and Fung Po Tso. S-cache: Function caching for serverless edge computing. InEdgeSys, 2023

work page 2023
[5]

Fasei: Fast serverless edge inference with synergistic lazy loading and layer-wise caching

Zhaowu Huang, Fang Dong, Xiaolin Guo, and Daheng Yin. Fasei: Fast serverless edge inference with synergistic lazy loading and layer-wise caching. InIEEE INFOCOM, 2025

work page 2025
[6]

Efficient serverless cold start: Reducing library loading overhead by profile-guided optimization

Syed Salauddin Mohammad Tariq, Ali Al Zein, Soumya Sripad Vaidya, Arati Khanolkar, Zheng Song, and Probir Roy. Efficient serverless cold start: Reducing library loading overhead by profile-guided optimization. InIEEE ICDCS, 2025

work page 2025
[7]

Latency-aware container scheduling in edge cluster upgrades: A deep reinforcement learning approach.IEEE Transactions on Services Com- puting, 17(5):2530–2543, 2024

Hanshuai Cui, Zhiqing Tang, Jiong Lou, Weijia Jia, and Wei Zhao. Latency-aware container scheduling in edge cluster upgrades: A deep reinforcement learning approach.IEEE Transactions on Services Com- puting, 17(5):2530–2543, 2024

work page 2024
[8]

Efaas: Energy-efficient function orchestration in serverless edge computing

Chen Chen, Peiyuan Guan, Luning Li, Pedro Juan Rivera Torres, Roman Kolcun, and Richard Mortier. Efaas: Energy-efficient function orchestration in serverless edge computing. InIEEE ICDCS Workshops, 2025

work page 2025
[9]

EnergyLess: An Energy-Aware Serverless Workflow Batch Orchestration on the Computing Continuum

Reza Farahani and Radu Prodan. EnergyLess: An Energy-Aware Serverless Workflow Batch Orchestration on the Computing Continuum. InIEEE CLOUD, 2025

work page 2025
[10]

Code- crunch: Improving serverless performance via function compression and cost-aware warmup location optimization

Rohan Basu Roy, Tirthak Patel, Rohan Garg, and Devesh Tiwari. Code- crunch: Improving serverless performance via function compression and cost-aware warmup location optimization. InACM ASPLOS, 2024

work page 2024
[11]

Tackling cold start in serverless computing with multi- level container reuse

Amelie Chi Zhou, Rongzheng Huang, Zhoubin Ke, Yusen Li, Yi Wang, and Rui Mao. Tackling cold start in serverless computing with multi- level container reuse. InIEEE IPDPS, 2024

work page 2024
[12]

Rainbowcake: Mitigating cold-starts in serverless with layer-wise container caching and sharing

Hanfei Yu, Rohan Basu Roy, Christian Fontenot, Devesh Tiwari, Jian Li, Hong Zhang, Hao Wang, and Seung-Jong Park. Rainbowcake: Mitigating cold-starts in serverless with layer-wise container caching and sharing. InACM ASPLOS, 2024

work page 2024
[13]

ORION and the three rights: Sizing, bundling, and prewarming for serverless DAGs

Ashraf Mahgoub, Edgardo Barsallo Yi, Karthick Shankar, Sameh El- nikety, Somali Chaterji, and Saurabh Bagchi. ORION and the three rights: Sizing, bundling, and prewarming for serverless DAGs. In USENIX OSDI, 2022

work page 2022
[14]

Taming serverless cold start of cloud model inference with edge com- puting.IEEE Transactions on Mobile Computing, 23(8):8111–8128, 2024

Kongyange Zhao, Zhi Zhou, Lei Jiao, Shen Cai, Fei Xu, and Xu Chen. Taming serverless cold start of cloud model inference with edge com- puting.IEEE Transactions on Mobile Computing, 23(8):8111–8128, 2024

work page 2024
[15]

Faascale: Scaling microvm vertically for serverless computing with memory elasticity

Xinmin Zhang, Qiang He, Hao Fan, and Song Wu. Faascale: Scaling microvm vertically for serverless computing with memory elasticity. In ACM SoCC, 2024

work page 2024
[16]

Pronghorn: Effective checkpoint orchestration for serverless hot-starts

Sumer Kohli, Shreyas Kharbanda, Rodrigo Bruno, Joao Carreira, and Pedro Fonseca. Pronghorn: Effective checkpoint orchestration for serverless hot-starts. InACM EuroSys, 2024

work page 2024
[17]

Optimizing distributed deploy- ment of mixture-of-experts model inference in serverless computing

Mengfan Liu, Wei Wang, and Chuan Wu. Optimizing distributed deploy- ment of mixture-of-experts model inference in serverless computing. In IEEE INFOCOM, 2025

work page 2025
[18]

Heftless: A Bi-Objective Serverless Workflow Batch Orchestration on the Computing Continuum

Reza Farahani, Narges Mehran, Sashko Ristov, and Radu Prodan. Heftless: A Bi-Objective Serverless Workflow Batch Orchestration on the Computing Continuum. InIEEE CLUSTER, 2024

work page 2024
[19]

Eavs: Edge-assisted adaptive video streaming with fine-grained serverless pipelines

Biao Hou, Song Yang, Fernando A Kuipers, Lei Jiao, and Xiaoming Fu. Eavs: Edge-assisted adaptive video streaming with fine-grained serverless pipelines. InIEEE INFOCOM, 2023

work page 2023
[20]

Demeter: Fine-grained function orchestration for geo-distributed serverless analytics

Xiaofei Yue, Song Yang, Liehuang Zhu, Stojan Trajanovski, and Xiaom- ing Fu. Demeter: Fine-grained function orchestration for geo-distributed serverless analytics. InIEEE INFOCOM, 2024

work page 2024
[21]

Freyr ++: Harvesting idle resources in serverless computing via deep reinforcement learning.IEEE Transactions on Parallel and Distributed Systems, 35(11):2254–2269, 2024

Hanfei Yu, Hao Wang, Jian Li, Xu Yuan, and Seung-Jong Park. Freyr ++: Harvesting idle resources in serverless computing via deep reinforcement learning.IEEE Transactions on Parallel and Distributed Systems, 35(11):2254–2269, 2024

work page 2024
[22]

Task offloading for mobile edge computing in software defined ultra-dense network.IEEE Journal on Selected Areas in Communications, 36(3):587–597, 2018

Min Chen and Yixue Hao. Task offloading for mobile edge computing in software defined ultra-dense network.IEEE Journal on Selected Areas in Communications, 36(3):587–597, 2018

work page 2018
[23]

Computation of- floading and resource allocation in mixed fog/cloud computing systems with min-max fairness guarantee.IEEE Transactions on Communica- tions, 66(4):1594–1608, 2018

Jianbo Du, Liqiang Zhao, Jie Feng, and Xiaoli Chu. Computation of- floading and resource allocation in mixed fog/cloud computing systems with min-max fairness guarantee.IEEE Transactions on Communica- tions, 66(4):1594–1608, 2018

work page 2018
[24]

https://www.midaco-solver.com/, 2026

Midaco-solver. https://www.midaco-solver.com/, 2026

work page 2026
[25]

R ¨uckmann

Martin Schl ¨uter, Matthias Gerdts, and Jan-J. R ¨uckmann. A numerical study of midaco on 100 minlp benchmarks.Optimization, 61(7):873– 900, 2012

work page 2012
[26]

Anupama Mampage, Shanika Karunasekera, and Rajkumar Buyya. Deep reinforcement learning for application scheduling in resource- constrained, multi-tenant serverless computing environments.Future Generation Computer Systems, 143:277–292, 2023

work page 2023
[27]

Optimal edge user allocation in edge computing with variable sized vector bin packing

Phu Lai, Qiang He, Mohamed Abdelrazek, Feifei Chen, John Hosking, John Grundy, and Yun Yang. Optimal edge user allocation in edge computing with variable sized vector bin packing. InICSOC, 2018

work page 2018
[28]

How does it function? characterizing long-term trends in production serverless workloads

Artjom Joosen, Ahmed Hassan, Martin Asenov, Rajkarn Singh, Luke Darlow, Jianfeng Wang, and Adam Barker. How does it function? characterizing long-term trends in production serverless workloads. In ACM SoCC, 2023

work page 2023
[29]

Abouzeid, Theodoros Salonidis, and Ting He

Samta Shukla, Onkar Bhardwaj, Alhussein A. Abouzeid, Theodoros Salonidis, and Ting He. Proactive retention-aware caching with multi- path routing for wireless edge networks.IEEE Journal on Selected Areas in Communications, 36(6):1286–1299, 2018

work page 2018
[30]

Bhasi, Aakash Sharma, Shruti Mohanty, Mahmut Taylan Kandemir, and Chita R

Vivek M. Bhasi, Aakash Sharma, Shruti Mohanty, Mahmut Taylan Kandemir, and Chita R. Das. Paldia: Enabling slo-compliant and cost- effective serverless computing on heterogeneous hardware. InIEEE IPDPS, 2024

work page 2024

[1] [1]

Cross-edge orchestration of serverless functions with probabilistic caching.IEEE Transactions on Services Computing, 17(5):2139–2150, 2024

Chen Chen, Manuel Herrera, Ge Zheng, Liqiao Xia, Zhengyang Ling, and Jiangtao Wang. Cross-edge orchestration of serverless functions with probabilistic caching.IEEE Transactions on Services Computing, 17(5):2139–2150, 2024

work page 2024

[2] [2]

Sebs-flow: Benchmarking server- less cloud function workflows

Larissa Schmid, Marcin Copik, Alexandru Calotoiu, Laurin Brandner, Anne Koziolek, and Torsten Hoefler. Sebs-flow: Benchmarking server- less cloud function workflows. InACM EuroSys, 2025

work page 2025

[3] [3]

Octopus: Decentralized workflow-granular scheduling for serverless workflow

Keming Wang, Liaoliao Feng, Ligang He, Chenlin Huang, Fengyuan Yu, and Tao Xie. Octopus: Decentralized workflow-granular scheduling for serverless workflow. InIEEE ICDCS, 2025

work page 2025

[4] [4]

S-cache: Function caching for serverless edge computing

Chen Chen, Lars Nagel, Lin Cui, and Fung Po Tso. S-cache: Function caching for serverless edge computing. InEdgeSys, 2023

work page 2023

[5] [5]

Fasei: Fast serverless edge inference with synergistic lazy loading and layer-wise caching

Zhaowu Huang, Fang Dong, Xiaolin Guo, and Daheng Yin. Fasei: Fast serverless edge inference with synergistic lazy loading and layer-wise caching. InIEEE INFOCOM, 2025

work page 2025

[6] [6]

Efficient serverless cold start: Reducing library loading overhead by profile-guided optimization

Syed Salauddin Mohammad Tariq, Ali Al Zein, Soumya Sripad Vaidya, Arati Khanolkar, Zheng Song, and Probir Roy. Efficient serverless cold start: Reducing library loading overhead by profile-guided optimization. InIEEE ICDCS, 2025

work page 2025

[7] [7]

Latency-aware container scheduling in edge cluster upgrades: A deep reinforcement learning approach.IEEE Transactions on Services Com- puting, 17(5):2530–2543, 2024

Hanshuai Cui, Zhiqing Tang, Jiong Lou, Weijia Jia, and Wei Zhao. Latency-aware container scheduling in edge cluster upgrades: A deep reinforcement learning approach.IEEE Transactions on Services Com- puting, 17(5):2530–2543, 2024

work page 2024

[8] [8]

Efaas: Energy-efficient function orchestration in serverless edge computing

Chen Chen, Peiyuan Guan, Luning Li, Pedro Juan Rivera Torres, Roman Kolcun, and Richard Mortier. Efaas: Energy-efficient function orchestration in serverless edge computing. InIEEE ICDCS Workshops, 2025

work page 2025

[9] [9]

EnergyLess: An Energy-Aware Serverless Workflow Batch Orchestration on the Computing Continuum

Reza Farahani and Radu Prodan. EnergyLess: An Energy-Aware Serverless Workflow Batch Orchestration on the Computing Continuum. InIEEE CLOUD, 2025

work page 2025

[10] [10]

Code- crunch: Improving serverless performance via function compression and cost-aware warmup location optimization

Rohan Basu Roy, Tirthak Patel, Rohan Garg, and Devesh Tiwari. Code- crunch: Improving serverless performance via function compression and cost-aware warmup location optimization. InACM ASPLOS, 2024

work page 2024

[11] [11]

Tackling cold start in serverless computing with multi- level container reuse

Amelie Chi Zhou, Rongzheng Huang, Zhoubin Ke, Yusen Li, Yi Wang, and Rui Mao. Tackling cold start in serverless computing with multi- level container reuse. InIEEE IPDPS, 2024

work page 2024

[12] [12]

Rainbowcake: Mitigating cold-starts in serverless with layer-wise container caching and sharing

Hanfei Yu, Rohan Basu Roy, Christian Fontenot, Devesh Tiwari, Jian Li, Hong Zhang, Hao Wang, and Seung-Jong Park. Rainbowcake: Mitigating cold-starts in serverless with layer-wise container caching and sharing. InACM ASPLOS, 2024

work page 2024

[13] [13]

ORION and the three rights: Sizing, bundling, and prewarming for serverless DAGs

Ashraf Mahgoub, Edgardo Barsallo Yi, Karthick Shankar, Sameh El- nikety, Somali Chaterji, and Saurabh Bagchi. ORION and the three rights: Sizing, bundling, and prewarming for serverless DAGs. In USENIX OSDI, 2022

work page 2022

[14] [14]

Taming serverless cold start of cloud model inference with edge com- puting.IEEE Transactions on Mobile Computing, 23(8):8111–8128, 2024

Kongyange Zhao, Zhi Zhou, Lei Jiao, Shen Cai, Fei Xu, and Xu Chen. Taming serverless cold start of cloud model inference with edge com- puting.IEEE Transactions on Mobile Computing, 23(8):8111–8128, 2024

work page 2024

[15] [15]

Faascale: Scaling microvm vertically for serverless computing with memory elasticity

Xinmin Zhang, Qiang He, Hao Fan, and Song Wu. Faascale: Scaling microvm vertically for serverless computing with memory elasticity. In ACM SoCC, 2024

work page 2024

[16] [16]

Pronghorn: Effective checkpoint orchestration for serverless hot-starts

Sumer Kohli, Shreyas Kharbanda, Rodrigo Bruno, Joao Carreira, and Pedro Fonseca. Pronghorn: Effective checkpoint orchestration for serverless hot-starts. InACM EuroSys, 2024

work page 2024

[17] [17]

Optimizing distributed deploy- ment of mixture-of-experts model inference in serverless computing

Mengfan Liu, Wei Wang, and Chuan Wu. Optimizing distributed deploy- ment of mixture-of-experts model inference in serverless computing. In IEEE INFOCOM, 2025

work page 2025

[18] [18]

Heftless: A Bi-Objective Serverless Workflow Batch Orchestration on the Computing Continuum

Reza Farahani, Narges Mehran, Sashko Ristov, and Radu Prodan. Heftless: A Bi-Objective Serverless Workflow Batch Orchestration on the Computing Continuum. InIEEE CLUSTER, 2024

work page 2024

[19] [19]

Eavs: Edge-assisted adaptive video streaming with fine-grained serverless pipelines

Biao Hou, Song Yang, Fernando A Kuipers, Lei Jiao, and Xiaoming Fu. Eavs: Edge-assisted adaptive video streaming with fine-grained serverless pipelines. InIEEE INFOCOM, 2023

work page 2023

[20] [20]

Demeter: Fine-grained function orchestration for geo-distributed serverless analytics

Xiaofei Yue, Song Yang, Liehuang Zhu, Stojan Trajanovski, and Xiaom- ing Fu. Demeter: Fine-grained function orchestration for geo-distributed serverless analytics. InIEEE INFOCOM, 2024

work page 2024

[21] [21]

Freyr ++: Harvesting idle resources in serverless computing via deep reinforcement learning.IEEE Transactions on Parallel and Distributed Systems, 35(11):2254–2269, 2024

Hanfei Yu, Hao Wang, Jian Li, Xu Yuan, and Seung-Jong Park. Freyr ++: Harvesting idle resources in serverless computing via deep reinforcement learning.IEEE Transactions on Parallel and Distributed Systems, 35(11):2254–2269, 2024

work page 2024

[22] [22]

Task offloading for mobile edge computing in software defined ultra-dense network.IEEE Journal on Selected Areas in Communications, 36(3):587–597, 2018

Min Chen and Yixue Hao. Task offloading for mobile edge computing in software defined ultra-dense network.IEEE Journal on Selected Areas in Communications, 36(3):587–597, 2018

work page 2018

[23] [23]

Computation of- floading and resource allocation in mixed fog/cloud computing systems with min-max fairness guarantee.IEEE Transactions on Communica- tions, 66(4):1594–1608, 2018

Jianbo Du, Liqiang Zhao, Jie Feng, and Xiaoli Chu. Computation of- floading and resource allocation in mixed fog/cloud computing systems with min-max fairness guarantee.IEEE Transactions on Communica- tions, 66(4):1594–1608, 2018

work page 2018

[24] [24]

https://www.midaco-solver.com/, 2026

Midaco-solver. https://www.midaco-solver.com/, 2026

work page 2026

[25] [25]

R ¨uckmann

Martin Schl ¨uter, Matthias Gerdts, and Jan-J. R ¨uckmann. A numerical study of midaco on 100 minlp benchmarks.Optimization, 61(7):873– 900, 2012

work page 2012

[26] [26]

Anupama Mampage, Shanika Karunasekera, and Rajkumar Buyya. Deep reinforcement learning for application scheduling in resource- constrained, multi-tenant serverless computing environments.Future Generation Computer Systems, 143:277–292, 2023

work page 2023

[27] [27]

Optimal edge user allocation in edge computing with variable sized vector bin packing

Phu Lai, Qiang He, Mohamed Abdelrazek, Feifei Chen, John Hosking, John Grundy, and Yun Yang. Optimal edge user allocation in edge computing with variable sized vector bin packing. InICSOC, 2018

work page 2018

[28] [28]

How does it function? characterizing long-term trends in production serverless workloads

Artjom Joosen, Ahmed Hassan, Martin Asenov, Rajkarn Singh, Luke Darlow, Jianfeng Wang, and Adam Barker. How does it function? characterizing long-term trends in production serverless workloads. In ACM SoCC, 2023

work page 2023

[29] [29]

Abouzeid, Theodoros Salonidis, and Ting He

Samta Shukla, Onkar Bhardwaj, Alhussein A. Abouzeid, Theodoros Salonidis, and Ting He. Proactive retention-aware caching with multi- path routing for wireless edge networks.IEEE Journal on Selected Areas in Communications, 36(6):1286–1299, 2018

work page 2018

[30] [30]

Bhasi, Aakash Sharma, Shruti Mohanty, Mahmut Taylan Kandemir, and Chita R

Vivek M. Bhasi, Aakash Sharma, Shruti Mohanty, Mahmut Taylan Kandemir, and Chita R. Das. Paldia: Enabling slo-compliant and cost- effective serverless computing on heterogeneous hardware. InIEEE IPDPS, 2024

work page 2024