Dispatching Odyssey: Exploring Performance in Computing Clusters under Real-world Workloads

Alexey Rolich; Andrea Baiocchi; Mert Yildiz

arxiv: 2504.10184 · v1 · submitted 2025-04-14 · 💻 cs.DC

Dispatching Odyssey: Exploring Performance in Computing Clusters under Real-world Workloads

Mert Yildiz , Alexey Rolich , Andrea Baiocchi This is my paper

Pith reviewed 2026-05-22 20:26 UTC · model grok-4.3

classification 💻 cs.DC

keywords dispatching policiescomputing clustersworkload tracesresponse timeJoin Idle QueueLeast Work LeftRound Robintask decomposition

0 comments

The pith

Real Google traces reveal optimal cluster sizes for JIQ and LWL policies at fixed utilization, while Round Robin worsens with scale and simpler JIQ beats size-aware LWL on task-decomposed jobs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper runs data-driven simulations on Google data center workload traces to test dispatching policies as the number of servers changes while holding utilization constant. It reports that Join Idle Queue and Least Work Left reach their best response times at particular cluster sizes, whereas Round Robin performance steadily declines. When jobs are split into tasks, the non-size-based JIQ policy outperforms the more powerful size-based LWL. The work also tests simple G/G queue approximations and shows preliminary gains from a two-stage scheduler that partitions tasks by service thresholds.

Core claim

For a fixed utilization coefficient, Join Idle Queue and Least Work Left exhibit an optimal working point as the number of servers varies, while Round Robin shows monotonously worsening performance. When jobs are decomposed into tasks, the simpler non-size-based JIQ policy outperforms the size-based LWL policy. A two-stage scheduling approach that partitions tasks based on service thresholds yields further performance gains under these realistic traces.

What carries the argument

Job-level and task-level dispatching policies (Join Idle Queue, Least Work Left, Round Robin) evaluated through simulations on real workload traces, with cluster size varied at constant utilization.

If this is right

JIQ and LWL reach minimum response time at specific cluster sizes for any fixed utilization.
Round Robin response time increases steadily as more servers are added.
Task decomposition reverses the expected ranking between size-aware and non-size-aware policies.
A simple two-stage partition of tasks by service threshold improves performance without changing the base dispatching policy.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Clusters could be sized deliberately rather than scaled indefinitely to hit the observed optima.
Task decomposition logic may matter more for policy selection than raw policy complexity.
G/G queue approximations need workload-specific checks before they can guide real sizing decisions.
Modest two-stage modifications could be tested on production systems to capture the reported gains.

Load-bearing premise

The Google workload traces together with the simulation rules for task decomposition and lack of network effects are representative enough that the observed optimal points and policy rankings will hold in other clusters.

What would settle it

Applying the same policies and simulation setup to a second independent public workload trace and checking whether the optimal points for JIQ and LWL and the JIQ-over-LWL ranking under task decomposition still appear.

Figures

Figures reproduced from arXiv: 2504.10184 by Alexey Rolich, Andrea Baiocchi, Mert Yildiz.

**Figure 1.** Figure 1: Mean response time of a cluster of n servers as a function of n. Lines are computed by using analytical models, and markers with the dashed lines correspond to data-driven simulations based on the workload of day 4 of the Google trace of cell c. The utilization coefficient of servers is ρ0 = 0.8. Note: The data used includes original values with outliers and is not shuffled. of servers grows. Due to poor j… view at source ↗

**Figure 2.** Figure 2: Decomposition of original data and comparative an [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 3.** Figure 3: Mean response time of a cluster of n servers as a function of n. Lines are computed by using analytical models, and markers with the dashed lines correspond to data-driven simulations based on the workload of day 4 of the Google trace of cell c. The utilization coefficient of servers is ρ0 = 0.8. Note: The data used does not include outliers but both IAT and CPU values are kept unchanged. a significant dif… view at source ↗

**Figure 5.** Figure 5: Mean response time of dispatching algorithms for v [PITH_FULL_IMAGE:figures/full_fig_p006_5.png] view at source ↗

**Figure 6.** Figure 6: Job slowdown of a single-stage server cluster as a fu [PITH_FULL_IMAGE:figures/full_fig_p007_6.png] view at source ↗

**Figure 7.** Figure 7: Decomposition of original data and comparative an [PITH_FULL_IMAGE:figures/full_fig_p008_7.png] view at source ↗

**Figure 8.** Figure 8: Probability of at least one server being idle upon j [PITH_FULL_IMAGE:figures/full_fig_p008_8.png] view at source ↗

**Figure 9.** Figure 9: Job slowdown in a single-stage server cluster as a fu [PITH_FULL_IMAGE:figures/full_fig_p008_9.png] view at source ↗

**Figure 10.** Figure 10: Mean job response time as a function of the overall [PITH_FULL_IMAGE:figures/full_fig_p009_10.png] view at source ↗

read the original abstract

Recent workload measurements in Google data centers provide an opportunity to challenge existing models and, more broadly, to enhance the understanding of dispatching policies in computing clusters. Through extensive data-driven simulations, we aim to highlight the key features of workload traffic traces that influence response time performance under simple yet representative dispatching policies. For a given computational power budget, we vary the cluster size, i.e., the number of available servers. A job-level analysis reveals that Join Idle Queue (JIQ) and Least Work Left (LWL) exhibit an optimal working point for a fixed utilization coefficient as the number of servers is varied, whereas Round Robin (RR) demonstrates monotonously worsening performance. Additionally, we explore the accuracy of simple G/G queue approximations. When decomposing jobs into tasks, interesting results emerge; notably, the simpler, non-size-based policy JIQ appears to outperform the more "powerful" size-based LWL policy. Complementing these findings, we present preliminary results on a two-stage scheduling approach that partitions tasks based on service thresholds, illustrating that modest architectural modifications can further enhance performance under realistic workload conditions. We provide insights into these results and suggest promising directions for fully explaining the observed phenomena.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Trace-driven sims on Google workloads find optimal cluster size for JIQ and LWL at fixed load plus JIQ beating LWL after task split, but methods stay high-level with no validation or sensitivity checks.

read the letter

The main things here are the simulation results on Google traces: at fixed utilization, JIQ and LWL hit an optimal cluster size while RR gets steadily worse, and splitting jobs into tasks flips the ranking so plain JIQ beats size-aware LWL. They also check G/G approximations and sketch a two-stage scheduler that splits tasks by size thresholds. These are concrete, incremental observations from real traces rather than new theory. The cluster-size sweep at constant load and the task-decomposition reversal are the parts that stand out as worth noting for people who run clusters. The work uses established policies on fresh data and reports the outcomes plainly. The soft spots sit in the execution. The abstract and summary give only a high-level picture of the simulations, with no visible checks on how well the model matches the traces, no error bars, and no tests for sensitivity to modeling choices such as task decomposition rules or the decision to ignore network and overhead effects. That leaves the optimal-point claims and the JIQ-LWL reversal resting on unexamined simulation fidelity. The study is empirical and does not claim derivations or self-referential predictions, so there is no circularity issue. This paper is for practitioners who want trace-driven numbers on dispatching under realistic traffic rather than for theorists. A reader who cares about cluster tuning at scale will get some usable pointers, but anyone wanting to build on the numbers will need the full methods and code first. It deserves a serious referee to examine the simulation setup and see whether the reported reversals hold under closer scrutiny.

Referee Report

2 major / 2 minor

Summary. The paper uses data-driven simulations on Google workload traces to study dispatching policies (JIQ, LWL, RR) in clusters. For fixed utilization coefficient, it reports that JIQ and LWL exhibit an optimal cluster size (number of servers) for response time while RR worsens monotonically; after decomposing jobs into tasks, JIQ outperforms the size-based LWL; it also examines G/G approximations and a preliminary two-stage scheduling approach.

Significance. If the simulation fidelity holds, the results provide empirical evidence that cluster-size scaling under realistic traces can produce non-monotonic behavior for some policies and that task decomposition can reverse expected policy rankings, with modest architectural changes offering further gains. The work is notable for grounding claims in production traces rather than synthetic models.

major comments (2)

[Simulation methodology] Simulation methodology section: the description of the data-driven simulations provides no details on validation against the Google traces (e.g., goodness-of-fit metrics, reproduction of trace statistics), error bars on reported response times, or sensitivity analysis to modeling choices such as task decomposition rules and the omission of network/overhead effects. These omissions are load-bearing because the central claims about optimal working points and policy ranking rest on unexamined simulation fidelity.
[Results on job decomposition] Job-decomposition results: the claim that JIQ outperforms LWL after task decomposition is presented without quantitative metrics (specific response-time ratios, confidence intervals, or statistical tests) or ablation on the decomposition rules, making it difficult to assess whether the ranking reversal is robust or an artifact of the modeling assumptions.

minor comments (2)

[Abstract] The abstract states that 'we explore the accuracy of simple G/G queue approximations' but the manuscript does not report the specific approximation formulas used or quantitative error measures against the simulations.
[Introduction/Results] Notation for the utilization coefficient and cluster-size scaling should be defined explicitly in the first results section to avoid ambiguity when comparing across policies.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and will revise the manuscript to strengthen the simulation methodology and results sections.

read point-by-point responses

Referee: [Simulation methodology] Simulation methodology section: the description of the data-driven simulations provides no details on validation against the Google traces (e.g., goodness-of-fit metrics, reproduction of trace statistics), error bars on reported response times, or sensitivity analysis to modeling choices such as task decomposition rules and the omission of network/overhead effects. These omissions are load-bearing because the central claims about optimal working points and policy ranking rest on unexamined simulation fidelity.

Authors: We agree that the current description lacks explicit validation steps. In the revision we will add goodness-of-fit metrics and reproduction of key trace statistics (e.g., job-size and inter-arrival distributions), report error bars from multiple independent runs, and include a sensitivity analysis on task-decomposition rules. The omission of network and overhead effects is a deliberate modeling choice to isolate dispatching behavior; we will state this limitation explicitly and discuss its implications for the reported trends. revision: yes
Referee: [Results on job decomposition] Job-decomposition results: the claim that JIQ outperforms LWL after task decomposition is presented without quantitative metrics (specific response-time ratios, confidence intervals, or statistical tests) or ablation on the decomposition rules, making it difficult to assess whether the ranking reversal is robust or an artifact of the modeling assumptions.

Authors: We accept that quantitative support and robustness checks are needed. The revised manuscript will report concrete response-time ratios together with confidence intervals and appropriate statistical tests. We will also add an ablation study that varies the decomposition rules to confirm that the JIQ-over-LWL ranking reversal holds under different modeling choices. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper is an empirical simulation study using Google workload traces to compare dispatching policies (JIQ, LWL, RR) under varying cluster sizes at fixed utilization. Central claims report observed performance outcomes, optimal working points, and policy rankings from these simulations, including task decomposition effects. No derivation chain, equations, or predictions are present that reduce by construction to fitted inputs, self-definitions, or self-citation load-bearing premises. The analysis relies on external traces and standard simulation models without internal self-referential forcing of results.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

Review performed on abstract only; full text unavailable in provided context, so ledger entries are inferred at the level of stated modeling choices rather than explicit equations.

free parameters (1)

utilization coefficient
Fixed while varying number of servers; central to the optimal-point claim.

axioms (1)

domain assumption Google workload traces are representative of general computing-cluster traffic
Invoked implicitly when generalizing simulation results to other clusters.

pith-pipeline@v0.9.0 · 5742 in / 1327 out tokens · 61818 ms · 2026-05-22T20:26:32.478598+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

A job-level analysis reveals that Join Idle Queue (JIQ) and Least Work Left (LWL) exhibit an optimal working point for a fixed utilization coefficient as the number of servers is varied
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

When decomposing jobs into tasks, the simpler, non-size-based policy JIQ appears to outperform the more powerful size-based LWL policy

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

"Two-Stagification": Job Dispatching in Large-Scale Clusters via a Two-Stage Architecture
cs.DC 2025-05 unverdicted novelty 5.0

A two-stage dispatching architecture with an optimized job-size threshold improves mean response time over single-stage classical policies in large clusters.

Reference graph

Works this paper leans on

21 extracted references · 21 canonical work pages · cited by 1 Pith paper · 1 internal anchor

[1]

Harchol-Balter, Performance Modeling and Design of Computer Systems: Queueing Theory in Action

M. Harchol-Balter, Performance Modeling and Design of Computer Systems: Queueing Theory in Action . Cambridge University Press, 2013

work page 2013
[2]

Designing lo w- complexity heavy-trafﬁc delay-optimal load balancing sch emes: Theory to algorithms,

X. Zhou, F. Wu, J. Tan, Y . Sun, and N. Shroff, “Designing lo w- complexity heavy-trafﬁc delay-optimal load balancing sch emes: Theory to algorithms,” Proc. ACM Meas. Anal. Comput. Syst. , vol. 1, no. 2, Dec. 2017

work page 2017
[3]

A comprehensi ve survey for scheduling techniques in cloud computing,

M. Kumar, S. Sharma, A. Goel, and S. Singh, “A comprehensi ve survey for scheduling techniques in cloud computing,” Journal of Network and Computer Applications , vol. 143, pp. 1–33, 2019

work page 2019
[4]

A survey on scheduling techniques in computing and n etwork convergence,

S. Tang, Y . Y u, H. Wang, G. Wang, W. Chen, Z. Xu, S. Guo, and W. Gao, “A survey on scheduling techniques in computing and n etwork convergence,” IEEE Communications Surveys & Tutorials , vol. 26, no. 1, pp. 160–195, 2024

work page 2024
[5]

Open problems in queueing theory in spired by datacenter computing,

M. Harchol-Balter, “Open problems in queueing theory in spired by datacenter computing,” Queueing Syst. Theory Appl. , vol. 97, no. 12, pp. 3–37, feb 2021

work page 2021
[6]

Star and rats: Multi-level dis patching poli- cies,

E. Hyytiä and R. Righter, “Star and rats: Multi-level dis patching poli- cies,” in 2020 32nd International Teletrafﬁc Congress (ITC 32) , 2020, pp. 81–89

work page 2020
[7]

Job d ispatch- ing policies for queueing systems with unknown service rate s,

T. Choudhury, G. Joshi, W. Wang, and S. Shakkottai, “Job d ispatch- ing policies for queueing systems with unknown service rate s,” in Proceedings of the Twenty-second International Symposium on Theory, Algorithmic F oundations, and Protocol Design for Mobile Ne tworks and Mobile Computing (MobiHoc ’21) , 2021, pp. 181–190

work page 2021
[8]

Dispatching to parallel servers,

O. Bilenne, “Dispatching to parallel servers,” Queueing Systems, vol. 99, no. 3, pp. 199–230, 2021

work page 2021
[9]

A g eneral ’power-of-d’ dispatching framework for heterogeneous sys tems,

J. A. Jaleel, S. Doroudi, K. Gardner, and A. Wickeham, “A g eneral ’power-of-d’ dispatching framework for heterogeneous sys tems,” Queue- ing Systems , vol. 102, no. 3-4, pp. 431–480, 2022

work page 2022
[10]

cluster-data/Clusterdata2019,

J. Wilkes, “cluster-data/Clusterdata2019,” https://github.com/google/cluster-data/tree/master, 2019

work page 2019
[11]

Borg: the next generatio n,

M. Tirmazi, A. Barker, N. Deng, M. E. Haque, Z. G. Qin, S. H and, M. Harchol-Balter, and J. Wilkes, “Borg: the next generatio n,” in Proceedings of the Fifteenth European Conference on Comput er Systems, ser. EuroSys ’20. New Y ork, NY , USA: Association for Computi ng Machinery, 2020

work page 2020
[12]

Sfakianakis, E

I. Sfakianakis, E. Kanellou, M. Marazakis, and A. Bilas , Trace-Based W orkload Generation and Execution. Springer International Publishing, 08 2021, pp. 37–54

work page 2021
[13]

Data-driven workload gener ation based on google data center measurements,

M. Yildiz and A. Baiocchi, “Data-driven workload gener ation based on google data center measurements,” in 2024 IEEE 25th International Conference on High Performance Switching and Routing (HPSR ), 2024, pp. 143–148

work page 2024
[14]

Task ass ignment in a distributed system: Improving performance by unbalancin g load,

M. Crovella, M. Harchol-Balter, and C. Murta, “Task ass ignment in a distributed system: Improving performance by unbalancin g load,” in Proceedings of the ACM SIGMETRICS Joint International Conf erence on Measurement and Modeling of Computer Systems , 1998, pp. 268–269, poster Session

work page 1998
[15]

Network analysis without exponen tiality assump- tions,

M. Harchol-Balter, “Network analysis without exponen tiality assump- tions,” Ph.D. dissertation, University of California at Be rkeley, 1996

work page 1996
[16]

Exploiting process l ifetime distri- butions for dynamic load balancing,

M. Harchol-Balter and A. Downey, “Exploiting process l ifetime distri- butions for dynamic load balancing,” in Proceedings of ACM SIGMET- RICS, Philadelphia, PA, 1996, pp. 13–24

work page 1996
[17]

The effect of heavy-tailed job siz e distributions on computer system design,

M. Harchol-Balter, “The effect of heavy-tailed job siz e distributions on computer system design,” in Proceedings of ASA-IMS Conference on Applications of Heavy Tailed Distributions in Economics, E ngineering and Statistics , Washington, DC, 1999

work page 1999
[18]

Size- based scheduling to improve web performance,

M. Harchol-Balter, B. Schroeder, N. Bansal, and M. Agra wal, “Size- based scheduling to improve web performance,” ACM Transactions on Computer Systems (TOCS) , vol. 21, no. 2, pp. 207–233, 2003

work page 2003
[19]

The most common queue ing theory questions asked by computer systems practitioners,

M. Harchol-Balter and Z. Scully, “The most common queue ing theory questions asked by computer systems practitioners,” SIGMETRICS Per- form. Eval. Rev. , vol. 49, no. 4, p. 3–7, Jun. 2022

work page 2022
[20]

Numerical performance of approximate que uing formulae with application to ﬂexible manufacturing systems,

W. Marchal, “Numerical performance of approximate que uing formulae with application to ﬂexible manufacturing systems,” Annals of Opera- tions Research, vol. 3, no. 1, p. 141?152, 1985

work page 1985
[21]

The Merit of Simple Policies: Buying Performance With Parallelism and System Architecture

M. Yildiz, A. Rolich, and A. Baiocchi, “The merit of simp le policies: Buying performance with parallelism and system architectu re,” 2025. [Online]. Available: https://arxiv.org/abs/2503.16166

work page internal anchor Pith review Pith/arXiv arXiv 2025

[1] [1]

Harchol-Balter, Performance Modeling and Design of Computer Systems: Queueing Theory in Action

M. Harchol-Balter, Performance Modeling and Design of Computer Systems: Queueing Theory in Action . Cambridge University Press, 2013

work page 2013

[2] [2]

Designing lo w- complexity heavy-trafﬁc delay-optimal load balancing sch emes: Theory to algorithms,

X. Zhou, F. Wu, J. Tan, Y . Sun, and N. Shroff, “Designing lo w- complexity heavy-trafﬁc delay-optimal load balancing sch emes: Theory to algorithms,” Proc. ACM Meas. Anal. Comput. Syst. , vol. 1, no. 2, Dec. 2017

work page 2017

[3] [3]

A comprehensi ve survey for scheduling techniques in cloud computing,

M. Kumar, S. Sharma, A. Goel, and S. Singh, “A comprehensi ve survey for scheduling techniques in cloud computing,” Journal of Network and Computer Applications , vol. 143, pp. 1–33, 2019

work page 2019

[4] [4]

A survey on scheduling techniques in computing and n etwork convergence,

S. Tang, Y . Y u, H. Wang, G. Wang, W. Chen, Z. Xu, S. Guo, and W. Gao, “A survey on scheduling techniques in computing and n etwork convergence,” IEEE Communications Surveys & Tutorials , vol. 26, no. 1, pp. 160–195, 2024

work page 2024

[5] [5]

Open problems in queueing theory in spired by datacenter computing,

M. Harchol-Balter, “Open problems in queueing theory in spired by datacenter computing,” Queueing Syst. Theory Appl. , vol. 97, no. 12, pp. 3–37, feb 2021

work page 2021

[6] [6]

Star and rats: Multi-level dis patching poli- cies,

E. Hyytiä and R. Righter, “Star and rats: Multi-level dis patching poli- cies,” in 2020 32nd International Teletrafﬁc Congress (ITC 32) , 2020, pp. 81–89

work page 2020

[7] [7]

Job d ispatch- ing policies for queueing systems with unknown service rate s,

T. Choudhury, G. Joshi, W. Wang, and S. Shakkottai, “Job d ispatch- ing policies for queueing systems with unknown service rate s,” in Proceedings of the Twenty-second International Symposium on Theory, Algorithmic F oundations, and Protocol Design for Mobile Ne tworks and Mobile Computing (MobiHoc ’21) , 2021, pp. 181–190

work page 2021

[8] [8]

Dispatching to parallel servers,

O. Bilenne, “Dispatching to parallel servers,” Queueing Systems, vol. 99, no. 3, pp. 199–230, 2021

work page 2021

[9] [9]

A g eneral ’power-of-d’ dispatching framework for heterogeneous sys tems,

J. A. Jaleel, S. Doroudi, K. Gardner, and A. Wickeham, “A g eneral ’power-of-d’ dispatching framework for heterogeneous sys tems,” Queue- ing Systems , vol. 102, no. 3-4, pp. 431–480, 2022

work page 2022

[10] [10]

cluster-data/Clusterdata2019,

J. Wilkes, “cluster-data/Clusterdata2019,” https://github.com/google/cluster-data/tree/master, 2019

work page 2019

[11] [11]

Borg: the next generatio n,

M. Tirmazi, A. Barker, N. Deng, M. E. Haque, Z. G. Qin, S. H and, M. Harchol-Balter, and J. Wilkes, “Borg: the next generatio n,” in Proceedings of the Fifteenth European Conference on Comput er Systems, ser. EuroSys ’20. New Y ork, NY , USA: Association for Computi ng Machinery, 2020

work page 2020

[12] [12]

Sfakianakis, E

I. Sfakianakis, E. Kanellou, M. Marazakis, and A. Bilas , Trace-Based W orkload Generation and Execution. Springer International Publishing, 08 2021, pp. 37–54

work page 2021

[13] [13]

Data-driven workload gener ation based on google data center measurements,

M. Yildiz and A. Baiocchi, “Data-driven workload gener ation based on google data center measurements,” in 2024 IEEE 25th International Conference on High Performance Switching and Routing (HPSR ), 2024, pp. 143–148

work page 2024

[14] [14]

Task ass ignment in a distributed system: Improving performance by unbalancin g load,

M. Crovella, M. Harchol-Balter, and C. Murta, “Task ass ignment in a distributed system: Improving performance by unbalancin g load,” in Proceedings of the ACM SIGMETRICS Joint International Conf erence on Measurement and Modeling of Computer Systems , 1998, pp. 268–269, poster Session

work page 1998

[15] [15]

Network analysis without exponen tiality assump- tions,

M. Harchol-Balter, “Network analysis without exponen tiality assump- tions,” Ph.D. dissertation, University of California at Be rkeley, 1996

work page 1996

[16] [16]

Exploiting process l ifetime distri- butions for dynamic load balancing,

M. Harchol-Balter and A. Downey, “Exploiting process l ifetime distri- butions for dynamic load balancing,” in Proceedings of ACM SIGMET- RICS, Philadelphia, PA, 1996, pp. 13–24

work page 1996

[17] [17]

The effect of heavy-tailed job siz e distributions on computer system design,

M. Harchol-Balter, “The effect of heavy-tailed job siz e distributions on computer system design,” in Proceedings of ASA-IMS Conference on Applications of Heavy Tailed Distributions in Economics, E ngineering and Statistics , Washington, DC, 1999

work page 1999

[18] [18]

Size- based scheduling to improve web performance,

M. Harchol-Balter, B. Schroeder, N. Bansal, and M. Agra wal, “Size- based scheduling to improve web performance,” ACM Transactions on Computer Systems (TOCS) , vol. 21, no. 2, pp. 207–233, 2003

work page 2003

[19] [19]

The most common queue ing theory questions asked by computer systems practitioners,

M. Harchol-Balter and Z. Scully, “The most common queue ing theory questions asked by computer systems practitioners,” SIGMETRICS Per- form. Eval. Rev. , vol. 49, no. 4, p. 3–7, Jun. 2022

work page 2022

[20] [20]

Numerical performance of approximate que uing formulae with application to ﬂexible manufacturing systems,

W. Marchal, “Numerical performance of approximate que uing formulae with application to ﬂexible manufacturing systems,” Annals of Opera- tions Research, vol. 3, no. 1, p. 141?152, 1985

work page 1985

[21] [21]

The Merit of Simple Policies: Buying Performance With Parallelism and System Architecture

M. Yildiz, A. Rolich, and A. Baiocchi, “The merit of simp le policies: Buying performance with parallelism and system architectu re,” 2025. [Online]. Available: https://arxiv.org/abs/2503.16166

work page internal anchor Pith review Pith/arXiv arXiv 2025