Rank Based Routing in Large Server Systems under Extreme Congestion
Pith reviewed 2026-05-19 22:01 UTC · model grok-4.3
The pith
Marginal join-the-shortest-queue routing drives ranked queue gaps to a reflected Atlas process with explicit product-form stationary laws under extreme heavy traffic.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under diffusive scaling the ranked queue lengths and associated gap process converge to an infinite-dimensional reflected Atlas process with reflection at the origin and rank-based drift acting on the lowest particle; this limit admits a one-parameter family of product-form stationary gap distributions parametrized by a and b. The connection is established by showing that an auxiliary system with pauses agrees with the original dynamics at diffusion scale, admits an exact open Jackson network representation, and that the heavy-traffic limits of its stationary gap distributions are precisely the product-form invariants of the reflected Atlas process.
What carries the argument
The infinite-dimensional reflected Atlas process, which tracks the ordered queue lengths with reflection at zero and rank-dependent drift applied only to the lowest particle.
If this is right
- The limiting dynamics depend only on the shortest-queue arrival rate b; the parameter a enters solely through the choice of stationary distribution.
- Explicit product-form stationary laws for the gaps yield sharp asymptotics for the length of the shortest queues, the system-wide imbalance, and the average queue length.
- The policy achieves a quantifiable tradeoff between reduced communication (only a vanishing fraction of jobs needs shortest-queue information) and improved load balance relative to random routing.
- The same Atlas limit and stationary family apply to any routing scheme whose diffusion-scale behavior matches the marginal shortest-queue stream of rate b.
Where Pith is reading between the lines
- The construction suggests that similar reflected Atlas limits and product-form gaps may appear under other partial-information policies that preserve a positive shortest-queue arrival rate at diffusion scale.
- Because the stationary gaps factorize, tail probabilities and moments for individual ranked queues become directly computable without solving the full infinite-dimensional system.
- The approach of introducing an exactly solvable auxiliary process may extend to other heavy-traffic models where the original dynamics lack closed-form stationary distributions.
Load-bearing premise
The auxiliary system with pauses agrees with the original queueing dynamics at diffusion scale and admits an exact open Jackson network representation whose stationary distributions select the product-form invariants when taken to heavy traffic.
What would settle it
Numerical simulation of the finite-n system for successively larger n that checks whether the empirical distribution of the scaled gaps between ordered queue lengths approaches the explicit product-form density parametrized by a and b.
read the original abstract
We study $n$ parallel queues in an extreme heavy-traffic regime: each server works at rate $n$, while jobs arrive to a dispatcher at rate $n^2-(a-b)\sqrt{n}$, with fixed $a>b>0$. Arrivals are routed by a marginal join-the-shortest-queue policy: a small stream of rate $b\sqrt{n}$ joins the current shortest queue, while the remaining stream of rate $n^2-a\sqrt{n}$ is routed uniformly at random. This policy greatly reduces communication cost relative to full JSQ, while improving load balancing and offering a natural mechanism for premium jobs to join shorter queues. Under diffusive scaling, we prove limit theorems for the ranked queue lengths and associated gap process. The limit is an infinite-dimensional reflected Atlas process, with reflection at the origin and rank-based drift acting on the lowest particle. Its dynamics depend only on $b$, the shortest-queue arrival rate, while $a$ enters through the choice of invariant distribution. We prove well-posedness of this reflected infinite Atlas model and characterize a one-parameter family of product-form stationary gap distributions, parametrized by $a$ and $b$. To connect the diffusion limit with the stationary behavior of the queueing system, we introduce a related "system with pauses'' that agrees with the original dynamics at diffusion scale but admits an exact open Jackson network representation. This yields explicit finite-$n$ stationary gap distributions, whose heavy-traffic limits select the corresponding product-form invariant laws of the infinite reflected Atlas process. As consequences, we obtain sharp asymptotics for the lowest-ranked queues, system imbalance, and average queue length, quantifying the tradeoff between communication cost and load-balancing performance relative to random routing and full join-the-shortest-queue policies.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies n parallel queues under extreme heavy traffic with arrival rate n² - (a-b)√n, where a marginal JSQ policy routes a rate-b√n stream to the shortest queue and the remainder uniformly at random. Under diffusive scaling it proves convergence of the ranked queue lengths and gap process to an infinite-dimensional reflected Atlas process with reflection at zero and rank-based drift on the lowest particle. The limit admits a one-parameter family of product-form stationary gap distributions parametrized by a and b. An auxiliary 'system with pauses' is introduced that is asserted to match the original dynamics at diffusion scale while being exactly representable as an open Jackson network; its explicit finite-n stationary distributions are shown to converge to the claimed Atlas invariants, yielding sharp asymptotics for lowest queues, imbalance, and average length.
Significance. If the results hold, the work contributes to heavy-traffic queueing theory by analyzing a low-communication routing policy that improves on random routing while remaining cheaper than full JSQ. The identification of the diffusion limit as a reflected Atlas process and the explicit product-form stationary measures for the gap process are technically notable. The pause-system construction that delivers exact finite-n stationary distributions whose heavy-traffic limits select the invariants is a clever device that enables concrete performance asymptotics and quantifies the communication-load-balancing tradeoff.
major comments (3)
- [Section 4 (pause-system construction)] The central transfer of stationary distributions relies on the claim that the auxiliary 'system with pauses' agrees with the original rank-based routing at diffusion scale (abstract and the construction used to obtain the Jackson-network stationary measures). A detailed verification is required that the generator (or pathwise coupling) of the scaled gap process coincides with that of the reflected Atlas process, specifically that pauses do not alter the lowest-particle drift or the reflection mechanism by terms that survive the limit; without this, the explicit finite-n gaps need not converge to the claimed product-form invariants.
- [Theorem 3.1 / Section 3] Well-posedness of the infinite-dimensional reflected Atlas process (Theorem on existence/uniqueness of the limit) must address the infinite-particle Skorokhod reflection and the domain of the generator; the current argument leaves open whether tightness and uniqueness hold uniformly in the ranked coordinates when the drift acts only on the lowest particle.
- [Section 5 (stationary distributions)] The selection of the specific member of the one-parameter family of product-form invariants by the heavy-traffic limit of the Jackson-network stationary distribution (Section 5) requires an explicit computation showing how the parameters a and b arise from the pause rates and the routing split; the current derivation appears to assume rather than derive the precise matching of the gap-process marginals.
minor comments (3)
- [Introduction / Section 2] The definition of the gap process and its relation to the ranked queue lengths should be stated explicitly with an equation in the introduction or Section 2 to avoid ambiguity when reading the limit theorems.
- [Section 3] Notation for the infinite Atlas process (e.g., the precise form of the rank-based drift vector) is introduced late; moving a compact definition to the model section would improve readability.
- [Introduction] A brief comparison paragraph with existing Atlas-process literature (e.g., Harris, Pal, etc.) would help situate the reflected infinite-dimensional variant.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment point by point below, indicating the revisions we will make to strengthen the presentation and proofs.
read point-by-point responses
-
Referee: [Section 4 (pause-system construction)] The central transfer of stationary distributions relies on the claim that the auxiliary 'system with pauses' agrees with the original rank-based routing at diffusion scale (abstract and the construction used to obtain the Jackson-network stationary measures). A detailed verification is required that the generator (or pathwise coupling) of the scaled gap process coincides with that of the reflected Atlas process, specifically that pauses do not alter the lowest-particle drift or the reflection mechanism by terms that survive the limit; without this, the explicit finite-n gaps need not converge to the claimed product-form invariants.
Authors: We thank the referee for highlighting this point. The pause system is constructed so that pauses occur only when the routing decision would otherwise be ambiguous at the diffusion scale, and we argue that their effect vanishes in the limit. However, we agree that a more detailed verification is needed. In the revised manuscript we will add a new subsection in Section 4 that explicitly compares the generators of the scaled gap processes. We will derive bounds showing that the perturbation to the lowest-particle drift and to the reflection terms is of order o(1) uniformly on compact sets, and we will supply a pathwise coupling under which the two processes differ by a process whose quadratic variation vanishes in the limit. These additions will make the transfer of stationary distributions fully rigorous. revision: yes
-
Referee: [Theorem 3.1 / Section 3] Well-posedness of the infinite-dimensional reflected Atlas process (Theorem on existence/uniqueness of the limit) must address the infinite-particle Skorokhod reflection and the domain of the generator; the current argument leaves open whether tightness and uniqueness hold uniformly in the ranked coordinates when the drift acts only on the lowest particle.
Authors: We appreciate the referee's observation. Theorem 3.1 currently proceeds via finite-dimensional approximations and weak convergence, but the infinite-dimensional Skorokhod reflection and generator domain are treated only implicitly. In the revision we will expand the proof of Theorem 3.1 to include: (i) a precise description of the domain of the generator for the infinite-particle reflected Atlas process, (ii) a uniform tightness argument that controls all ranked coordinates simultaneously by exploiting the rank-based drift structure, and (iii) a reference to existing results on reflected rank-based particle systems to establish uniqueness. These clarifications will close the gaps noted by the referee. revision: yes
-
Referee: [Section 5 (stationary distributions)] The selection of the specific member of the one-parameter family of product-form invariants by the heavy-traffic limit of the Jackson-network stationary distribution (Section 5) requires an explicit computation showing how the parameters a and b arise from the pause rates and the routing split; the current derivation appears to assume rather than derive the precise matching of the gap-process marginals.
Authors: We agree that the matching of parameters a and b should be derived explicitly rather than asserted. In the revised Section 5 we will insert a detailed calculation that starts from the Jackson-network balance equations for the pause system, substitutes the explicit pause rates chosen to reproduce the marginal JSQ split (rate b√n to the shortest queue and the remainder random), and shows that the resulting product-form gap distribution converges to the unique member of the one-parameter family whose parameters are exactly a and b. The computation will be carried out first for finite n and then passed to the limit, thereby deriving rather than assuming the precise identification. revision: yes
Circularity Check
No circularity: limit theorems and auxiliary construction are independent of fitted inputs or self-definitions
full rationale
The derivation proceeds via standard heavy-traffic scaling arguments to obtain convergence of the ranked queue lengths and gap process to a reflected infinite-dimensional Atlas process, followed by a separate well-posedness analysis and characterization of its product-form stationary measures. The auxiliary 'system with pauses' is introduced explicitly as a new construction that matches the original dynamics at diffusive scale while permitting an exact Jackson-network representation; the paper then takes its known stationary distributions to the heavy-traffic limit to select the Atlas invariants. No equation or claim reduces by construction to a parameter fitted from the target quantity, nor does any load-bearing step rest on a self-citation whose content is itself unverified. The central results are therefore self-contained mathematical statements.
Axiom & Free-Parameter Ledger
free parameters (2)
- a
- b
axioms (2)
- domain assumption Well-posedness of the infinite-dimensional reflected Atlas process
- domain assumption Equivalence of the pause system to the original dynamics at diffusion scale
invented entities (1)
-
reflected infinite Atlas process
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Rami Atar,A diffusion regime with nondegenerate slowdown, Operations Research60(2012), no. 2, 490–500
work page 2012
-
[2]
Rami Atar and Tomoyuki Ichiba,Rank-based stochastic differential inclusions and diffusion limits for a load-balancing model, Mathematics of Operations Research (2025)
work page 2025
-
[3]
Rami Atar and Gershon Wolansky,Invariance principle and Mckean–Vlasov limit for ran- domized load balancing in heavy traffic, The Annals of Applied Probability35(2025), no. 5, 3046–3085
work page 2025
-
[4]
Sayan Banerjee and Amarjit Budhiraja,Domains of attraction of invariant distributions of the infinite Atlas model, Annals of Probability50(2022), no. 6, 2286–2339
work page 2022
- [5]
-
[6]
Sayan Banerjee, Amarjit Budhiraja, and Benjamin Estevez,Load balancing in parallel queues and rank-based diffusions, Mathematics of Operations Research (2025)
work page 2025
-
[7]
Sayan Banerjee and Debankur Mukherjee,Join-the-shortest queue diffusion limit in halfin–whitt regime: Tail asymptotics and scaling of extrema, The Annals of Applied Prob- ability29(2019), no. 2, 1262–1309
work page 2019
- [8]
-
[9]
Shankar Bhamidi, Amarjit Budhiraja, and Miheer Dewaskar,Near equilibrium fluctuations for supermarket models with growing choices, The Annals of Applied Probability32(2022), no. 3, 2083–2138
work page 2022
-
[10]
Maury Bramson, Yi Lu, and Balaji Prabhakar,Asymptotic independence of queues under randomized load balancing, Queueing Systems71(2012), no. 3, 247–292. 34
work page 2012
-
[11]
Anton Braverman,Steady-state analysis of the join-the-shortest-queue model in the halfin–whitt regime, Mathematics of Operations Research45(2020), no. 3, 1069–1103
work page 2020
-
[12]
Amir Dembo, Milton Jara, and Stefano Olla,Equilibrium fluctuations of the Atlas model, Annals of Probability47(2019), no. 2, 565–603
work page 2019
-
[13]
the heavy-traffic asymptotics, Mathematics of Operations Research43(2018), no
Patrick Eschenfeldt and David Gamarnik,Join the shortest queue with many servers. the heavy-traffic asymptotics, Mathematics of Operations Research43(2018), no. 3, 867–886
work page 2018
-
[14]
Robert Fernholz and Ioannis Karatzas,Stochastic portfolio theory: an overview, Handbook of numerical analysis15(2009), 89–167
work page 2009
-
[15]
Varun Gupta, Mor Harchol-Balter, Karl Sigman, and Ward Whitt,Analysis of join-the- shortest-queue routing for web server farms, Performance Evaluation64(2007), no. 9–12, 1062–1081
work page 2007
-
[16]
Varun Gupta and Neil Walton,Load balancing in the nondegenerate slowdown regime, Oper- ations Research67(2019), no. 1, 281–294
work page 2019
-
[17]
Shlomo Halfin and Ward Whitt,Heavy-traffic limits for queues with many exponential servers, Operations Research29(1981), no. 3, 567–588
work page 1981
-
[18]
J. Michael Harrison and Martin I. Reiman,Reflected brownian motion on an orthant, The Annals of Probability9(1981), no. 2, 302–308
work page 1981
-
[19]
Daniela Hurtado-Lange and Siva Theja Maguluri,A load balancing system in the many-server heavy-traffic asymptotics, Queueing Systems101(2022), no. 3, 353–391
work page 2022
-
[20]
Sanjay Joshi and Usha Kumari,Load balancing in cloud computing: Challenges & issues, 2016 2nd International Conference on Contemporary Computing and Informatics (IC3I) (Piscat- away, NJ), IEEE, 2016, pp. 120–125
work page 2016
-
[21]
F. P. Kelly,Reversibility and stochastic networks, Advances in Applied Probability11(1979), no. 2, 275–295
work page 1979
-
[22]
Xin Liu and Lei Ying,Steady-state analysis of load-balancing algorithms in the sub-halfin–whitt regime, Journal of Applied Probability57(2020), no. 2, 578–596
work page 2020
-
[23]
,Universal scaling of distributed queues under load balancing in the super-halfin-whitt regime, IEEE/ACM Transactions on Networking30(2021), no. 1, 190–201
work page 2021
-
[24]
Debankur Mukherjee, Sem Borst, Johan S. H. van Leeuwaarden, and Philip A. Whiting, Universality of load balancing schemes on the diffusion scale, Journal of Applied Probability 53(2016), no. 4, 1111–1124
work page 2016
-
[25]
Debankur Mukherjee, Sem C. Borst, Johan S. H. van Leeuwaarden, and Philip A. Whiting, Scalable load balancing in networked systems: A survey, SIAM Review62(2020), no. 2, 375– 439
work page 2020
- [26]
- [27]
-
[28]
Andrey Sarantsev,Infinite systems of competing brownian particles, Annales de l’Institut Henri Poincar´ e, Probabilit´ es et Statistiques53(2017), no. 4, 2279–2315
work page 2017
-
[29]
,Comparison techniques for competing brownian particles, Journal of Theoretical Prob- ability32(2019), no. 2, 545–585
work page 2019
-
[30]
Andrey Sarantsev and Li-Cheng Tsai,Stationary gap distributions for infinite systems of com- peting Brownian particles, Electronic Journal of Probability22(2017), no. none, 1 – 20
work page 2017
-
[31]
Zhisheng Zhao, Sayan Banerjee, and Debankur Mukherjee,Many-server asymptotics for join- the-shortest-queue in the super-halfin-whitt scaling window, Mathematics of Operations Re- search (2025). S. Banerjee, A. Budhiraja and E. Loeser, Department of Statistics and Operations Research University of North Carolina Chapel Hill, NC 27599, USA emails: sayan@ema...
work page 2025
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.