pith. sign in

arxiv: 2605.17677 · v1 · pith:JRKA7AVZnew · submitted 2026-05-17 · 🧮 math.PR

Rank Based Routing in Large Server Systems under Extreme Congestion

Pith reviewed 2026-05-19 22:01 UTC · model grok-4.3

classification 🧮 math.PR
keywords heavy traffic limitsdiffusion approximationjoin-the-shortest-queuereflected Atlas processproduct-form stationary distributionslarge server systemsqueueing networksrank-based routing
0
0 comments X

The pith

Marginal join-the-shortest-queue routing drives ranked queue gaps to a reflected Atlas process with explicit product-form stationary laws under extreme heavy traffic.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper considers n parallel servers each working at rate n, with total arrivals at rate n squared minus (a minus b) times square root n, where a exceeds b. Jobs are routed by sending a small stream of rate b times square root n always to the current shortest queue and routing the remainder uniformly at random. Under diffusive scaling the ordered queue lengths and the gaps between them converge to an infinite-dimensional reflected Atlas process whose dynamics depend only on b while the stationary gap distribution is chosen from a one-parameter family indexed by a and b. An auxiliary system with pauses is introduced that matches the original dynamics at diffusion scale yet possesses exact open Jackson network stationary distributions; these distributions, when scaled, select the product-form invariants of the Atlas process. This yields sharp asymptotics for the lowest queues, overall imbalance, and mean queue length that quantify the performance-communication tradeoff relative to pure random routing and full join-the-shortest-queue.

Core claim

Under diffusive scaling the ranked queue lengths and associated gap process converge to an infinite-dimensional reflected Atlas process with reflection at the origin and rank-based drift acting on the lowest particle; this limit admits a one-parameter family of product-form stationary gap distributions parametrized by a and b. The connection is established by showing that an auxiliary system with pauses agrees with the original dynamics at diffusion scale, admits an exact open Jackson network representation, and that the heavy-traffic limits of its stationary gap distributions are precisely the product-form invariants of the reflected Atlas process.

What carries the argument

The infinite-dimensional reflected Atlas process, which tracks the ordered queue lengths with reflection at zero and rank-dependent drift applied only to the lowest particle.

If this is right

  • The limiting dynamics depend only on the shortest-queue arrival rate b; the parameter a enters solely through the choice of stationary distribution.
  • Explicit product-form stationary laws for the gaps yield sharp asymptotics for the length of the shortest queues, the system-wide imbalance, and the average queue length.
  • The policy achieves a quantifiable tradeoff between reduced communication (only a vanishing fraction of jobs needs shortest-queue information) and improved load balance relative to random routing.
  • The same Atlas limit and stationary family apply to any routing scheme whose diffusion-scale behavior matches the marginal shortest-queue stream of rate b.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The construction suggests that similar reflected Atlas limits and product-form gaps may appear under other partial-information policies that preserve a positive shortest-queue arrival rate at diffusion scale.
  • Because the stationary gaps factorize, tail probabilities and moments for individual ranked queues become directly computable without solving the full infinite-dimensional system.
  • The approach of introducing an exactly solvable auxiliary process may extend to other heavy-traffic models where the original dynamics lack closed-form stationary distributions.

Load-bearing premise

The auxiliary system with pauses agrees with the original queueing dynamics at diffusion scale and admits an exact open Jackson network representation whose stationary distributions select the product-form invariants when taken to heavy traffic.

What would settle it

Numerical simulation of the finite-n system for successively larger n that checks whether the empirical distribution of the scaled gaps between ordered queue lengths approaches the explicit product-form density parametrized by a and b.

read the original abstract

We study $n$ parallel queues in an extreme heavy-traffic regime: each server works at rate $n$, while jobs arrive to a dispatcher at rate $n^2-(a-b)\sqrt{n}$, with fixed $a>b>0$. Arrivals are routed by a marginal join-the-shortest-queue policy: a small stream of rate $b\sqrt{n}$ joins the current shortest queue, while the remaining stream of rate $n^2-a\sqrt{n}$ is routed uniformly at random. This policy greatly reduces communication cost relative to full JSQ, while improving load balancing and offering a natural mechanism for premium jobs to join shorter queues. Under diffusive scaling, we prove limit theorems for the ranked queue lengths and associated gap process. The limit is an infinite-dimensional reflected Atlas process, with reflection at the origin and rank-based drift acting on the lowest particle. Its dynamics depend only on $b$, the shortest-queue arrival rate, while $a$ enters through the choice of invariant distribution. We prove well-posedness of this reflected infinite Atlas model and characterize a one-parameter family of product-form stationary gap distributions, parametrized by $a$ and $b$. To connect the diffusion limit with the stationary behavior of the queueing system, we introduce a related "system with pauses'' that agrees with the original dynamics at diffusion scale but admits an exact open Jackson network representation. This yields explicit finite-$n$ stationary gap distributions, whose heavy-traffic limits select the corresponding product-form invariant laws of the infinite reflected Atlas process. As consequences, we obtain sharp asymptotics for the lowest-ranked queues, system imbalance, and average queue length, quantifying the tradeoff between communication cost and load-balancing performance relative to random routing and full join-the-shortest-queue policies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The paper studies n parallel queues under extreme heavy traffic with arrival rate n² - (a-b)√n, where a marginal JSQ policy routes a rate-b√n stream to the shortest queue and the remainder uniformly at random. Under diffusive scaling it proves convergence of the ranked queue lengths and gap process to an infinite-dimensional reflected Atlas process with reflection at zero and rank-based drift on the lowest particle. The limit admits a one-parameter family of product-form stationary gap distributions parametrized by a and b. An auxiliary 'system with pauses' is introduced that is asserted to match the original dynamics at diffusion scale while being exactly representable as an open Jackson network; its explicit finite-n stationary distributions are shown to converge to the claimed Atlas invariants, yielding sharp asymptotics for lowest queues, imbalance, and average length.

Significance. If the results hold, the work contributes to heavy-traffic queueing theory by analyzing a low-communication routing policy that improves on random routing while remaining cheaper than full JSQ. The identification of the diffusion limit as a reflected Atlas process and the explicit product-form stationary measures for the gap process are technically notable. The pause-system construction that delivers exact finite-n stationary distributions whose heavy-traffic limits select the invariants is a clever device that enables concrete performance asymptotics and quantifies the communication-load-balancing tradeoff.

major comments (3)
  1. [Section 4 (pause-system construction)] The central transfer of stationary distributions relies on the claim that the auxiliary 'system with pauses' agrees with the original rank-based routing at diffusion scale (abstract and the construction used to obtain the Jackson-network stationary measures). A detailed verification is required that the generator (or pathwise coupling) of the scaled gap process coincides with that of the reflected Atlas process, specifically that pauses do not alter the lowest-particle drift or the reflection mechanism by terms that survive the limit; without this, the explicit finite-n gaps need not converge to the claimed product-form invariants.
  2. [Theorem 3.1 / Section 3] Well-posedness of the infinite-dimensional reflected Atlas process (Theorem on existence/uniqueness of the limit) must address the infinite-particle Skorokhod reflection and the domain of the generator; the current argument leaves open whether tightness and uniqueness hold uniformly in the ranked coordinates when the drift acts only on the lowest particle.
  3. [Section 5 (stationary distributions)] The selection of the specific member of the one-parameter family of product-form invariants by the heavy-traffic limit of the Jackson-network stationary distribution (Section 5) requires an explicit computation showing how the parameters a and b arise from the pause rates and the routing split; the current derivation appears to assume rather than derive the precise matching of the gap-process marginals.
minor comments (3)
  1. [Introduction / Section 2] The definition of the gap process and its relation to the ranked queue lengths should be stated explicitly with an equation in the introduction or Section 2 to avoid ambiguity when reading the limit theorems.
  2. [Section 3] Notation for the infinite Atlas process (e.g., the precise form of the rank-based drift vector) is introduced late; moving a compact definition to the model section would improve readability.
  3. [Introduction] A brief comparison paragraph with existing Atlas-process literature (e.g., Harris, Pal, etc.) would help situate the reflected infinite-dimensional variant.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment point by point below, indicating the revisions we will make to strengthen the presentation and proofs.

read point-by-point responses
  1. Referee: [Section 4 (pause-system construction)] The central transfer of stationary distributions relies on the claim that the auxiliary 'system with pauses' agrees with the original rank-based routing at diffusion scale (abstract and the construction used to obtain the Jackson-network stationary measures). A detailed verification is required that the generator (or pathwise coupling) of the scaled gap process coincides with that of the reflected Atlas process, specifically that pauses do not alter the lowest-particle drift or the reflection mechanism by terms that survive the limit; without this, the explicit finite-n gaps need not converge to the claimed product-form invariants.

    Authors: We thank the referee for highlighting this point. The pause system is constructed so that pauses occur only when the routing decision would otherwise be ambiguous at the diffusion scale, and we argue that their effect vanishes in the limit. However, we agree that a more detailed verification is needed. In the revised manuscript we will add a new subsection in Section 4 that explicitly compares the generators of the scaled gap processes. We will derive bounds showing that the perturbation to the lowest-particle drift and to the reflection terms is of order o(1) uniformly on compact sets, and we will supply a pathwise coupling under which the two processes differ by a process whose quadratic variation vanishes in the limit. These additions will make the transfer of stationary distributions fully rigorous. revision: yes

  2. Referee: [Theorem 3.1 / Section 3] Well-posedness of the infinite-dimensional reflected Atlas process (Theorem on existence/uniqueness of the limit) must address the infinite-particle Skorokhod reflection and the domain of the generator; the current argument leaves open whether tightness and uniqueness hold uniformly in the ranked coordinates when the drift acts only on the lowest particle.

    Authors: We appreciate the referee's observation. Theorem 3.1 currently proceeds via finite-dimensional approximations and weak convergence, but the infinite-dimensional Skorokhod reflection and generator domain are treated only implicitly. In the revision we will expand the proof of Theorem 3.1 to include: (i) a precise description of the domain of the generator for the infinite-particle reflected Atlas process, (ii) a uniform tightness argument that controls all ranked coordinates simultaneously by exploiting the rank-based drift structure, and (iii) a reference to existing results on reflected rank-based particle systems to establish uniqueness. These clarifications will close the gaps noted by the referee. revision: yes

  3. Referee: [Section 5 (stationary distributions)] The selection of the specific member of the one-parameter family of product-form invariants by the heavy-traffic limit of the Jackson-network stationary distribution (Section 5) requires an explicit computation showing how the parameters a and b arise from the pause rates and the routing split; the current derivation appears to assume rather than derive the precise matching of the gap-process marginals.

    Authors: We agree that the matching of parameters a and b should be derived explicitly rather than asserted. In the revised Section 5 we will insert a detailed calculation that starts from the Jackson-network balance equations for the pause system, substitutes the explicit pause rates chosen to reproduce the marginal JSQ split (rate b√n to the shortest queue and the remainder random), and shows that the resulting product-form gap distribution converges to the unique member of the one-parameter family whose parameters are exactly a and b. The computation will be carried out first for finite n and then passed to the limit, thereby deriving rather than assuming the precise identification. revision: yes

Circularity Check

0 steps flagged

No circularity: limit theorems and auxiliary construction are independent of fitted inputs or self-definitions

full rationale

The derivation proceeds via standard heavy-traffic scaling arguments to obtain convergence of the ranked queue lengths and gap process to a reflected infinite-dimensional Atlas process, followed by a separate well-posedness analysis and characterization of its product-form stationary measures. The auxiliary 'system with pauses' is introduced explicitly as a new construction that matches the original dynamics at diffusive scale while permitting an exact Jackson-network representation; the paper then takes its known stationary distributions to the heavy-traffic limit to select the Atlas invariants. No equation or claim reduces by construction to a parameter fitted from the target quantity, nor does any load-bearing step rest on a self-citation whose content is itself unverified. The central results are therefore self-contained mathematical statements.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 1 invented entities

The model rests on standard existence results for reflected diffusions and on the construction of the auxiliary pause system; a and b are fixed parameters rather than fitted constants.

free parameters (2)
  • a
    Controls the overall arrival-rate deviation from criticality; enters the choice of invariant distribution.
  • b
    Controls the shortest-queue arrival rate; governs the rank-based drift in the limit process.
axioms (2)
  • domain assumption Well-posedness of the infinite-dimensional reflected Atlas process
    Invoked to guarantee unique strong solutions before stationary distributions are characterized.
  • domain assumption Equivalence of the pause system to the original dynamics at diffusion scale
    Required to transfer exact Jackson-network stationary distributions to the heavy-traffic limit.
invented entities (1)
  • reflected infinite Atlas process no independent evidence
    purpose: Diffusion limit of the ranked queue-length vector under the marginal routing policy
    New process introduced to capture rank-based drift and reflection; no independent empirical evidence supplied.

pith-pipeline@v0.9.0 · 5856 in / 1647 out tokens · 47530 ms · 2026-05-19T22:01:47.427680+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages

  1. [1]

    2, 490–500

    Rami Atar,A diffusion regime with nondegenerate slowdown, Operations Research60(2012), no. 2, 490–500

  2. [2]

    Rami Atar and Tomoyuki Ichiba,Rank-based stochastic differential inclusions and diffusion limits for a load-balancing model, Mathematics of Operations Research (2025)

  3. [3]

    5, 3046–3085

    Rami Atar and Gershon Wolansky,Invariance principle and Mckean–Vlasov limit for ran- domized load balancing in heavy traffic, The Annals of Applied Probability35(2025), no. 5, 3046–3085

  4. [4]

    6, 2286–2339

    Sayan Banerjee and Amarjit Budhiraja,Domains of attraction of invariant distributions of the infinite Atlas model, Annals of Probability50(2022), no. 6, 2286–2339

  5. [5]

    1, 79–117

    Sayan Banerjee and Amarjit Budhiraja,Extremal invariant distributions of infinite brownian particle systems with rank dependent drifts, Probability Theory and Related Fields190(2024), no. 1, 79–117

  6. [6]

    Sayan Banerjee, Amarjit Budhiraja, and Benjamin Estevez,Load balancing in parallel queues and rank-based diffusions, Mathematics of Operations Research (2025)

  7. [7]

    2, 1262–1309

    Sayan Banerjee and Debankur Mukherjee,Join-the-shortest queue diffusion limit in halfin–whitt regime: Tail asymptotics and scaling of extrema, The Annals of Applied Prob- ability29(2019), no. 2, 1262–1309

  8. [8]

    1, 80–144

    ,Join-the-shortest queue diffusion limit in halfin–whitt regime: Sensitivity on the heavy- traffic parameter, The Annals of Applied Probability30(2020), no. 1, 80–144

  9. [9]

    3, 2083–2138

    Shankar Bhamidi, Amarjit Budhiraja, and Miheer Dewaskar,Near equilibrium fluctuations for supermarket models with growing choices, The Annals of Applied Probability32(2022), no. 3, 2083–2138

  10. [10]

    3, 247–292

    Maury Bramson, Yi Lu, and Balaji Prabhakar,Asymptotic independence of queues under randomized load balancing, Queueing Systems71(2012), no. 3, 247–292. 34

  11. [11]

    3, 1069–1103

    Anton Braverman,Steady-state analysis of the join-the-shortest-queue model in the halfin–whitt regime, Mathematics of Operations Research45(2020), no. 3, 1069–1103

  12. [12]

    2, 565–603

    Amir Dembo, Milton Jara, and Stefano Olla,Equilibrium fluctuations of the Atlas model, Annals of Probability47(2019), no. 2, 565–603

  13. [13]

    the heavy-traffic asymptotics, Mathematics of Operations Research43(2018), no

    Patrick Eschenfeldt and David Gamarnik,Join the shortest queue with many servers. the heavy-traffic asymptotics, Mathematics of Operations Research43(2018), no. 3, 867–886

  14. [14]

    Robert Fernholz and Ioannis Karatzas,Stochastic portfolio theory: an overview, Handbook of numerical analysis15(2009), 89–167

  15. [15]

    9–12, 1062–1081

    Varun Gupta, Mor Harchol-Balter, Karl Sigman, and Ward Whitt,Analysis of join-the- shortest-queue routing for web server farms, Performance Evaluation64(2007), no. 9–12, 1062–1081

  16. [16]

    1, 281–294

    Varun Gupta and Neil Walton,Load balancing in the nondegenerate slowdown regime, Oper- ations Research67(2019), no. 1, 281–294

  17. [17]

    3, 567–588

    Shlomo Halfin and Ward Whitt,Heavy-traffic limits for queues with many exponential servers, Operations Research29(1981), no. 3, 567–588

  18. [18]

    Michael Harrison and Martin I

    J. Michael Harrison and Martin I. Reiman,Reflected brownian motion on an orthant, The Annals of Probability9(1981), no. 2, 302–308

  19. [19]

    3, 353–391

    Daniela Hurtado-Lange and Siva Theja Maguluri,A load balancing system in the many-server heavy-traffic asymptotics, Queueing Systems101(2022), no. 3, 353–391

  20. [20]

    Sanjay Joshi and Usha Kumari,Load balancing in cloud computing: Challenges & issues, 2016 2nd International Conference on Contemporary Computing and Informatics (IC3I) (Piscat- away, NJ), IEEE, 2016, pp. 120–125

  21. [21]

    F. P. Kelly,Reversibility and stochastic networks, Advances in Applied Probability11(1979), no. 2, 275–295

  22. [22]

    2, 578–596

    Xin Liu and Lei Ying,Steady-state analysis of load-balancing algorithms in the sub-halfin–whitt regime, Journal of Applied Probability57(2020), no. 2, 578–596

  23. [23]

    1, 190–201

    ,Universal scaling of distributed queues under load balancing in the super-halfin-whitt regime, IEEE/ACM Transactions on Networking30(2021), no. 1, 190–201

  24. [24]

    Debankur Mukherjee, Sem Borst, Johan S. H. van Leeuwaarden, and Philip A. Whiting, Universality of load balancing schemes on the diffusion scale, Journal of Applied Probability 53(2016), no. 4, 1111–1124

  25. [25]

    Borst, Johan S

    Debankur Mukherjee, Sem C. Borst, Johan S. H. van Leeuwaarden, and Philip A. Whiting, Scalable load balancing in networked systems: A survey, SIAM Review62(2020), no. 2, 375– 439

  26. [26]

    Pal and J

    S. Pal and J. Pitman,One-Dimensional Brownian Particle Systems with Rank-Dependent Drifts, Annals of Applied Probability (2008), no. 6, 2179–2207. 35

  27. [27]

    4, 18–19

    Prakirt Raj Jhunjhunwala, Daniela Hurtado-Lange, and Siva Theja Maguluri,Exponential tail bounds on queues: A confluence of non-asymptotic heavy traffic and large deviations, ACM SIGMETRICS Performance Evaluation Review51(2024), no. 4, 18–19

  28. [28]

    4, 2279–2315

    Andrey Sarantsev,Infinite systems of competing brownian particles, Annales de l’Institut Henri Poincar´ e, Probabilit´ es et Statistiques53(2017), no. 4, 2279–2315

  29. [29]

    2, 545–585

    ,Comparison techniques for competing brownian particles, Journal of Theoretical Prob- ability32(2019), no. 2, 545–585

  30. [30]

    none, 1 – 20

    Andrey Sarantsev and Li-Cheng Tsai,Stationary gap distributions for infinite systems of com- peting Brownian particles, Electronic Journal of Probability22(2017), no. none, 1 – 20

  31. [31]

    Zhisheng Zhao, Sayan Banerjee, and Debankur Mukherjee,Many-server asymptotics for join- the-shortest-queue in the super-halfin-whitt scaling window, Mathematics of Operations Re- search (2025). S. Banerjee, A. Budhiraja and E. Loeser, Department of Statistics and Operations Research University of North Carolina Chapel Hill, NC 27599, USA emails: sayan@ema...