pith. sign in

arxiv: 2604.15261 · v3 · pith:BXLXWU4Tnew · submitted 2026-04-16 · 💻 cs.NI

RNG: Flat Datacenter Networks at Scale

Pith reviewed 2026-05-22 10:18 UTC · model grok-4.3

classification 💻 cs.NI
keywords datacenter networksflat topologiesquasi-random graphsdistributed routingoptical cablingcost reductionfault toleranceproduction deployment
0
0 comments X

The pith

Quasi-random graphs enable flat datacenter networks that match fat-tree performance while cutting costs by up to 45 percent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents RNG, a flat datacenter network built on quasi-random graphs. It solves earlier barriers to using such graphs by adding a distributed routing protocol that locates many edge-disjoint paths between endpoints and a passive optical device that shuffles cables internally. These changes let the network deliver performance equal to or better than fat trees across different traffic patterns. The lower cost and simpler cabling have led to its adoption as the default network for most workloads in a large production environment.

Core claim

RNG shows that a quasi-random graph topology, combined with a distributed routing protocol that exploits the graphs to locate large numbers of edge-disjoint paths and a passive optical cable-shuffling device, achieves performance that matches or exceeds fat trees for a range of traffic patterns while lowering costs by as much as 45 percent, and this design is now the default datacenter network for most workloads in production.

What carries the argument

Quasi-random graph topology paired with distributed routing that finds edge-disjoint paths and a passive optical device that simplifies cabling.

If this is right

  • RNG delivers performance equal to or better than fat trees for many traffic patterns.
  • Network costs drop by up to 45 percent while fault tolerance remains high.
  • The routing protocol scales because it finds large numbers of edge-disjoint paths using graph properties.
  • Cabling complexity stays similar to fat trees thanks to the passive optical shuffling device.
  • The design has been deployed at scale and serves as the default for most production workloads.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Other large operators could adopt similar flat quasi-random topologies to reduce their own network capital costs.
  • The same path-finding approach might extend to other network settings that need many disjoint routes at low cost.
  • Further experiments with failure patterns outside current production use could reveal how far the stability of the graph properties extends.
  • High-bandwidth applications such as distributed training clusters may gain from the efficient use of multiple paths.

Load-bearing premise

The quasi-random graph properties that let the routing protocol locate many edge-disjoint paths stay reliable under the traffic patterns and failure modes found in large production datacenters.

What would settle it

A side-by-side production run in which RNG shows measurably higher latency, lower throughput, or more routing failures than a fat-tree network under the same real traffic and failure conditions.

Figures

Figures reproduced from arXiv: 2604.15261 by Chinchu Merine Joseph, C. Seshadhri, Elizabeth Tennent, Enrico Carlesso, Giacomo Bernardi, Luiza Popa, Pavan Manikonda, Randy Ram, Ratul Mahajan, Saurabh Kumar, Steven Robinson.

Figure 1
Figure 1. Figure 1: Example fat tree (a) and expander topologies [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Key design parameters of RNG 5 SPRAYPOINT ROUTING Spraypoint constructs a large number of edge disjoint paths between endpoint pairs, which increases network through￾put by offering more independent options to load balanc￾ing mechanisms like ECMP. While we analyze and deploy it on random graphs, Spraypoint works on any expander graph. It is based on the observation that high fan-out at the source and high … view at source ↗
Figure 4
Figure 4. Figure 4: A ShuffleBox. packet to an ECMP group id. LPM memory use depends on the number of network prefixes and is similar to fat trees.5 The second type of memory is to map ECMP group id to next hop set. Its required size depends on how ECMP groups are setup. The two possibilities are: (1) a separate ECMP group for each destination node, which consumes 𝑂(𝑛ℎ) memory; and (2) pre-define all possible 𝑑 ℎ groups and u… view at source ↗
Figure 5
Figure 5. Figure 5: Cabling in a datacenter with three rooms. [PITH_FULL_IMAGE:figures/full_fig_p006_5.png] view at source ↗
Figure 7
Figure 7. Figure 7: Path length distribution. possible value of 𝑑 is proportional to exp(−ℎ). Thus, a low value of ℎ > 1 suffices for good performance. The change in exp(−ℎ) from ℎ=1 (0.37) to ℎ=2 (0.13) is large, but it tapers off rapidly as ℎ increases. (2) For neighboring sources, high values of 𝑝 reduce the number of edge disjoint paths. 7.2 Path length Path lengths in RNG depend on the sizes of various Spray￾point levels… view at source ↗
Figure 8
Figure 8. Figure 8: (a) One of the racks hosting the emulated [PITH_FULL_IMAGE:figures/full_fig_p009_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Per-flow throughput of multipath transport [PITH_FULL_IMAGE:figures/full_fig_p010_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: PPS of 64B packets (left) and IOPS of storage [PITH_FULL_IMAGE:figures/full_fig_p010_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Oversubscription for different topology parameters. The defaults are [PITH_FULL_IMAGE:figures/full_fig_p011_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Edge disjoint path count. It offers higher peak throughput between endpoint pairs, and offers higher network-wide throughput. 9.3 Throughput relative to fat trees Beyond worst-case traffic patterns that help characterize oversubscription, operators want to also validate perfor￾mance for other traffic patterns. This validation is hard be￾cause there are many possible traffic patterns. Our evaluation is bas… view at source ↗
Figure 13
Figure 13. Figure 13: Oversubscription ratio (lower is better) for different traffic matrices. Both fat tree and [PITH_FULL_IMAGE:figures/full_fig_p012_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: Reduction in number of switches in RNG relative to a 3-tier fat tree for two different port counts. We use the ratio of the number of switches used by the two topologies as the measure of relative cost. This measure automatically includes transceivers, whose count is propor￾tional to switch count. It ignores passive optical components like cables, connectors, and patch panels; the cost of these components… view at source ↗
Figure 15
Figure 15. Figure 15: Matched uplinks as the datacenter grows. [PITH_FULL_IMAGE:figures/full_fig_p015_15.png] view at source ↗
Figure 17
Figure 17. Figure 17: Comparison of fraction of uplinks that find [PITH_FULL_IMAGE:figures/full_fig_p016_17.png] view at source ↗
Figure 18
Figure 18. Figure 18: plots the optimal 𝛽 versus 𝛼. We see if we want the degree to be at least 0.8𝑑, we can get that as soon as 𝛽 ≈ 25% of the room is deployed. This performance can be obtained if the first phase is ≈30% of the first room. If the 𝛽 ∗ of the two-phase deployment does not meet the operator criterion, we can perform a similar analysis for suc￾cessively higher number of phases until we find the number of phases w… view at source ↗
Figure 19
Figure 19. Figure 19: Comparison between estimated latency based on datacenter layout and measurements on the production fabric. Units omitted for confidentiality. 4 6 8 10 Latency ( s) 0.0 0.2 0.4 0.6 0.8 1.0 CDF over paths Fat tree RNG (baseline) RNG (biased waypoints) RNG (biased waypoints, cabling) [PITH_FULL_IMAGE:figures/full_fig_p017_19.png] view at source ↗
Figure 20
Figure 20. Figure 20: Latency distribution of ToR-to-ToR paths [PITH_FULL_IMAGE:figures/full_fig_p017_20.png] view at source ↗
read the original abstract

We design and deploy in production the first flat datacenter networks. Our design, called RNG, is based on quasi-random graphs. While the cost and fault-tolerance benefits of such topologies have been long known, their practical realization has been hampered by a lack of scalable routing and cabling approaches. RNG has a new distributed routing protocol that exploits the properties of random graphs to find a large number of edge disjoint paths between pairs of endpoints. It uses a novel passive optical device that internally shuffles cables, which makes its cabling complexity similar to that of fat trees. We show that RNG matches or exceeds the performance of fat trees for a range of traffic patterns, despite being up to 45% cheaper. RNG is now the default datacenter network for most workloads at Amazon.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents RNG, a flat datacenter network based on quasi-random graphs. It introduces a distributed routing protocol that finds large numbers of edge-disjoint paths by exploiting random-graph properties and a passive optical shuffling device that reduces cabling complexity to levels comparable with fat trees. Evaluations on a range of traffic patterns are used to claim that RNG matches or exceeds fat-tree performance while being up to 45% cheaper; the paper further states that RNG has been deployed in production and is now the default network for most workloads at Amazon.

Significance. If the performance and robustness claims hold under production conditions, the work would offer a practical, lower-cost alternative to fat-tree fabrics at scale. The reported production deployment supplies concrete evidence of deployability and could influence industry practice if the quasi-random routing properties prove stable under real traffic matrices and failure distributions.

major comments (2)
  1. [Performance Evaluation] The performance equivalence claim rests on evaluations using synthetic or stylized traffic patterns (uniform, permutation, hotspot). Because the central claim is that RNG delivers fat-tree-comparable throughput and latency in Amazon's production environment, the manuscript must include results from actual production traffic traces, A/B tests on the deployed fabric, or failure-injection experiments that reflect observed spatial and temporal failure patterns.
  2. [Routing Protocol] The routing protocol's ability to locate large numbers of edge-disjoint paths is asserted to remain effective at Amazon scale, yet no data are provided on how path diversity or throughput degrades when traffic exhibits temporal correlation or incast, or when failures follow the empirical spatial distribution of link and switch outages. This stability assumption is load-bearing for the production-deployment claim.
minor comments (2)
  1. [Abstract] The abstract states 'up to 45% cheaper' without specifying the exact cost model, which components are included, or the baseline fat-tree configuration; a clear cost breakdown should appear in the main text.
  2. [Introduction] Notation for the quasi-random graph construction and the passive optical shuffler should be introduced with a small illustrative example or diagram early in the manuscript to aid readers unfamiliar with random-graph datacenter topologies.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive feedback emphasizing the need to strengthen the link between our evaluations and production conditions. We address each major comment below with specific plans for revision where feasible, while noting constraints on proprietary data.

read point-by-point responses
  1. Referee: [Performance Evaluation] The performance equivalence claim rests on evaluations using synthetic or stylized traffic patterns (uniform, permutation, hotspot). Because the central claim is that RNG delivers fat-tree-comparable throughput and latency in Amazon's production environment, the manuscript must include results from actual production traffic traces, A/B tests on the deployed fabric, or failure-injection experiments that reflect observed spatial and temporal failure patterns.

    Authors: We agree that synthetic patterns provide only partial validation for the central production claim. The manuscript already states that RNG is deployed in production and serves as the default network for most workloads, supplying real-world evidence of stability and performance. However, raw production traffic traces and detailed failure logs are confidential. We will revise the evaluation section to add a new subsection summarizing anonymized, aggregated metrics from production (e.g., observed throughput and latency distributions across representative workloads) and describe the A/B testing approach used internally. We will also incorporate high-level results from failure-injection experiments that mirror publicly documented datacenter failure modes. This constitutes a partial revision because full traces cannot be released. revision: partial

  2. Referee: [Routing Protocol] The routing protocol's ability to locate large numbers of edge-disjoint paths is asserted to remain effective at Amazon scale, yet no data are provided on how path diversity or throughput degrades when traffic exhibits temporal correlation or incast, or when failures follow the empirical spatial distribution of link and switch outages. This stability assumption is load-bearing for the production-deployment claim.

    Authors: The routing protocol exploits random-graph expansion properties to discover edge-disjoint paths, and our existing simulations demonstrate robustness across the reported patterns. We will extend the routing evaluation with additional experiments that inject temporal correlation and incast traffic, reporting the resulting path diversity and throughput degradation. For failures, we will add analysis using empirical spatial distributions drawn from the open literature on datacenter outages. Because the protocol has operated without path-finding incidents in the deployed production fabric, we will include a brief qualitative summary of observed behavior under real conditions. This revision will be made in full. revision: yes

standing simulated objections not resolved
  • We cannot release actual production traffic traces, detailed A/B test logs, or Amazon-specific empirical failure distributions because these data are subject to strict confidentiality policies.

Circularity Check

0 steps flagged

No significant circularity in RNG design and evaluation claims

full rationale

The paper describes a practical systems design for flat datacenter networks using quasi-random graphs, introducing a distributed routing protocol and passive optical cabling device. Performance equivalence to fat trees and cost savings are presented as outcomes of empirical evaluation across traffic patterns plus production deployment at Amazon, rather than any closed mathematical derivation. No load-bearing steps reduce by construction to self-defined variables, fitted parameters renamed as predictions, or self-citation chains that import uniqueness or ansatzes. The derivation chain is self-contained with external grounding in deployment results, consistent with the most common honest finding for non-theoretical papers.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract does not expose explicit free parameters, axioms, or invented entities; the design appears to rest on standard properties of random graphs and conventional optical components.

pith-pipeline@v0.9.0 · 5693 in / 1124 out tokens · 25451 ms · 2026-05-22T10:18:52.500638+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

60 extracted references · 60 canonical work pages

  1. [1]

    Jung Ho Ahn, Nathan Binkert, Al Davis, Moray McLaren, and Robert S Schreiber. 2009. HyperX: topology, routing, and packaging of effi- cient large-scale networks. InProceedings of the Conference on High Performance Computing Networking, Storage and Analysis. 1–11

  2. [2]

    Mohammad Al-Fares, Alexander Loukissas, and Amin Vahdat. 2008. A scalable, commodity data center network architecture. InProc. ACM SIGCOMM Conference on Data Communication. 63–74. https://doi .org/ 10.1145/1402958.1402967

  3. [3]

    Hitesh Ballani, Paolo Costa, Raphael Behrendt, Daniel Cletheroe, Ist- van Haller, Krzysztof Jozwik, Fotini Karinou, Sophie Lange, Kai Shi, Benn Thomsen, and Hugh Williams. 2020. Sirius: A Flat Datacenter Network with Nanosecond Optical Switching. InProc. ACM SIGCOMM Conference on Data Communication. 782–797. https://doi .org/10.1145/ 3387514.3406221

  4. [4]

    Maciej Besta and Torsten Hoefler. 2014. Slim fly: a cost effective low-diameter network topology. InProceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. 348–359. https://doi.org/10.1109/SC.2014.34

  5. [5]

    Broadcom. 2026. Tomahawk 5 / BCM78900 series. (2026). https://www.broadcom.com/products/ethernet-connectivity/ switching/strataxgs/bcm78900-series [Retrieved 6-Feb-2026]

  6. [6]

    Broadcom. 2026. Tomahawk3 / BCM56980 Series. (2026). https://www.broadcom.com/products/ethernet-connectivity/ switching/strataxgs/bcm56980-series [Retrieved 6-Feb-2026]

  7. [7]

    Cheng-Shang Chang, Duan-Shin Lee, and Yi-Shean Jou. 2002. Load balanced Birkhoff–von Neumann switches, part I: One-stage buffering. Computer Communications25, 6 (2002), 611–622

  8. [8]

    2009.Concentration of Measure for the Analysis of Randomized Algorithms

    Devdutt Dubhashi and Alessandro Panconesi. 2009.Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge

  9. [9]

    Facebook. 2014. Introducing data center fabric, the next-generation Facebook data center network. (2014). https://engineering .fb.com/ 2014/11/14/production-engineering/introducing-data-center-fabric- the-next-generation-facebook-data-center-network/ [Retrieved 6-Feb-2026]

  10. [10]

    Nathan Farrington, George Porter, Sivasankar Radhakrishnan, Hamid Hajabdolali Bazzaz, Vikram Subramanya, Yeshaiahu Fainman, George Papen, and Amin Vahdat. 2010. Helios: a hybrid electrical/op- tical switch architecture for modular data centers. InProc. ACM SIG- COMM Conference on Data Communication. 339–350

  11. [11]

    Frieze and Páll Melsted

    Alan M. Frieze and Páll Melsted. 2012. Maximum matchings in ran- dom bipartite graphs and the space utilization of Cuckoo Hash tables. Random Struct. Algorithms41, 3 (2012), 334–364

  12. [12]

    fs.com. 2019. Fiber Distribution Panel Wiki, Types and Buying Tips. (2019). https://community .fs.com/article/fiber-distribution- panel-wiki-buying-tips.html [Retrieved: 2026-02-06]

  13. [13]

    fs.com. 2026. 8 Fibers MTP Female Type 1 OM4 50/125 Multimode Fiber Loopback Module. (2026). https://www .fs.com/products/ 35796.html?now_cid=2686 [Retrieved: 6-Feb-2026]

  14. [14]

    Adithya Gangidi, Rui Miao, Shengbao Zheng, Sai Jayesh Bondu, Guilherme Goes, Hany Morsy, Rohit Puri, Mohammad Riftadi, Ashmitha Jeevaraj Shetty, Jingyi Yang, Shuqiang Zhang, Mikel Jimenez Fernandez, Shashidhar Gandham, and Hongyi Zeng. 2024. RDMA over Ethernet for Distributed Training at Meta Scale. InProc. ACM SIG- COMM Conference on Data Communication. ...

  15. [15]

    Monia Ghobadi, Ratul Mahajan, Amar Phanishayee, Nikhil Devanur, Ja- nardhan Kulkarni, Gireeja Ranade, Pierre-Alexandre Blanche, Houman Rastegarfar, Madeleine Glick, and Daniel Kilper. 2016. Projector: Ag- ile reconfigurable data center interconnect. InProc. ACM SIGCOMM Conference on Data Communication. 216–229

  16. [16]

    Hamilton, Navendu Jain, Srikanth Kan- dula, Changhoon Kim, Parantap Lahiri, David A

    Albert Greenberg, James R. Hamilton, Navendu Jain, Srikanth Kan- dula, Changhoon Kim, Parantap Lahiri, David A. Maltz, Parveen Patel, and Sudipta Sengupta. 2009. VL2: a scalable and flexible data center network. InProc. ACM SIGCOMM Conference on Data Communication. 51–62. https://doi.org/10.1145/1592568.1592576

  17. [17]

    Chuanxiong Guo, Guohan Lu, Dan Li, Haitao Wu, Xuan Zhang, Yun- feng Shi, Chen Tian, Yongguang Zhang, and Songwu Lu. 2009. BCube: a high performance, server-centric network architecture for modular data centers. InProc. ACM SIGCOMM Conference on Data Communica- tion. 63–74

  18. [18]

    Chuanxiong Guo, Haitao Wu, Kun Tan, Lei Shi, Yongguang Zhang, and Songwu Lu. 2008. Dcell: a scalable and fault-tolerant network structure for data centers. InProc. ACM SIGCOMM Conference on Data Communication. 75–86

  19. [19]

    Daniel Halperin, Srikanth Kandula, Jitendra Padhye, Paramvir Bahl, and David Wetherall. 2011. Augmenting data center networks with multi-gigabit wireless links. InProc. ACM SIGCOMM Conference on Data Communication. 38–49

  20. [20]

    Navid Hamedazimi, Zafar Qazi, Himanshu Gupta, Vyas Sekar, Samir R Das, Jon P Longtin, Himanshu Shah, and Ashish Tanwer. 2014. Firefly: A reconfigurable wireless data center fabric using free-space optics. In Proc. ACM SIGCOMM Conference on Data Communication. 319–330

  21. [21]

    Brighten Godfrey

    Vipul Harsh, Sangeetha Abdu Jyothi, and P. Brighten Godfrey. 2020. Spineless Data Centers. InProc. ACM Workshop on Hot Topics in Net- works (HotNets). 67–73. https://doi.org/10.1145/3422604.3425945

  22. [22]

    Sangeetha Abdu Jyothi, Ankit Singla, P Brighten Godfrey, and Alexan- dra Kolla. 2016. Measuring and understanding throughput of network topologies. InProceedings of the International Conference for High Per- formance Computing, Networking, Storage and Analysis (SC). 761–772

  23. [23]

    Simon Kassing, Asaf Valadarsky, Gal Shahaf, Michael Schapira, and Ankit Singla. 2017. Beyond fat-trees without antennae, mirrors, and disco-balls. InProc. ACM SIGCOMM Conference on Data Communica- tion. 281–294. https://doi.org/10.1145/3098822.3098836

  24. [24]

    Allan Kaye. 2025. Rail-Optimised Networking: How NVIDIA is Rethinking AI Network Design in the Data Centre. (2025). https://vespertec.com/news/rail-optimised-networking-how-nvidia- is-rethinking-ai-network-design-data-centre/ [Retrieved 3-Feb-2026]. 13

  25. [25]

    Murali Kodialam, TV Lakshman, and Sudipta Sengupta. 2011. Traffic- oblivious routing in the hose model.IEEE/ACM Transactions on Net- working19, 3 (2011), 774–787

  26. [27]

    Hong Liu, Ryohei Urata, Kevin Yasumura, Xiang Zhou, Roy Ban- non, Jill Berger, Pedram Dashti, Norm Jouppi, Cedric Lam, Sheng Li, Erji Mao, Daniel Nelson, George Papen, Mukarram Tariq, and Amin Vahdat. 2023. Lightwave Fabrics: At-Scale Optical Circuit Switching for Datacenter and Machine Learning Systems. InProc. ACM SIGCOMM Conference on Data Communicatio...

  27. [28]

    William M Mellette, Rajdeep Das, Yibo Guo, Rob McGuinness, Alex C Snoeren, and George Porter. 2020. Expanding across time to deliver bandwidth efficiency and low latency. InUSENIX Symposium on Net- worked Systems Design and Implementation (NSDI). 1–18

  28. [29]

    Mellette, Alex Forencich, Rukshani Athapathu, Alex C

    William M. Mellette, Alex Forencich, Rukshani Athapathu, Alex C. Snoeren, George Papen, and George Porter. 2024. Realizing Rotor- Net: Toward Practical Microsecond Scale Optical Networking. In Proc. ACM SIGCOMM Conference on Data Communication. 392–414. https://doi.org/10.1145/3651890.3672273

  29. [30]

    Mogul and John Wilkes

    Jeffrey C. Mogul and John Wilkes. 2023. Physical Deployability Matters. InProc. ACM Workshop on Hot Topics in Networks (HotNets). 9–17. https://doi.org/10.1145/3626111.3628190

  30. [31]

    2001.Randomized Algo- rithms

    Rajeev Motwani and Prabhakar Raghavan. 2001.Randomized Algo- rithms. Cambridge University Press

  31. [32]

    Jayaram Mudigonda, Praveen Yalagandula, Mohammad Al-Fares, and Jeffrey C. Mogul. 2010. SPAIN: COTS data-center Ethernet for multi- pathing over arbitrary topologies. InUSENIX Symposium on Networked Systems Design and Implementation (NSDI). 18

  32. [33]

    Jayaram Mudigonda, Praveen Yalagandula, and Jeffrey C. Mogul. 2011. Taming the Flying Cable Monster: A Topology Design and Optimiza- tion Framework for Data-Center Networks. InUSENIX Annual Techni- cal Conference. https://api.semanticscholar.org/CorpusID:2989246

  33. [34]

    Pooria Namyar, Sucha Supittayapornpong, Mingyang Zhang, Min- lan Yu, and Ramesh Govindan. 2021. A throughput-centric view of the performance of datacenter topologies. InProc. ACM SIGCOMM Conference on Data Communication. 349–369. https://doi .org/10.1145/ 3452296.3472913

  34. [35]

    Sabine R Ohring, Maximilian Ibel, Sajal K Das, and Mohan J Kumar

  35. [36]

    On generalized fat trees. InProc. International Parallel Processing Symposium. 37–44

  36. [37]

    Doron Puder. 2014. Expansion of random graphs: new proofs, new results.Inventiones mathematicae201, 3 (Dec. 2014), 845–908. https: //doi.org/10.1007/s00222-014-0560-x

  37. [38]

    Kun Qian, Yongqing Xi, Jiamin Cao, Jiaqi Gao, Yichi Xu, Yu Guan, Binzhang Fu, Xuemei Shi, Fangbo Zhu, Rui Miao, Chao Wang, Peng Wang, Pengcheng Zhang, Xianlong Zeng, Eddie Ruan, Zhiping Yao, Ennan Zhai, and Dennis Cai. 2024. Alibaba HPN: A Data Center Network for Large Language Model Training. InProc. ACM SIGCOMM Conference on Data Communication. 691–706....

  38. [39]

    Mogul, Amin Vahdat, Minlan Yu, Ethan Katz-Bassett, and Michael Rubin

    Brandon Schlinker, Radhika Niranjan Mysore, Sean Smith, Jef- frey C. Mogul, Amin Vahdat, Minlan Yu, Ethan Katz-Bassett, and Michael Rubin. 2015. Condor: Better Topologies Through Declarative Design.. InProc. ACM SIGCOMM Conference on Data Communication. 449–463. http://dblp .uni-trier.de/db/conf/sigcomm/ sigcomm2015.html#SchlinkerMSMVYK15

  39. [40]

    Leah Shalev, Hani Ayoub, Nafea Bshara, and Erez Sabbag. 2020. A Cloud-Optimized Transport Protocol for Elastic and Scalable HPC.IEEE Micro40, 6 (2020), 67–73. https://doi .org/10.1109/ MM.2020.3016891

  40. [41]

    Arjun Singh, Joon Ong, Amit Agarwal, Glen Anderson, Ashby Armis- tead, Roy Bannon, Seb Boving, Gaurav Desai, Bob Felderman, Paulie Germano, Anand Kanagala, Jeff Provost, Jason Simmons, Eiichi Tanda, Jim Wanderer, Urs Hölzle, Stephen Stuart, and Amin Vahdat. 2015. Jupiter Rising: A Decade of Clos Topologies and Centralized Con- trol in Google’s Datacenter ...

  41. [42]

    Arjun Singhvi, Nandita Dukkipati, Prashant Chandra, Hassan M. G. Wassel, Naveen Kr. Sharma, Anthony Rebello, Henry Schuh, Praveen Kumar, Behnam Montazeri, Neelesh Bansod, Sarin Thomas, Inho Cho, Hyojeong Lee Seibert, Baijun Wu, Rui Yang, Yuliang Li, Kai Huang, Qianwen Yin, Abhishek Agarwal, Srinivas Vaduvatha, Wei- huang Wang, Masoud Moshref, Tao Ji, Davi...

  42. [43]

    Ankit Singla, Chi-Yao Hong, Lucian Popa, and Philip Brighten Godfrey

  43. [44]

    InProceedings of the 9th USENIX Symposium on Networked Systems Design and Im- plementation (NSDI)

    Jellyfish: Networking Data Centers Randomly. InProceedings of the 9th USENIX Symposium on Networked Systems Design and Im- plementation (NSDI). 225–238. https://www .usenix.org/conference/ nsdi12/technical-sessions/presentation/singla

  44. [45]

    snaketray.com. 2026. The Essential Guide to Data Center Cabling. (2026). https://www .snaketray.com/data-center-cabling-guide/ [Re- trieved 6-Feb-2026]

  45. [46]

    Solano, R

    F. Solano, R. Fabregat, Y. Donoso, and J.L. Marzo. 2005. Asymmetric tunnels in P2MP LSPs as a label space reduction method. InIEEE International Conference on Communications (ICC), Vol. 1. 43–47 Vol. 1. https://doi.org/10.1109/ICC.2005.1494318

  46. [47]

    Solano, R

    F. Solano, R. Fabregat, and J.L. Marzo. 2005. Full label space reduc- tion in MPLS networks: asymmetric merged tunneling.IEEE Com- munications Letters9, 11 (2005), 1021–1023. https://doi .org/10.1109/ LCOMM.2005.11016

  47. [48]

    Asaf Valadarsky, Gal Shahaf, Michael Dinitz, and Michael Schapira

  48. [49]

    Xpander: Towards Optimal-Performance Datacenters. InProc. International on Conference on emerging Networking EXperiments and Technologies (CoNEXT). Association for Computing Machinery, New York, NY, USA, 205–219. https://doi.org/10.1145/2999572.2999580

  49. [50]

    Leslie G Valiant and Gordon J Brebner. 1981. Universal schemes for par- allel communication. InProc. ACM Symposium on Theory of Computing (STOC). 263–277

  50. [51]

    Andersen, Michael Kaminsky, Konstantina Pa- pagiannaki, T.S

    Guohui Wang, David G. Andersen, Michael Kaminsky, Konstantina Pa- pagiannaki, T.S. Eugene Ng, Michael Kozuch, and Michael Ryan. 2010. c-Through: part-time optics in data centers. InProc. ACM SIGCOMM Conference on Data Communication. 327–338. https://doi .org/10.1145/ 1851182.1851222

  51. [52]

    Wikipedia. 2026. Balls into bins problem. (2026). https:// en.wikipedia.org/wiki/Balls_into_bins_problem [Retrieved 6-Feb- 2026]

  52. [53]

    Wikipedia. 2026. Expander graph. (2026). https://en .wikipedia.org/ wiki/Expander_graph [Retrieved 6-Feb-2026]

  53. [54]

    Wikipedia. 2026. k-shortest path routing. (2026). https:// en.wikipedia.org/wiki/K_shortest_path_routing [Retrieved: 6-Feb- 2026]

  54. [55]

    Wikipedia. 2026. Menger’s Theorem. (2026). https://en.wikipedia.org/ wiki/Menger%27s_theorem [Retrieved: 6-Feb-2026]

  55. [56]

    Wikipedia. 2026. Multi-commodity flow problem. (2026). https: //en.wikipedia.org/wiki/Multi-commodity_flow_problem [Retrieved: 14 6-Feb-2026]

  56. [57]

    Damon Wischik, Costin Raiciu, Adam Greenhalgh, and Mark Handley. 2011. Design, Implementation and Evaluation of Congestion Control for Multipath TCP. InUSENIX Sympo- sium on Networked Systems Design and Implementation (NSDI). https://www.usenix.org/conference/nsdi11/design-implementation- and-evaluation-congestion-control-multipath-tcp

  57. [58]

    1999.Models of Random Regular Graphs

    Nicholas Wormald. 1999.Models of Random Regular Graphs. Cam- bridge. https://web.williams.edu/Mathematics/sjmiller/public_html/ ntprob19/handouts/graphs/Womald_ModelsRandomGraphs.pdf

  58. [59]

    Eitan Zahavi. 2012. Fat-tree routing and node ordering providing contention free traffic for MPI global collectives.J. Parallel and Distrib. Comput.72, 11 (2012), 1423–1432. https://doi .org/10.1016/ j.jpdc.2012.01.018 Communication Architectures for Scalable Sys- tems

  59. [60]

    W/o phases

    Xia Zhou, Zengbin Zhang, Yibo Zhu, Yubo Li, Saipriya Kumar, Amin Vahdat, Ben Y Zhao, and Haitao Zheng. 2012. Mirror mirror on the ceil- ing: Flexible wireless links for data centers.ACM SIGCOMM Computer Communication Review42, 4 (2012), 443–454. A INCREMENTAL CABLING The physical connectivity ofRNGdescribed in §6 can be achieved incrementally. While the f...

  60. [61]

    Thus, Pr[𝑌= 2]= [1− (1−𝜙 2 3)3]

    The probability that no 𝑋𝑖 takes value2is (1 −𝜙 2 3)3. Thus, Pr[𝑌= 2]= [1− (1−𝜙 2 3)3]. We deduce Pr[𝑌= 1] as1 −Pr[𝑌= 0] −Pr[𝑌= 2], and applying the above formulas.□ The congestion on a path is distributed as 𝑌+ 1, where 𝑌 is distributed as in Claim G.4. As discussed in the earlier section, the average flow that can be sent isE [1/(𝑌+ 1)]. Using Claim G.4...