pith. machine review for the scientific record.

arxiv: 2605.12862 · v1 · submitted 2026-05-13 · 💻 cs.NI · cs.LG

Recognition: unknown

NeuroRisk: Physics-Informed Neural Optimization for Risk-Aware Traffic Engineering

Authors on Pith: no claims yet

Pith reviewed 2026-05-14 18:54 UTC · model grok-4.3

classification 💻 cs.NI cs.LG
keywords risk-aware traffic engineering · wide-area networks · physics-informed neural networks · deep unrolled optimization · failure scenarios · Sort-and-Select structure · network capacity constraints · neural feasibility enforcement

The pith

NeuroRisk embeds the Sort-and-Select structure of risk-aware traffic engineering into a neural unrolled optimizer, delivering solver-level accuracy at 100- to 100,000-fold speedups.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that risk-aware traffic engineering over correlated failure scenarios in wide-area networks can be unified under a single Sort-and-Select optimization structure. This structure exposes a core tradeoff: classical solvers either simplify scenario selection for speed or pay high decomposition costs, while prior neural methods break when explicit capacity constraints and scenario-dependent risk are required. NeuroRisk addresses the tradeoff by unrolling an optimizer that enforces feasibility through gated edge-local reservations and permutation-invariant cues for scenario sets. If the embedding holds, operators gain the ability to run high-utilization risk-aware routing at operational timescales instead of relying on slow offline solvers or conservative safety margins.
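The Sort-and-Select structure can be made concrete with a small sketch. Assuming a CVaR-style risk objective (the TeaVaR lineage the paper cites), evaluating it amounts to sorting scenario losses and selecting the worst tail of probability mass; the paper's exact formulation is not given in the abstract, so all names and numbers here are illustrative:

```python
import numpy as np

def sort_and_select_risk(scenario_losses, probs, beta=0.9):
    """Illustrative Sort-and-Select evaluation of a CVaR-style risk
    objective: sort scenarios by loss, then select the worst tail
    holding 1 - beta probability mass. A minimal sketch of the shared
    structure, not the paper's exact formulation."""
    order = np.argsort(scenario_losses)[::-1]   # sort: worst scenarios first
    losses, p = scenario_losses[order], probs[order]
    tail = 1.0 - beta                           # probability mass to select
    cvar = mass = 0.0
    for loss, pi in zip(losses, p):             # select until the tail is covered
        take = min(pi, tail - mass)
        cvar += take * loss
        mass += take
        if mass >= tail - 1e-12:
            break
    return cvar / tail

losses = np.array([0.0, 0.1, 0.5, 0.9])   # hypothetical per-scenario losses
probs = np.array([0.7, 0.2, 0.06, 0.04])  # scenario probabilities
val = sort_and_select_risk(losses, probs, beta=0.9)
print(val)
```

The sort step is what classical solvers either restrict (for speed) or decompose over (at high cost); NeuroRisk's claim is that the whole pattern can live inside the unrolled network.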

Core claim

NeuroRisk is a physics-informed deep unrolled optimizer that exploits the Sort-and-Select structure of risk-aware TE. It enforces feasibility via gated edge-local reservations and represents scenario sets through permutation-invariant, gradient-aligned cues. On production-style WANs it achieves small optimality gaps relative to the solver with orders of magnitude speedup (10^2-10^5 ×) on risk objectives while outperforming neural baselines on nominal throughput.

What carries the argument

The Sort-and-Select structure that unifies risk-aware TE formulations, realized inside a neural unrolled optimizer via gated edge-local reservations and permutation-invariant scenario cues.
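The gated edge-local reservation idea admits a minimal sketch. Assuming (our reading; the abstract does not spell out the mechanism) that each edge independently rescales the reservations crossing it so total load never exceeds capacity, feasibility holds by construction regardless of what the upstream network emits:

```python
import numpy as np

def gated_edge_reservation(raw_alloc, capacity):
    """Sketch of an edge-local gate (our reading of 'gated edge-local
    reservations'; the abstract does not specify the mechanism). Each
    edge rescales the reservations crossing it so that total load
    never exceeds capacity: feasible by construction."""
    load = raw_alloc.sum(axis=0)                                # per-edge requested load
    gate = np.minimum(1.0, capacity / np.maximum(load, 1e-12))  # gate in (0, 1], one per edge
    return raw_alloc * gate[None, :]

raw = np.array([[3.0, 1.0],
                [3.0, 1.0]])      # two flows over two edges
cap = np.array([4.0, 10.0])
feasible = gated_edge_reservation(raw, cap)
print(feasible.sum(axis=0))       # per-edge load now within capacity
```

Because the gate is differentiable almost everywhere, such a step can sit inside an unrolled optimizer and be trained end to end, unlike a hard projection.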

If this is right

  • Risk-aware TE becomes solvable at operational timescales instead of offline batch mode.
  • Network operators can reduce safety margins while still meeting availability targets.
  • Nominal throughput improves over prior neural TE methods that ignore explicit risk constraints.
  • The same gated-reservation and permutation-invariant design pattern applies to any TE variant whose risk model reduces to Sort-and-Select.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same neural unrolling technique could be applied to other selection-structured problems such as virtual network embedding or robust resource allocation under uncertainty.
  • Real-time risk-aware routing opens the door to closed-loop systems that continuously adjust reservations as traffic matrices or failure probabilities are observed.
  • If the permutation-invariant cues remain effective when scenario counts grow to thousands, the method scales to larger backbone networks without exponential solver blowup.

Load-bearing premise

The Sort-and-Select structure can be faithfully embedded into a neural unrolled optimizer using gated edge-local reservations and permutation-invariant cues so that feasibility is enforced under explicit capacity constraints and scenario-dependent risk.
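Permutation invariance over scenario sets is the easier half of this premise to make concrete. A minimal Deep Sets-style sketch (embed each scenario independently, then sum-pool) shows why the encoding cannot depend on scenario ordering; the paper's gradient-aligned cues are presumably a richer construction:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))   # hypothetical per-scenario embedding weights

def encode_scenarios(scenarios):
    """Minimal permutation-invariant set encoding (Deep Sets style):
    embed each failure scenario independently, then sum-pool. This
    only demonstrates order-independence, not the paper's cues."""
    return np.tanh(scenarios @ W).sum(axis=0)   # sum pooling discards ordering

scenarios = rng.normal(size=(5, 4))             # 5 scenarios, 4 features each
shuffled = scenarios[rng.permutation(5)]
same = np.allclose(encode_scenarios(scenarios), encode_scenarios(shuffled))
print(same)  # True
```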

What would settle it

Run NeuroRisk and an exact solver on the same production-style WAN instance with explicit capacity limits and a fixed set of failure scenarios; if NeuroRisk returns any solution that violates a capacity constraint or selects a suboptimal scenario subset, the embedding claim is falsified.
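A minimal sketch of this settling experiment, with hypothetical stand-in data: check that no per-edge load exceeds capacity, and recover the exact worst-k scenario subset a sort-based solver would select for comparison against NeuroRisk's implicit selection:

```python
import numpy as np

def falsification_check(alloc, edge_cap, scenario_losses, k):
    """Two falsifiers from the text: (1) any capacity violation,
    (2) mismatch with the exact worst-k scenario subset. Inputs are
    hypothetical stand-ins for quantities a real run would produce."""
    feasible = bool(np.all(alloc.sum(axis=0) <= edge_cap + 1e-9))  # no edge over capacity
    worst_k = set(np.argsort(scenario_losses)[::-1][:k])           # exact sort-based selection
    return feasible, worst_k

alloc = np.array([[1.0, 2.0],
                  [2.0, 1.0]])        # flows x edges
cap = np.array([3.0, 4.0])
ok, subset = falsification_check(alloc, cap, np.array([0.2, 0.9, 0.5]), k=2)
print(ok, sorted(subset))  # True [1, 2]
```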

Figures

Figures reproduced from arXiv: 2605.12862 by Jiashuai Liu, Jingyi Cheng, Qiaozhu Zhai, Shizhen Zhao, Siyuan Feng, Ximeng Liu, Xiyuan Liu, Yike Liu, Yingming Mao, Yuzhou Zhou, Zhen Yao.

Figure 1
Figure 1: A Unified Analysis Perspective for Risk-Aware TE Strategies. (Left) A topology with two demands… [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2: GS optimization fragility. Consider bottleneck… [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗
Figure 3
Figure 3: NeuroRisk framework architecture. The system separates into two domains: the Unconstrained Latent… [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗
Figure 4
Figure 4: Mechanism of Projection Discontinuity. 𝐵′ = 𝜋(𝐵), leading to worse outcomes and a bounce-back that prevents progress toward 𝑥∗. In our experiments, LS-based training exhibits a strong short-path preference: when both short and long paths are feasible, LS tends to concentrate traffic on shorter (often direct) routes, since longer paths traverse more edges and are more likely to activate tight constrai… view at source ↗
Figure 5
Figure 5: Overall performance summary. (a) Relative error of NeuroRisk versus the Gurobi optimum for each… [PITH_FULL_IMAGE:figures/full_fig_p009_5.png] view at source ↗
Figure 6
Figure 6: Solve time vs. number of scenarios (𝑁) on IBM. Solver-based methods (dashed lines) degrade significantly as 𝑁 grows, with PreTE spiking to 10³ s due to decomposition overhead. In contrast, NeuroRisk (solid lines) maintains a flat, millisecond-level inference time. Non-smooth runtime “jumps” (rising to 10²–10³ s) suggest that the solver frequently enters harder combinatorial regimes (e.g., heavy bran… view at source ↗
Figure 8
Figure 8: Results on B4. GS and LS exhibit signifi… [PITH_FULL_IMAGE:figures/full_fig_p012_8.png] view at source ↗
Figure 10
Figure 10: Scenario-distribution generalization on IBM. Same as above. [PITH_FULL_IMAGE:figures/full_fig_p017_10.png] view at source ↗
Figure 9
Figure 9: Scenario-distribution generalization [PITH_FULL_IMAGE:figures/full_fig_p017_9.png] view at source ↗
Figure 11
Figure 11: Scenario-distribution generalization on GEANT. Same as above. [PITH_FULL_IMAGE:figures/full_fig_p017_11.png] view at source ↗
Figure 12
Figure 12: Demand-perturbation robustness on B4 (TeaVaR). Relative error under demand-proportional Gaussian noise with 𝜎 ∈ {5%, 15%, 30%}; demands are clipped to be nonnegative. [PITH_FULL_IMAGE:fig… view at source ↗
Figure 15
Figure 15: Tunnel-set generalization on B4. Relative error under candidate tunnel-set shifts (cross-KSP), evaluated with fixed model weights. [PITH_FULL_IMAGE:figures/full_fig_p018_15.png] view at source ↗
Figure 16
Figure 16: Training convergence on GERMANY50. Validation relative error (blue, left axis) and training objective (red, right axis) versus wall-clock time. Scenario chunking (chunk size 20) and gradient checkpointing are both enabled. Training converges within ∼4 hours on a single GPU. [PITH_FULL_IMAGE:figures/full_fig_p020_16.png] view at source ↗
read the original abstract

In production Wide-Area Networks (WANs), correlated failures dominate availability losses, forcing operators to reserve large safety margins that leave substantial capacity underutilized. Achieving high utilization under strict availability targets therefore requires risk-aware Traffic Engineering (TE) over dozens to hundreds of probabilistic failure scenarios, yet solving this problem at operational timescales remains elusive. We demonstrate that existing risk-aware formulations can be unified under an embedded Sort-and-Select structure, exposing a fundamental trade-off between expressiveness and tractability: classical optimizers either restrict scenario selection for efficiency or incur prohibitive decomposition costs. While deep learning appears promising, prior Deep TE methods mainly target maximum link utilization and rely on scaling-based feasibility, which fundamentally breaks under explicit capacity constraints and scenario-dependent risk. We present NeuroRisk, a physics-informed deep unrolled optimizer that exploits the structure of Sort-and-Select. NeuroRisk enforces feasibility via gated edge-local reservations and represents scenario sets through permutation-invariant, gradient-aligned cues. Evaluations on production-style WANs show that NeuroRisk achieves small optimality gaps relative to the solver with orders of magnitude speedup $(10^2-10^5\times)$ on risk objectives, while outperforming neural baselines on nominal throughput.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that risk-aware traffic engineering formulations share an embedded Sort-and-Select structure that trades expressiveness for tractability. It introduces NeuroRisk, a physics-informed unrolled neural optimizer that embeds this structure using gated edge-local reservations to enforce feasibility under explicit capacity constraints and permutation-invariant gradient-aligned cues to represent scenario sets. Evaluations on production-style WANs are reported to yield small optimality gaps versus exact solvers together with speedups of 10^2–10^5× on risk objectives while outperforming prior neural baselines on nominal throughput.

Significance. If the feasibility guarantees and empirical speedups hold, the work would enable operational-scale risk-aware TE that improves utilization under correlated failures without sacrificing availability targets. The structural unification of prior formulations is a useful contribution that could guide future hybrid optimization approaches.

major comments (2)
  1. [NeuroRisk architecture and feasibility enforcement] The central feasibility claim rests on gated edge-local reservations plus permutation-invariant cues recovering the exact feasible set of the original combinatorial problem under scenario-dependent risk. No derivation, invariant, or post-hoc verification (e.g., maximum capacity violation rates across risk scenarios) is supplied to show that the learned gating provably prevents violations when the unrolling is only approximately aligned with the risk distribution; this directly undermines the risk-aware optimality-gap claims.
  2. [Experimental evaluation] The evaluation section reports small optimality gaps and large speedups on production-style WANs, yet supplies no concrete details on network sizes, number of probabilistic scenarios, exact solver baselines, feasibility verification procedure, or statistics on capacity violations (e.g., max or 99th-percentile violation rates). Without these, the central empirical claim that NeuroRisk “enforces feasibility” while delivering the stated speedups cannot be assessed.
minor comments (1)
  1. [Abstract] The abstract states speedups of “10^2-10^5 ×” without indicating whether these are wall-clock times, iteration counts, or per-scenario costs; a brief clarification would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We agree that the manuscript would benefit from additional details on feasibility enforcement and experimental specifics, and we will revise accordingly to strengthen the presentation of these claims.

read point-by-point responses
  1. Referee: [NeuroRisk architecture and feasibility enforcement] The central feasibility claim rests on gated edge-local reservations plus permutation-invariant cues recovering the exact feasible set of the original combinatorial problem under scenario-dependent risk. No derivation, invariant, or post-hoc verification (e.g., maximum capacity violation rates across risk scenarios) is supplied to show that the learned gating provably prevents violations when the unrolling is only approximately aligned with the risk distribution; this directly undermines the risk-aware optimality-gap claims.

    Authors: We acknowledge that the current manuscript does not include a formal derivation showing that the gated edge-local reservations exactly recover the feasible set under approximate alignment with the risk distribution. The design relies on local capacity gating to enforce constraints by construction, combined with permutation-invariant cues for scenario representation. In the revision we will add both a short proof sketch of the feasibility invariant under the Sort-and-Select structure and post-hoc empirical verification (maximum and 99th-percentile capacity violation rates across all risk scenarios) to substantiate the zero-violation claim in practice. revision: yes
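The promised post-hoc verification can be sketched directly. Assuming per-scenario edge utilization ratios are available, the two statistics reduce to a few lines (the function name, interface, and data are hypothetical, not the authors' code):

```python
import numpy as np

def violation_stats(util, capacity):
    """Post-hoc feasibility audit of the kind the rebuttal promises:
    maximum and 99th-percentile relative capacity violation over all
    (scenario, edge) pairs. Interface is our own sketch."""
    excess = np.maximum(util - capacity[None, :], 0.0) / capacity[None, :]
    return float(excess.max()), float(np.percentile(excess, 99))

# hypothetical per-scenario edge loads against unit capacities
util = np.array([[0.9, 1.2],
                 [1.0, 0.8]])
cap = np.ones(2)
mx, p99 = violation_stats(util, cap)
print(mx, p99)
```

A zero-violation claim would correspond to both statistics being exactly 0.0 across every reported run.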

  2. Referee: [Experimental evaluation] The evaluation section reports small optimality gaps and large speedups on production-style WANs, yet supplies no concrete details on network sizes, number of probabilistic scenarios, exact solver baselines, feasibility verification procedure, or statistics on capacity violations (e.g., max or 99th-percentile violation rates). Without these, the central empirical claim that NeuroRisk “enforces feasibility” while delivering the stated speedups cannot be assessed.

    Authors: We agree that the experimental section is missing key reproducibility details. The revised manuscript will explicitly report: the network topologies (node/edge counts for each production-style WAN), the exact number of probabilistic failure scenarios per instance, the solver baselines (e.g., Gurobi with the full risk-aware MIP formulation), the feasibility verification procedure (including how capacity violations are measured), and violation statistics (maximum and 99th-percentile rates) confirming that NeuroRisk produces feasible solutions in all reported runs. revision: yes

Circularity Check

0 steps flagged

No circularity detected in derivation chain

full rationale

The paper presents NeuroRisk as a novel physics-informed unrolled optimizer that identifies and exploits an embedded Sort-and-Select structure in prior risk-aware TE formulations. Feasibility is enforced through a new architectural mechanism (gated edge-local reservations plus permutation-invariant cues) rather than any reduction to fitted parameters, self-citations, or tautological re-derivations. No load-bearing steps reduce by construction to the inputs; the central claims rest on empirical speedups and optimality gaps relative to external solvers and baselines. The derivation is self-contained.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 2 invented entities

Review performed from abstract only; full paper may contain additional parameters or assumptions not visible here.

free parameters (1)
  • neural network weights and hyperparameters
    Trained parameters that define the optimizer; no specific values or training procedure given in abstract.
axioms (1)
  • domain assumption Existing risk-aware TE formulations can be unified under an embedded Sort-and-Select structure
    Stated as the key insight that exposes the expressiveness-tractability trade-off.
invented entities (2)
  • gated edge-local reservations no independent evidence
    purpose: Enforce feasibility under capacity constraints
    New mechanism introduced in NeuroRisk to satisfy explicit constraints during optimization.
  • permutation-invariant gradient-aligned cues no independent evidence
    purpose: Represent scenario sets for risk objectives
    Representation chosen to handle probabilistic failure scenarios in a neural setting.

pith-pipeline@v0.9.0 · 5549 in / 1343 out tokens · 67412 ms · 2026-05-14T18:54:42.991424+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

54 extracted references · 54 canonical work pages · 2 internal anchors

  1. [1]

    Firas Abuzaid, Srikanth Kandula, Behnaz Arzani, Ishai Menache, Matei Zaharia, and Peter Bailis. 2021. Contracting Wide-area Network Topologies to Solve Flow Problems Quickly. In 18th USENIX Symposium on Networked Systems Design and Implementation (NSDI 21). USENIX Association, 175–200. https://www.usenix.org/conference/nsdi21/presentation/abuzaid

  2. [2]

    Ian F. Akyildiz, Ahyoung Lee, Pu Wang, Min Luo, and Wu Chou

  3. [3]

    A roadmap for traffic engineering in SDN-OpenFlow networks. Comput. Netw. 71 (Oct. 2014), 1–30. https://doi.org/10.1016/j.comnet.2014.06.002

  4. [4]

    Mohammad Alizadeh, Tom Edsall, Sarang Dharmapurikar, Ramanan Vaidyanathan, Kevin Chu, Andy Fingerhut, Vinh The Lam, Francis Matus, Rong Pan, Navindra Yadav, and George Varghese. 2014. CONGA: distributed congestion-aware load balancing for datacenters. In Proceedings of the 2014 ACM Conference on SIGCOMM (SIGCOMM ’14). Association for Computing Machiner...

  5. [5]

    Abd AlRhman AlQiam, Yuanjun Yao, Zhaodong Wang, Satyajeet Singh Ahuja, Ying Zhang, Sanjay G. Rao, Bruno Ribeiro, and Mohit Tawarmalani. 2024. Transferable Neural WAN TE for Changing Topologies. In Proceedings of the ACM SIGCOMM 2024 Conference (ACM SIGCOMM ’24). Association for Computing Machinery, New York, NY, USA, 86–102. https://doi.org/10.1145/3...

  6. [6]

    David Applegate and Edith Cohen. 2003. Making intra-domain routing robust to changing and uncertain traffic demands: understanding fundamental tradeoffs. In Proceedings of the 2003 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM ’03). Association for Computing Machinery, New York, NY, USA, 3...

  7. [7]

    Shivam Arora, Alex Bihlo, and Francis Valiquette. 2024. Invariant physics-informed neural networks for ordinary differential equations. J. Mach. Learn. Res. 25, 1, Article 233 (Jan. 2024), 24 pages

  8. [8]

    Yossi Azar, Edith Cohen, Amos Fiat, Haim Kaplan, and Harald Racke

  9. [9]

    Optimal oblivious routing in polynomial time. In Proceedings of the Thirty-Fifth Annual ACM Symposium on Theory of Computing (STOC ’03). Association for Computing Machinery, New York, NY, USA, 383–388. https://doi.org/10.1145/780542.780599

  10. [10]

    Irwan Bello, Hieu Pham, Quoc V. Le, Mohammad Norouzi, and Samy Bengio. 2017. Neural Combinatorial Optimization with Reinforcement Learning. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Workshop Track Proceedings. OpenReview.net. https://openreview.net/forum?id=Bk9mxlSFx

  11. [11]

    Yoshua Bengio, Andrea Lodi, and Antoine Prouvost. 2021. Machine learning for combinatorial optimization: A methodological tour d’horizon. European Journal of Operational Research 290, 2 (2021), 405–421. https://doi.org/10.1016/j.ejor.2020.07.063

  12. [12]

    Theophilus Benson, Ashok Anand, Aditya Akella, and Ming Zhang

  13. [13]

    MicroTE: fine grained traffic engineering for data centers. In Proceedings of the Seventh COnference on Emerging Networking EXperiments and Technologies (CoNEXT ’11). Association for Computing Machinery, New York, NY, USA, Article 8, 12 pages. https://doi.org/10.1145/2079296.2079304

  14. [14]

    Umair bin Waheed, Ehsan Haghighat, Tariq Alkhalifah, Chao Song, and Qi Hao. 2021. PINNeik: Eikonal solution using physics-informed neural networks. Computers & Geosciences 155 (2021), 104833. https://doi.org/10.1016/j.cageo.2021.104833

  15. [15]

    Jeremy Bogle, Nikhil Bhatia, Manya Ghobadi, Ishai Menache, Nikolaj Bjørner, Asaf Valadarsky, and Michael Schapira. 2019. TEAVAR: striking the right utilization-availability balance in WAN traffic engineering. In Proceedings of the ACM Special Interest Group on Data Communication (SIGCOMM ’19). Association for Com...

  16. [16]

    Luonan Chen and Kazuyuki Aihara. 1995. Chaotic simulated annealing by a neural network model with transient chaos. Neural Networks 8, 6 (1995), 915–930. https://doi.org/10.1016/0893-6080(95)00033-V

  17. [17]

    Hanjun Dai, Elias B. Khalil, Yuyu Zhang, Bistra Dilkina, and Le Song

  18. [18]

    Learning Combinatorial Optimization Algorithms over Graphs. (2018). arXiv:cs.LG/1704.01665 https://arxiv.org/abs/1704.01665

  19. [19]

    B. Fortz and M. Thorup. 2000. Internet traffic engineering by optimizing OSPF weights. In Proceedings IEEE INFOCOM 2000. Conference on Computer Communications. Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (Cat. No.00CH37064), Vol. 2. 519–528 vol.2. https://doi.org/10.1109/INFCOM.2000.832225

  20. [20]

    Ian Goodfellow, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. MIT Press. http://www.deeplearningbook.org

  21. [21]

    Ramesh Govindan, Ina Minei, Mahesh Kallahalla, Bikash Koley, and Amin Vahdat. 2016. Evolve or Die: High-Availability Design Principles Drawn from Googles Network Infrastructure. In Proceedings of the 2016 ACM SIGCOMM Conference (SIGCOMM ’16). Association for Computing Machinery, New York, NY, USA, 58–72. https://doi.org/10.1145/2934872.2934891

  22. [22]

    Karol Gregor and Yann LeCun. 2010. Learning fast approximations of sparse coding. In Proceedings of the 27th International Conference on International Conference on Machine Learning (ICML’10). Omnipress, Madison, WI, USA, 399–406

  23. [23]

    Fei Gui, Songtao Wang, Dan Li, Li Chen, Kaihui Gao, Congcong Min, and Yi Wang. 2024. RedTE: Mitigating Subsecond Traffic Bursts with Real-time and Distributed Traffic Engineering. In Proceedings of the ACM SIGCOMM 2024 Conference (ACM SIGCOMM ’24). Association for Computing Machinery, New York, NY, USA, 71–85. https://doi.org/10.1145/3651890.3672231

  24. [24]

    Gurobi Optimization, LLC. 2023. Gurobi optimizer reference manual. (2023). https://www.gurobi.com

  25. [25]

    Chi-Yao Hong, Srikanth Kandula, Ratul Mahajan, Ming Zhang, Vijay Gill, Mohan Nanduri, and Roger Wattenhofer. 2013. Achieving high utilization with software-driven WAN. SIGCOMM Comput. Commun. Rev. 43, 4 (Aug. 2013), 15–26. https://doi.org/10.1145/2534169.2486012

  26. [26]

    Chi-Yao Hong, Subhasree Mandal, Mohammad Al-Fares, Min Zhu, Richard Alimi, Kondapa Naidu B., Chandan Bhagat, Sourabh Jain, Jay Kaimal, Shiyu Liang, Kirill Mendelev, Steve Padgett, Faro Rabe, Saikat Ray, Malveeka Tewari, Matt Tierney, Monika Zahn, Jonathan Zolla, Joon Ong, and Amin Vahdat. 2018. B4 and after: managing hierarchy, partitioning, and asymmetry...

  27. [27]

    Sushant Jain, Alok Kumar, Subhasree Mandal, Joon Ong, Leon Poutievski, Arjun Singh, Subbaiah Venkata, Jim Wanderer, Junlan Zhou, Min Zhu, Jon Zolla, Urs Hölzle, Stephen Stuart, and Amin Vahdat. 2013. B4: experience with a globally-deployed software defined wan. In Proceedings of the ACM SIGCOMM 2013 Conference on SIGCOMM (SIGCOMM ’13). Association for ...

  28. [28]

    Chuan Jiang, Zixuan Li, Sanjay Rao, and Mohit Tawarmalani. 2022. Flexile: meeting bandwidth objectives almost always. In Proceedings of the 18th International Conference on Emerging Networking EXperiments and Technologies (CoNEXT ’22). Association for Computing Machinery, New York, NY, USA, 110–125. https://doi.org/10.1145/3555050.3569119

  29. [29]

    Chuan Jiang, Sanjay Rao, and Mohit Tawarmalani. 2020. PCF: Provably Resilient Flexible Routing. In Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM ’20). Association for Computing Machinery, New York, NY, USA, ...

  30. [30]

    Chuan Jiang, Sanjay G. Rao, and Mohit Tawarmalani. 2021. FloMore: Meeting bandwidth requirements of flows. CoRR abs/2108.03221 (2021). arXiv:2108.03221 https://arxiv.org/abs/2108.03221

  31. [31]

    Wenjie Jiang, Rui Zhang-Shen, Jennifer Rexford, and Mung Chiang

  32. [32]

    Cooperative content distribution and traffic engineering in an ISP network. In Proceedings of the Eleventh International Joint Conference on Measurement and Modeling of Computer Systems (SIGMETRICS ’09). Association for Computing Machinery, New York, NY, USA, 239–250. https://doi.org/10.1145/1555349.1555377

  33. [33]

    Srikanth Kandula, Dina Katabi, Bruce Davie, and Anna Charny. 2005. Walking the tightrope: responsive yet stable traffic engineering. SIGCOMM Comput. Commun. Rev. 35, 4 (Aug. 2005), 253–264. https://doi.org/10.1145/1090191.1080122

  34. [34]

    Bernhard Korte and Jens Vygen. 2018. Combinatorial Optimization: Theory and Algorithms (6th ed.). Springer Publishing Company, Incorporated

  35. [35]

    Praveen Kumar, Yang Yuan, Chris Yu, Nate Foster, Robert Kleinberg, Petr Lapukhov, Chiun Lin Lim, and Robert Soulé. 2018. Semi-Oblivious Traffic Engineering: The Road Not Taken. In 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18). USENIX Association, Renton, WA, 157–170. https://www.usenix.org/conference/nsdi18/presentation/kumar

  36. [36]

    Ximing Lian and Liao Chen. 2026. Gaussian Causal Physics-Informed Neural Networks. In Proceedings of the 2025 3rd International Conference on Mathematics and Machine Learning (ICMML ’25). Association for Computing Machinery, New York, NY, USA, 193–199. https://doi.org/10.1145/3783779.3783812

  37. [37]

    Hongqiang Harry Liu, Srikanth Kandula, Ratul Mahajan, Ming Zhang, and David Gelernter. 2014. Traffic engineering with forward fault correction. SIGCOMM Comput. Commun. Rev. 44, 4 (Aug. 2014), 527–538. https://doi.org/10.1145/2740070.2626314

  38. [38]

    Xiyuan Liu, Yang Liu, Jingyi Cheng, Ximeng Liu, and Shizhen Zhao

  39. [39]

    FauTE: Fault-tolerant Traffic Engineering in Data Center Network. In Proceedings of the 9th Asia-Pacific Workshop on Networking (APNET ’25). Association for Computing Machinery, New York, NY, USA, 214–219. https://doi.org/10.1145/3735358.3735364

  40. [40]

    Ximeng Liu, Shizhen Zhao, Yong Cui, and Xinbing Wang. 2024. FIGRET: Fine-Grained Robustness-Enhanced Traffic Engineering. In Proceedings of the ACM SIGCOMM 2024 Conference (ACM SIGCOMM ’24). Association for Computing Machinery, New York, NY, USA, 117–135. https://doi.org/10.1145/3651890.3672258

  41. [41]

    Ximeng Liu, Shizhen Zhao, and Xinbing Wang. 2025. Geminet: Learning the Duality-based Iterative Process for Lightweight Traffic Engineering in Changing Topologies. (2025). arXiv:cs.NI/2506.23640 https://arxiv.org/abs/2506.23640

  42. [42]

    Yingming Mao, Qiaozhu Zhai, Ximeng Liu, Zhen Yao, Xia Zhu, and Yuzhou Zhou. 2025. A Fast Solver-Free Algorithm for Traffic Engineering in Large-Scale Data Center Network. (Dec. 2025). https://doi.org/10.48550/arXiv.2504.04027 arXiv:cs/2504.04027

  43. [43]

    Robin Matzner, Akanksha Ahuja, Rasoul Sadeghi, Michael Doherty, Alejandra Beghelli, Seb J. Savory, and Polina Bayvel. 2025. Topology Bench: systematic graph-based benchmarking for core optical networks. Journal of Optical Communications and Networking 17, 1 (2025), 7–27. https://doi.org/10.1364/JOCN.534477

  44. [44]

    Congcong Miao, Zhizhen Zhong, Yiren Zhao, Arpit Gupta, Ying Zhang, Sirui Li, Zekun He, Xianneng Zou, and Jilong Wang. 2025. PreTE: Traffic Engineering with Predictive Failures. In Proceedings of the ACM SIGCOMM 2025 Conference (SIGCOMM ’25). Association for Computing...

  45. [45]

    Yarin Perry, Felipe Vieira Frujeri, Chaim Hoch, Srikanth Kandula, Ishai Menache, Michael Schapira, and Aviv Tamar. 2023. DOTE: Rethinking (Predictive) WAN Traffic Engineering. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23). USENIX Association, Boston, MA, 1557–1581. https://www.usenix.org/conference/nsdi23/presentation/perry

  46. [46]

    M. Raissi, P. Perdikaris, and G.E. Karniadakis. 2019. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378 (2019), 686–707. https://doi.org/10.1016/j.jcp.2018.10.045

  47. [47]

    R. Tyrrell Rockafellar and Stanislav Uryasev. 2000. Optimization of conditional value-at-risk. Journal of Risk 3 (2000), 21–41. https://api.semanticscholar.org/CorpusID:854622

  48. [48]

    Matthew Roughan, Albert Greenberg, Charles Kalmanek, Michael Rumsewicz, Jennifer Yates, and Yin Zhang. 2002. Experience in measuring backbone traffic variability: models, metrics, measurements and meaning. In Proceedings of the 2nd ACM SIGCOMM Workshop on Internet Measurement (IMW ’02). Association for Computing Machinery, New York, NY, USA, 91–92. https:...

  49. [49]

    Ke Tang and Xin Yao. 2024. Learn to Optimize—a Brief Overview. National Science Review 11, 8 (April 2024), nwae132. https://doi.org/10.1093/nsr/nwae132

  50. [50]

    Oriol Vinyals, Meire Fortunato, and Navdeep Jaitly. 2015. Pointer networks. In Proceedings of the 29th International Conference on Neural Information Processing Systems - Volume 2 (NIPS’15). MIT Press, Cambridge, MA, USA, 2692–2700

  51. [51]

    Hao Wang, Haiyong Xie, Lili Qiu, Yang Richard Yang, Yin Zhang, and Albert Greenberg. 2006. COPE: traffic engineering in dynamic networks. SIGCOMM Comput. Commun. Rev. 36, 4 (Aug. 2006), 99–110. https://doi.org/10.1145/1151659.1159926

  52. [52]

    Zhiying Xu, Francis Y. Yan, Rachee Singh, Justin T. Chiu, Alexander M. Rush, and Minlan Yu. 2023. Teal: Learning-Accelerated Optimization of WAN Traffic Engineering. In Proceedings of the ACM SIGCOMM 2023 Conference (ACM SIGCOMM ’23). Association for Computing Machinery, New York, NY, USA, 378–393. https://doi.org/10.1145/3603269.3604857

  53. [53]

    Yan Yang, Jian Sun, Huibin Li, and Zongben Xu. 2016. Deep ADMM-Net for Compressive Sensing MRI. In Advances in Neural Information Processing Systems, D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett (Eds.), Vol. 29. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2016/file/1679091c5a880faf6fb5e6087eb1b2dc-Paper.pdf

  54. [54]

    Zhizhen Zhong, Manya Ghobadi, Alaa Khaddaj, Jonathan Leach, Yiting Xia, and Ying Zhang. 2021. ARROW: restoration-aware traffic engineering. In Proceedings of the 2021 ACM SIGCOMM 2021 Conference (SIGCOMM ’21). Association for Computing Machinery, New York, NY, USA, 560–579. https://doi.org/10.1145/3452296.3472921