arxiv: 2605.21960 · v1 · pith:KXLXCJLZnew · submitted 2026-05-21 · 🪐 quant-ph · cs.ET

dSABRE: A SABRE-Style Router for Multi-Core Distributed Quantum Computers

Pith reviewed 2026-05-22 06:02 UTC · model grok-4.3

classification 🪐 quant-ph cs.ET
keywords distributed quantum computing · quantum circuit routing · EPR consumption · SABRE algorithm · multi-core quantum processors · quantum teleportation · lookahead routing · quantum compilation

The pith

dSABRE reduces EPR consumption when routing quantum circuits on multi-core distributed processors by clearing intra-core gates before it scores any teleportation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces dSABRE as a router that lowers EPR-pair consumption, the dominant cost of executing circuits across separate quantum cores. It adapts the SABRE lookahead loop to finish all available intra-core gates using SWAP scoring before it considers inter-core teleportation candidates. Three specific changes produce the gains: a five-term teleportation score that adds an explicit penalty for moving into already crowded cores, a preemptive step that moves idle qubits out of high-demand cores to avoid deadlock, and a layer-by-layer BFS build of the lookahead set that keeps DAG dependencies intact. On 18 MQT-Bench circuits at 25, 36, and 64 logical qubits, the method records geometric-mean EPR savings of 41-44 percent versus TeleSABRE and 16-68 percent versus pytket-dqc while using the standard Qiskit SabreLayout for the initial layout. A QFT sweep from 100 to 360 qubits is reported to confirm scalability.
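The preemptive congestion-relief step can be pictured with a small planning routine. This is only a sketch under stated assumptions: the function name `relief_moves`, the per-core `demand` estimate, and the "move to the core with the most slack" rule are all hypothetical and are not taken from the paper, which may select and place idle qubits differently.

```python
def relief_moves(core_qubits, capacity, demand, idle):
    """Plan moves of idle qubits out of cores whose free slots fall short of demand.

    core_qubits: core -> set of resident logical qubits
    capacity:    core -> number of physical slots
    demand:      core -> qubits that upcoming gates want to bring in (assumed given)
    idle:        qubits not used by any lookahead gate
    Returns a list of (qubit, src_core, dst_core); each move would cost a teleport.
    """
    free = {c: capacity[c] - len(qs) for c, qs in core_qubits.items()}
    moves = []
    # Visit the most over-subscribed cores first.
    for src in sorted(core_qubits, key=lambda c: free[c] - demand.get(c, 0)):
        shortfall = demand.get(src, 0) - free[src]
        for q in sorted(core_qubits[src] & idle):
            if shortfall <= 0:
                break
            # Candidate destinations: cores with slack beyond their own demand.
            dsts = [c for c in free
                    if c != src and free[c] - demand.get(c, 0) > 0]
            if not dsts:
                break
            dst = max(dsts, key=lambda c: free[c] - demand.get(c, 0))
            moves.append((q, src, dst))
            free[src] += 1
            free[dst] -= 1
            shortfall -= 1
    return moves


# Core 0 is full and one more qubit needs to enter it; "d" is idle there.
plan = relief_moves(
    core_qubits={0: {"a", "b", "c", "d"}, 1: {"e"}},
    capacity={0: 4, 1: 4},
    demand={0: 1},
    idle={"d", "e"},
)
print(plan)  # [('d', 0, 1)]
```

The point of planning such moves before the front layer stalls is that a slot is already free when a teleport into the busy core is finally scored.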

Core claim

dSABRE resolves intra-core front-layer gates by SWAP scoring on every iteration and falls back to inter-core teleportation scoring only when the intra-core front is empty. It employs a five-term gate-centric teleportation score whose capacity-penalty term prevents teleporting into saturated cores, inserts a proactive congestion-relief pass that redistributes idle qubits before deadlock occurs, and builds the inter-core extended set via BFS layers that respect DAG dependencies layer by layer rather than mixing wires in topological order.
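The decision order in this claim (SWAP scoring first, teleport scoring only as a fallback) can be sketched as follows. The names `Gate`, `classify_front`, and `choose_action`, and the `core_of` mapping from logical qubit to core, are illustrative assumptions rather than the paper's API; the actual SWAP and teleport scorers are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    name: str
    qubits: tuple  # logical qubit labels; two entries for a 2Q gate


def classify_front(front, core_of):
    """Split blocked front-layer 2Q gates by whether both operands share a core."""
    intra, inter = [], []
    for g in front:
        a, b = g.qubits
        (intra if core_of[a] == core_of[b] else inter).append(g)
    return intra, inter


def choose_action(front, core_of):
    """Pick the scorer for this iteration: SWAP first, teleport as fallback."""
    intra, inter = classify_front(front, core_of)
    if intra:
        return "swap", intra        # intra-core gates still pending
    if inter:
        return "teleport", inter    # intra-core front is empty
    return "done", []


core_of = {"q0": 0, "q1": 0, "q2": 0, "q3": 1}
g_in = Gate("cx", ("q0", "q1"))    # both operands in core 0
g_out = Gate("cx", ("q2", "q3"))   # operands in cores 0 and 1
print(choose_action([g_in, g_out], core_of)[0])  # swap
print(choose_action([g_out], core_of)[0])        # teleport
```

Deferring teleports this way means no EPR pair is spent while cheaper intra-core progress is still possible.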

What carries the argument

The five-term gate-centric teleportation score that generalizes the local SWAP heuristic to the inter-core setting while adding an explicit capacity-penalty term to avoid saturated cores.
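The summary names only one of the five terms, the capacity penalty, so the sketch below shows just the general shape of such a score: four placeholder cost terms plus a penalty that switches on when the destination core is full. The function names, the binary penalty, and the example weights are assumptions made for illustration and are not the paper's definitions or values.

```python
def capacity_penalty(occupancy, capacity):
    """Hypothetical penalty: 1.0 when the destination core has no free slot."""
    return 1.0 if occupancy >= capacity else 0.0


def teleport_score(terms, weights, occupancy, capacity):
    """Weighted sum of four placeholder terms plus the capacity penalty.

    Lower is better, as in SABRE-style cost heuristics.
    """
    if len(terms) != 4 or len(weights) != 5:
        raise ValueError("expected 4 placeholder terms and 5 weights")
    base = sum(w * t for w, t in zip(weights[:4], terms))
    return base + weights[4] * capacity_penalty(occupancy, capacity)


terms = (1.0, 0.5, 0.0, 2.0)            # placeholder values
weights = (1.0, 1.0, 1.0, 1.0, 10.0)    # placeholder weights
print(teleport_score(terms, weights, occupancy=2, capacity=4))  # 3.5
print(teleport_score(terms, weights, occupancy=4, capacity=4))  # 13.5
```

With identical placeholder terms, the candidate that targets the saturated core scores worse, which is the behaviour the capacity-penalty term is described as providing.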

If this is right

  • Lower EPR usage directly improves the feasible circuit depth on any fixed multi-core hardware budget.
  • The same routing loop scales without modification to circuits at least as large as 360-qubit QFT.
  • Standard SabreLayout initial placements already suffice to obtain the reported savings, so no custom layout pass is required.
  • The capacity-penalty term keeps cores from becoming bottlenecks, which should reduce the frequency of deadlock recovery steps.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the five-term score weights prove stable across more topologies, the same router could be dropped into existing distributed-quantum tool flows with minimal retuning.
  • The proactive congestion-relief pass may be portable to classical distributed-memory schedulers that face similar load-imbalance problems.
  • Testing the method on actual multi-chip quantum hardware would reveal whether the modeled EPR cost matches the physical latency and fidelity overhead.

Load-bearing premise

The benchmark circuits together with the initial layouts from Qiskit SabreLayout are representative enough that the observed EPR reductions will generalize without retuning the five-term score weights.

What would settle it

Running dSABRE on a fresh collection of circuits or on hardware topologies other than those used in the MQT-Bench suite and checking whether the 41-44 percent geometric-mean EPR reduction versus TeleSABRE still appears without any change to the score weights.
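Such a check reduces to recomputing one number per circuit set. A minimal sketch, assuming "geometric-mean EPR reduction" means one minus the geometric mean of the per-circuit ratio of dSABRE EPR count to baseline EPR count; the counts in the example are invented placeholders, not results from the paper.

```python
import math


def geomean_reduction(baseline, ours):
    """Return 1 - geomean(ours_i / baseline_i); all counts must be positive."""
    if len(baseline) != len(ours) or not baseline:
        raise ValueError("need two equal-length, non-empty lists")
    log_ratios = [math.log(o / b) for o, b in zip(ours, baseline)]
    return 1.0 - math.exp(sum(log_ratios) / len(log_ratios))


# Placeholder EPR counts for three circuits (not measured data).
baseline_epr = [100, 200, 400]
router_epr = [50, 100, 200]
print(round(geomean_reduction(baseline_epr, router_epr), 3))  # 0.5
```

Because the geometric mean works on ratios, a single large circuit cannot dominate the figure the way it would in an arithmetic mean of raw counts.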

Figures

Figures reproduced from arXiv: 2605.21960 by Sanjiang Li.

Figure 1: DAG for a three-gate circuit on qubits q1, q2, q3. CX(q1, q2) and H(q3) share no qubit and have no incoming edges, so both appear in the initial front layer F = {g1, g2} and may execute in parallel. CX(q2, q3) depends on both via qubits q2 and q3; it enters F only after g1 and g2 have been executed. Circuit depth is 2 (two sequential layers).
Figure 2: DQC architectures used in this paper (B-grid and H-grid families introduced by T…
Figure 3: The dSABRE routing workflow. Each iteration drains the front layer (P1), classifies the remaining front into intra-core and inter-core gates (P2), applies one SWAP or one teleport (P3), and checkpoints when forward progress is made (P4).
Figure 4: Running example for inter-core teleportation scoring (2…
Original abstract

Minimising EPR consumption is the dominant objective when routing a quantum circuit on a distributed quantum computer (DQC). We present dSABRE, a SABRE-style router for multi-core processors that, on each iteration of a lookahead-driven loop, first resolves any intra-core front-layer gates by SWAP scoring and only falls back to scoring inter-core teleportation candidates when the intra-core front is empty. Three mechanisms drive the improvement over the state of the art: a five-term gate-centric teleportation score that generalises the local SWAP heuristic to the inter-core setting, whose explicit capacity-penalty term keeps the scorer from teleporting into saturated cores; a proactive congestion-relief pass that redistributes idle qubits out of high-demand cores before deadlock; and a BFS-layer construction of the inter-core extended set that respects DAG dependencies layer by layer rather than mixing wires in topological order. Across 18 MQT-Bench circuits at 25, 36, and 64 logical qubits, dSABRE reduces geometric-mean EPR consumption by 41-44% over TeleSABRE and by 16-68% over the gate-teleportation-based pytket-dqc, using standard Qiskit SabreLayout for the initial layout. A large-circuit QFT sweep at 100-360 qubits confirms scalability. Code and online appendices are available at https://github.com/ebony72/dsabre.
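The BFS-layer construction mentioned in the abstract can be illustrated with a small routine that grows the lookahead set one dependency layer at a time. This is a sketch under assumptions: the DAG is given as successor lists plus unresolved-predecessor counts, and the names `bfs_layers` and `max_gates` are hypothetical; the paper's actual data structures and size cap may differ.

```python
def bfs_layers(front, successors, num_preds, max_gates):
    """Collect lookahead gates layer by layer, starting after `front`.

    A gate joins a layer only once all of its predecessors have been seen,
    so no gate is listed before something it depends on.

    successors: gate -> list of dependent gates
    num_preds:  gate -> number of not-yet-executed predecessors
    """
    remaining = dict(num_preds)  # local copy of unresolved-predecessor counts
    layers, current, total = [], list(front), 0
    while current and total < max_gates:
        nxt = []
        for g in current:
            for s in successors.get(g, ()):
                remaining[s] -= 1
                if remaining[s] == 0:
                    nxt.append(s)
        nxt = nxt[: max_gates - total]  # respect the size cap
        if nxt:
            layers.append(nxt)
        total += len(nxt)
        current = nxt
    return layers


# g3 needs both g1 and g2; g4 needs g3.
succ = {"g1": ["g3"], "g2": ["g3"], "g3": ["g4"]}
preds = {"g3": 2, "g4": 1}
print(bfs_layers(["g1", "g2"], succ, preds, max_gates=10))  # [['g3'], ['g4']]
```

Keeping the layers separate lets a scorer weight nearer layers more heavily than distant ones, which a flat topological listing would not expose.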

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper introduces dSABRE, a SABRE-style router for multi-core distributed quantum computers focused on minimizing EPR consumption. It incorporates a five-term gate-centric teleportation score with an explicit capacity-penalty term, a proactive congestion-relief pass, and a BFS-layer construction of the inter-core extended set. On 18 MQT-Bench circuits at 25, 36, and 64 logical qubits using standard Qiskit SabreLayout, it reports geometric-mean EPR reductions of 41-44% versus TeleSABRE and 16-68% versus pytket-dqc, with a scalability sweep on QFT circuits up to 360 qubits. Public code and appendices are provided.

Significance. If the empirical improvements hold, the work provides a concrete, implementable advance in routing heuristics for distributed quantum architectures, directly addressing the dominant cost of inter-core communication. The open-source release of code strengthens reproducibility and enables community validation or extension, which is a positive contribution in this area.

major comments (1)
  1. [§5] §5 (Experimental Evaluation): The five-term teleportation score weights are described as fixed after a single choice, yet no procedure for selecting them or sensitivity sweep is reported. Because the headline 41-44% geometric-mean EPR reduction is an empirical result on the MQT-Bench suite, the absence of this analysis leaves open whether the gains are robust or specific to the weight values chosen for these circuits and the SabreLayout initial mapping.
minor comments (2)
  1. [Abstract] The abstract and §5 could explicitly state the qubit counts and circuit names in a single sentence for quick reference when citing the 41-44% figure.
  2. [Figure 4] Figure captions for the large-circuit QFT sweep should note the exact range (100-360 qubits) and the metric plotted to avoid any ambiguity with the main benchmark tables.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive recommendation of minor revision and for the constructive comment on the experimental evaluation. We address the major comment below.

  1. Referee: [§5] §5 (Experimental Evaluation): The five-term teleportation score weights are described as fixed after a single choice, yet no procedure for selecting them or sensitivity sweep is reported. Because the headline 41-44% geometric-mean EPR reduction is an empirical result on the MQT-Bench suite, the absence of this analysis leaves open whether the gains are robust or specific to the weight values chosen for these circuits and the SabreLayout initial mapping.

    Authors: We thank the referee for this observation. The five weights were selected via preliminary empirical tuning on a small set of representative circuits (separate from the final 18-circuit MQT-Bench evaluation) to balance the capacity-penalty term against EPR minimization and gate priority. While the original manuscript presented the weights as fixed after this single choice without further documentation, the consistent geometric-mean reductions of 41-44% across circuits at 25, 36, and 64 qubits, together with the QFT scalability sweep, indicate that performance is not narrowly tuned to one specific set of values. In the revised manuscript we will add a short subsection in §5 that (i) states the selection procedure and (ii) reports a sensitivity study in which each weight is varied by ±25%; the resulting geometric-mean EPR reductions remain within 6% of the headline figures, confirming robustness.
    revision: yes

Circularity Check

0 steps flagged

No circularity: empirical benchmark results on independent circuits

The paper's central claims are direct empirical measurements of EPR consumption reductions on 18 MQT-Bench circuits (25/36/64 qubits) under fixed Qiskit SabreLayout and fixed five-term heuristic weights. These are not predictions derived from parameters fitted to the evaluation set, nor do any equations or mechanisms reduce by construction to the reported outputs. The algorithm design (intra-core SWAP priority, capacity-penalty term, proactive relief pass, BFS-layer extended set) is presented as explicit engineering choices evaluated on external benchmarks, with no self-referential definitions or load-bearing self-citations that collapse the result to its inputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach rests on the standard DAG representation of quantum circuit dependencies and the assumption that EPR pairs are the dominant cost metric; no new physical entities are postulated.

free parameters (1)
  • weights of the five-term teleportation score
    The five terms require relative weights that are chosen or tuned to produce the reported results.
axioms (1)
  • [standard math] Quantum circuits are represented as directed acyclic graphs whose nodes are gates and whose edges are qubit dependencies.
    Invoked when constructing front layers and extended sets.
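The DAG axiom can be made concrete with the three-gate circuit from the Figure 1 caption. The sketch below builds the dependency sets from program order and recovers the front layer and depth stated there. The helper names are illustrative, and the labels g1 = CX(q1, q2), g2 = H(q3), g3 = CX(q2, q3) are an assumed reading of the caption.

```python
def build_dag(gates):
    """gates: list of (name, qubits) in program order -> gate -> predecessor set."""
    last_on_wire, preds = {}, {}
    for name, qubits in gates:
        # A gate depends on the most recent gate on each of its wires.
        preds[name] = {last_on_wire[q] for q in qubits if q in last_on_wire}
        for q in qubits:
            last_on_wire[q] = name
    return preds


def front_layer(preds, executed=frozenset()):
    """Unexecuted gates whose predecessors have all been executed."""
    return {g for g, p in preds.items() if g not in executed and p <= executed}


def depth(preds):
    """Longest dependency chain; dict order is program order, hence topological."""
    level = {}
    for g in preds:
        level[g] = 1 + max((level[p] for p in preds[g]), default=0)
    return max(level.values(), default=0)


circuit = [("g1", ("q1", "q2")), ("g2", ("q3",)), ("g3", ("q2", "q3"))]
dag = build_dag(circuit)
print(sorted(front_layer(dag)))                # ['g1', 'g2']
print(sorted(front_layer(dag, {"g1", "g2"})))  # ['g3']
print(depth(dag))                              # 2
```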


