arxiv: 2604.20013 · v1 · submitted 2026-04-21 · 🪐 quant-ph


Assessing System Capabilities and Bottlenecks of an Early Fault-Tolerant Bicycle Architecture

Kun Liu , Ben Foxman , Gian-Luca R. Anselmetti , Yongshan Ding


Pith reviewed 2026-05-10 01:55 UTC · model grok-4.3

classification 🪐 quant-ph

keywords modular quantum computing · fault-tolerant quantum computing · bivariate bicycle codes · magic state factory · compiler optimization · non-Clifford gates · circuit failure probability · quantum compilation


The pith

Synthesizing non-Clifford rotations at the magic state factory lowers estimated circuit failure probability by a factor of nine in modular fault-tolerant systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates bottlenecks in early modular fault-tolerant quantum computers that use bivariate bicycle codes. It finds that inter-module communication required by non-Clifford operations is the primary constraint limiting performance. The authors introduce a compilation pipeline with three optimizations, the most impactful being synthesis of arbitrary-angle rotations directly at the factory. This change reduces average estimated failure probability by a factor of 9.0 across non-Clifford benchmarks drawn from over forty categories. The gains remain stable when instruction costs, logical processing unit counts, and factory counts are varied.

Core claim

In modular architectures based on bivariate bicycle codes, inter-module communication induced by non-Clifford operations forms the dominant bottleneck. Synthesizing arbitrary-angle rotations at the magic state factory, combined with transvection-based Clifford deferral and targeted Clifford insertion, mitigates this bottleneck and yields a ninefold average reduction in estimated circuit failure probability for non-Clifford benchmarks while also shortening compile time and circuit duration.

What carries the argument

The syn@fac optimization, which performs synthesis of arbitrary-angle rotations at the magic state factory to avoid costly inter-module communication for non-Clifford operations.
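To make the intuition concrete, here is a toy count of inter-module deliveries, not the paper's cost model. It assumes, purely for illustration, that synthesizing at the LPU requires one factory-to-LPU delivery per T gate in the synthesized sequence, while syn@fac requires one delivery per rotation. The function name, its parameters, and the example numbers are all hypothetical.

```python
def inter_module_deliveries(num_rotations: int,
                            t_gates_per_rotation: int,
                            synth_at_factory: bool) -> int:
    """Toy count of factory-to-LPU deliveries for a circuit.

    Assumption (not taken from the paper): with synthesis at the LPU,
    every T gate of every synthesized rotation needs its own delivery;
    with synthesis at the factory, each rotation needs a single delivery.
    """
    if num_rotations < 0 or t_gates_per_rotation < 1:
        raise ValueError("counts must be non-negative / at least one T gate")
    per_rotation = 1 if synth_at_factory else t_gates_per_rotation
    return num_rotations * per_rotation


# Hypothetical workload: 100 rotations, each synthesized into 40 T gates.
baseline = inter_module_deliveries(100, 40, synth_at_factory=False)
at_factory = inter_module_deliveries(100, 40, synth_at_factory=True)
print(baseline, at_factory)
```

Under these assumptions the delivery count scales with the number of rotations rather than with the total T count. How that translates into failure probability depends on the instruction costs the paper assigns, which this sketch does not model.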

If this is right

Transvection-based Clifford deferral reduces Clifford deferral compile time by 77 percent.
Clifford insertion shortens end-to-end circuit duration by 11.5 percent on average for MQTBench circuits.
The ninefold failure-probability reduction and other gains hold across sweeps of instruction cost ratios, LPU counts, and factory counts.
The evaluation extends prior work to more than forty benchmark categories drawn from quantum algorithms and Hamiltonian simulations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Hardware designers may need to prioritize faster or larger magic state factories over further reductions in inter-module latency.
The same synthesis approach could be tested on other modular code families to check whether the factor-of-nine gain generalizes beyond bivariate bicycle codes.
Combining syn@fac with dynamic factory allocation policies might yield additional reductions in critical-path duration for very large circuits.

Load-bearing premise

The chosen model of instruction costs and inter-module communication latencies, together with bivariate bicycle codes as the error-correcting substrate, captures the dominant constraints of real early modular fault-tolerant hardware.

What would settle it

Implementing the syn@fac pipeline on physical hardware or a detailed simulator with the paper's assumed cost model and measuring whether the average failure-probability reduction across the same non-Clifford benchmarks stays near a factor of nine or falls well below it.

Figures

Figures reproduced from arXiv: 2604.20013 by Ben Foxman, Gian-Luca R. Anselmetti, Kun Liu, Yongshan Ding.

**Figure 2.** Compilation pipeline for the studied systems. We adopt the BB instruction set and the PBC-to-BB transformation ...

**Figure 3.** Lowering single-module and multi-module PBC operations to BB instructions. Two components highlighted in the ...

**Figure 4.** (a) Synthesize arbitrary-angle rotation by gate ...

**Figure 5.** Synthesize an arbitrary-angle rotation ...

**Figure 6.** A segment is a consecutive sequence of in-module ...

**Figure 7.** Visualization of the adopted compilation optimizations. (a-b) A PBC circuit, in form of a DAG, can be viewed as ...

**Figure 8.** Overall performance comparison between baseline ...

**Figure 9.** Ablation: syn@fac vs the baseline (syn@LPU) for the reference system. Breakdown of failure probability and ...

**Figure 10.** Cost-ratio sensitivity of syn@LPU versus syn@fac ...

**Figure 12.** Factory count sensitivity around the ref ...

**Figure 13.** Ablation: only Clifford insertion is applied vs the ...

Original abstract

Early modular fault tolerant quantum computers remain constrained by costly inter-module communication and limited magic state factory service. Understanding such bottlenecks and investigating compiler optimizations most close the gap between algorithm requirements and hardware capabilities is a concrete and practically urgent systems problem. We study the modular architectures based on Bivariate Bicycle codes and identify the dominant bottleneck: inter-module communication induced by non-Clifford operations. We build a compilation pipeline to fill the missing parts of prior works and propose compiler optimizations: synthesizing arbitrary-angle rotations at the factory (syn@fac), transvection based Clifford deferral, and Clifford insertion for critical path duration reduction. We extend the evaluation scope of the prior work to 40+ benchmark categories drawn from PennyLane and MQTBench, including quantum algorithms and Hamiltonian simulations with varying sizes. Under the present instruction cost, syn@fac reduces estimated circuit failure probability by a factor of 9.0 on average across non-Clifford benchmarks. The robustness persists across sweeps of instruction cost ratios, LPU count, and factory count. Besides, transvection reduces Clifford deferral compile time by 77.04%, while Clifford insertion reduces end-to-end circuit duration by 11.54% on average on MQTBench, with smaller gains on Hamiltonian simulations. We hope this work inspires the studies on compiler optimizations for early modular FTQC systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The 9× failure-probability cut from syn@fac holds only within their instruction-cost model, but the completed pipeline and 40+ benchmark results still give a practical look at modular FTQC bottlenecks.


The main thing to know is that under their chosen instruction costs, moving arbitrary-angle rotation synthesis to the magic-state factory reduces estimated circuit failure probability by a factor of nine on average across non-Clifford benchmarks, and this stays consistent when they sweep cost ratios, LPU counts, and factory counts. The other two passes also help: transvection-based Clifford deferral cuts Clifford deferral compile time by 77%, and critical-path Clifford insertion cuts end-to-end duration by 11.5% on average on MQTBench workloads, with smaller gains on Hamiltonian simulations.

Referee Report

2 major / 2 minor

Summary. The paper assesses bottlenecks in early modular fault-tolerant quantum computing architectures based on bivariate bicycle codes, identifying inter-module communication from non-Clifford operations as the dominant constraint. It introduces a compilation pipeline and three optimizations—synthesizing arbitrary-angle rotations at the factory (syn@fac), transvection-based Clifford deferral, and Clifford insertion for critical-path reduction—then evaluates them on 40+ benchmarks from PennyLane and MQTBench. Under the authors' present instruction-cost model, syn@fac yields an average 9× reduction in estimated circuit failure probability across non-Clifford benchmarks, with the benefit persisting across sweeps of instruction-cost ratios, LPU counts, and factory counts; the other optimizations cut compile time by 77% and circuit duration by 11.5% on average.

Significance. If the modeling assumptions hold, the work supplies concrete, actionable compiler techniques for mitigating communication bottlenecks in early modular FTQC and demonstrates their impact across a broad benchmark suite with parameter sweeps. The extension beyond prior limited evaluations and the explicit robustness checks are strengths that could guide hardware-aware compilation research.

major comments (2)

[Abstract and evaluation sections describing the instruction-cost model and failure-probability estimation] The headline quantitative claim (factor-of-9 failure-probability reduction under the present instruction cost) is load-bearing for the paper's central thesis yet rests on an instruction-cost and latency model whose parameters are chosen internally and swept rather than derived from first-principles hardware measurements or external calibration. Without an explicit derivation or sensitivity analysis tied to physical device data, the reported benefit remains conditional on modeling assumptions that may not match real early-modular hardware (e.g., different communication overheads or additional error sources).
[Abstract and the sections presenting the quantitative results and failure-probability model] The failure-probability model itself is stated without error bars, explicit derivation steps, or discussion of benchmark-selection criteria and possible post-hoc exclusions. This makes it difficult to assess the statistical reliability of the 9× average and the cross-benchmark robustness claims.

minor comments (2)

[Abstract] The abstract contains a minor grammatical issue: 'investigating compiler optimizations most close the gap' should read 'investigating which compiler optimizations most effectively close the gap' or similar for clarity.
[Abstract and methods] Notation for the 'present instruction cost' operating point and the exact definition of the failure-probability estimator should be introduced earlier and used consistently to aid readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on the modeling assumptions and evaluation details in our manuscript. We address each major comment below and have revised the paper to improve clarity and transparency where possible.


Referee: The headline quantitative claim (factor-of-9 failure-probability reduction under the present instruction cost) is load-bearing for the paper's central thesis yet rests on an instruction-cost and latency model whose parameters are chosen internally and swept rather than derived from first-principles hardware measurements or external calibration. Without an explicit derivation or sensitivity analysis tied to physical device data, the reported benefit remains conditional on modeling assumptions that may not match real early-modular hardware (e.g., different communication overheads or additional error sources).

Authors: We agree that the instruction-cost model uses parameterized values chosen to represent plausible overheads in modular bivariate bicycle code architectures, rather than direct first-principles measurements from physical hardware. Such early fault-tolerant modular systems are not yet experimentally available, so the model is necessarily based on literature-derived estimates for inter-module communication and factory latencies. The manuscript already includes extensive sensitivity sweeps over instruction-cost ratios, LPU counts, and factory counts (Sections 5.3–5.5), which demonstrate that the syn@fac benefit persists across wide ranges. In the revision we have added a new subsection (3.2) that explicitly derives the baseline parameter choices from prior modular FTQC proposals, discusses potential deviations due to unmodeled error sources, and states the conditional nature of the quantitative claims more prominently in the abstract and conclusion.

revision: partial

Referee: The failure-probability model itself is stated without error bars, explicit derivation steps, or discussion of benchmark-selection criteria and possible post-hoc exclusions. This makes it difficult to assess the statistical reliability of the 9× average and the cross-benchmark robustness claims.

Authors: The failure-probability model is a standard multiplicative approximation (detailed in Section 4.1) in which circuit failure probability is estimated from the number and cost of non-Clifford operations; we have now moved the full step-by-step derivation to a new Appendix B. Because the estimator is deterministic given the input parameters, statistical error bars are not applicable; instead, we report the full per-benchmark distribution of improvements (Figure 5 and supplementary tables) so readers can evaluate variability directly. Benchmark selection criteria are stated in Section 5.1: all circuits from PennyLane and MQTBench compilable within our resource limits were included, with no post-hoc exclusions. We have added an explicit paragraph in Section 5 confirming these criteria and the absence of exclusions.

revision: yes
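
A multiplicative approximation of this kind can be sketched as follows. This is a generic illustration, not the estimator from the paper: it assumes each operation of kind k fails independently with probability p_k and occurs n_k times, so that P_fail = 1 - prod_k (1 - p_k)^(n_k). The operation names and error rates in the example are hypothetical placeholders.

```python
import math


def estimated_failure_probability(op_counts: dict[str, int],
                                  op_error_rates: dict[str, float]) -> float:
    """Return 1 - prod_k (1 - p_k) ** n_k, assuming independent failures.

    op_counts maps an operation kind to how often it occurs (n_k);
    op_error_rates maps the same kind to its per-operation failure
    probability (p_k). Both dictionaries are illustrative inputs.
    """
    log_success = 0.0
    for kind, count in op_counts.items():
        p = op_error_rates[kind]
        if not 0.0 <= p < 1.0:
            raise ValueError(f"error rate for {kind!r} must be in [0, 1)")
        # Accumulate log(1 - p) with log1p for accuracy at tiny p.
        log_success += count * math.log1p(-p)
    # 1 - exp(log_success), computed stably via expm1.
    return -math.expm1(log_success)


# Hypothetical circuit: many cheap in-module ops, fewer costly inter-module ops.
counts = {"in_module": 10_000, "inter_module": 500}
rates = {"in_module": 1e-7, "inter_module": 1e-5}
print(estimated_failure_probability(counts, rates))
```

Because the estimate is a deterministic function of the counts and rates, any change in the reported reduction factor under a sweep comes entirely from changing those inputs, which is consistent with the authors' point about error bars.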

Circularity Check

0 steps flagged

No significant circularity; results are computed outputs from an explicit parameterized model.


The paper reports simulation-based estimates of circuit failure probability under a stated instruction-cost and latency model for bivariate bicycle codes, with the factor-of-9 reduction obtained by applying the syn@fac optimization inside that model. The central claims are not forced by construction from the inputs, nor do they rely on self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations whose validity is presupposed. Parameter sweeps are performed over the same modeling assumptions, but this does not create circularity; the work is an assessment study whose outputs are computed from the fixed model rather than assumed in its inputs. No quoted derivation step reduces to its own premises.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claims rest on a domain-specific cost model for inter-module communication and on the choice of bivariate bicycle codes as the error-correcting substrate; no new physical entities are postulated.

free parameters (2)

instruction cost ratio
Used to obtain the factor-of-9 result; the paper states robustness under sweeps but the headline number is reported at one operating point.
LPU count and factory count
Architectural parameters that are swept but whose specific values affect the reported circuit durations and failure probabilities.

axioms (2)

domain assumption: Bivariate bicycle codes are the error-correcting substrate for the modular architecture under study.
Invoked as the basis for identifying inter-module communication as the dominant bottleneck.
domain assumption: Non-Clifford operations dominate inter-module communication cost.
Stated as the identified bottleneck that the compiler optimizations target.



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

INJEQT: Improved Magic-State Injection Protocol for Fault-Tolerant Quantum Extractor Architectures
quant-ph · 2026-04 · unverdicted · novelty 6.0

INJEQT reduces synthillation error by up to 22x, wall-clock time by 13x, and space-time cost by 7.2x in extractor FTQC architectures via auxiliary Rz synthesis and pre-fetching.
