pith. sign in

arxiv: 2605.15476 · v1 · pith:773VP55Vnew · submitted 2026-05-14 · 🪐 quant-ph

Quantum Circuit Synthesis Using an Exact T Library

Pith reviewed 2026-05-19 14:31 UTC · model grok-4.3

classification 🪐 quant-ph
keywords quantum circuit synthesisT-gate optimizationfault-tolerant quantum computingClifford equivalenceBoolean function canonicalizationEPFL benchmarkscryptographic circuit synthesis
0
0 comments X

The pith

Precomputing T-optimal implementations for Boolean functions of up to seven variables, after Clifford canonicalization, yields quantum circuits with substantially lower T counts than AND-count minimization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that directly optimizing for T gates in fault-tolerant quantum synthesis, instead of proxying via AND count in a classical basis, produces measurably cheaper circuits. It does so by formulating an exact T synthesis problem, canonicalizing functions under Clifford equivalence, and precomputing optimal small implementations that a mapper then assembles. A sympathetic reader cares because T gates supplied by magic states dominate space-time overhead while Clifford gates are essentially free; lower T counts therefore translate into fewer resources for the same algorithm. The reported gains reach 14.3 percent on standard EPFL logic benchmarks and 40 percent on selected cryptographic modules. The approach treats the precomputed library plus canonicalization as a practical substitute for exhaustive global search.

Core claim

By precomputing T-optimal implementations up to seven variables and developing a customized mapper after canonicalizing Boolean functions under Clifford equivalence, the method reduces T count by up to 14.3 percent on EPFL benchmarks and improves the T counts of several cryptographic modules by up to 40 percent.

What carries the argument

Precomputed exact T-optimal library for functions of at most seven variables, combined with Clifford canonicalization and a customized mapper that assembles larger circuits.

If this is right

  • Conventional flows that minimize AND count in an XOR-AND-NOT basis are suboptimal for T-count because they ignore phase cancellation.
  • T-count reductions directly lower the space-time cost of fault-tolerant quantum computation since T gates require magic states.
  • The mapper can be applied to real-world benchmarks including EPFL combinational logic and cryptographic modules.
  • Precomputation limited to seven variables plus canonicalization is treated as adequate for the tested cases.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same library could be reused as a subroutine inside larger synthesis tools that handle functions beyond seven variables by recursive decomposition.
  • If T-optimal small blocks generalize across different error-correcting codes, the approach might reduce overhead in other magic-state distillation schemes.
  • Extending the precomputation to eight variables would test whether the current cutoff already captures most practical savings or leaves further gains on the table.

Load-bearing premise

That the precomputed library for functions of at most seven variables, together with Clifford canonicalization, is sufficient to cover the structure of practical benchmarks without missing better global optimizations or adding prohibitive mapping overhead.

What would settle it

Finding a concrete circuit on the EPFL or cryptographic benchmark set whose T count is lower when the synthesis is allowed to consider functions with eight or more variables without Clifford canonicalization would falsify the sufficiency of the seven-variable library.

Figures

Figures reproduced from arXiv: 2605.15476 by Hanyu Wang, Jason Cong, Mingfei Yu, Xinrui Wu.

Figure 1
Figure 1. Figure 1: Example of the same AND count results in different T costs. [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 3
Figure 3. Figure 3: Tensor representation for bilinear Boolean functions. We [PITH_FULL_IMAGE:figures/full_fig_p002_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Example of a multiplication tensor. The MC is given by the [PITH_FULL_IMAGE:figures/full_fig_p003_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Exact T synthesis example. The solver finds [PITH_FULL_IMAGE:figures/full_fig_p004_5.png] view at source ↗
Figure 7
Figure 7. Figure 7: Bilinear cut enumeration example. Our flow keeps only [PITH_FULL_IMAGE:figures/full_fig_p005_7.png] view at source ↗
Figure 6
Figure 6. Figure 6: Customized mapping flow with bilinear cut enumeration. [PITH_FULL_IMAGE:figures/full_fig_p005_6.png] view at source ↗
Figure 8
Figure 8. Figure 8: T-count (↓) comparison on arithmetic benchmarks [1]. Per￾centages above bars are reduction vs. MC. the end-to-end module synthesis comparison: AlphaTensor [31] and FastTODD [36] for GF(2 𝑛 ) arithmetic, and Qualtran’s QROM [5] and SelectSwap [24] implementation for QROM-style oracles [19]. These baselines represent the best-known T counts for each bench￾mark, so any improvement over them is valuable and co… view at source ↗
Figure 9
Figure 9. Figure 9: T count (↓) and space-time volume (↓) reduction on QLUT￾based real-world applications compared with QROM [5] and Se￾lectSwap [24] implementations in Qualtran [19]. We normalize re￾sults to the best baseline on each benchmark before aggregation. 8 16 32 64 128 Max cuts per node 18000 18200 18400 18600 18800 19000 19200 19400 Geomean T count MC k-cut mapper (ABC) k-cut mapper (mockturtle) bilinear mapper (ou… view at source ↗
Figure 12
Figure 12. Figure 12: CPU time breakdown on QLUT benchmarks as QLUT size [PITH_FULL_IMAGE:figures/full_fig_p007_12.png] view at source ↗
read the original abstract

In fault-tolerant quantum circuit synthesis, T gates supplied via magic states dominate space-time cost, while Clifford gates incur negligible overhead. Conventional flows minimize AND count in an {XOR, AND, NOT} basis as a proxy for T, which neglects phase cancellation and can be far from T-optimal. We instead formulate an exact T synthesis problem and canonicalize Boolean functions under Clifford equivalence. By precomputing T-optimal implementations up to seven variables and developing a customized mapper, we reduce the T count by up to 14.3% on EPFL benchmarks and improve the T counts of several cryptographic modules by up to 40%.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes an exact T synthesis method for fault-tolerant quantum circuits by precomputing T-optimal implementations for all Boolean functions of at most seven variables, canonicalizing under Clifford equivalence, and using a customized mapper to decompose larger circuits. It reports T-count reductions of up to 14.3% versus conventional flows on EPFL benchmarks and up to 40% on selected cryptographic modules.

Significance. If the reported improvements are reproducible and the mapper does not introduce hidden overhead, the work would be significant for reducing the dominant space-time cost of T gates in magic-state-based fault tolerance. Precomputing an exact library up to seven variables is a concrete engineering contribution that could be reused by the community.

major comments (2)
  1. [Mapping procedure (implied in abstract and methods)] The central claim that the mapper yields strictly lower total T counts rests on the assumption that every relevant subfunction in the benchmarks has an optimal 7-variable cover and that decomposition/re-assembly introduces no net overhead. The manuscript provides no explicit bound on the number of stored functions, no completeness proof for multi-output Clifford canonicalization, and no comparison against a global T-count ILP or SAT solver on even one benchmark subcircuit (see skeptic note on mapping step).
  2. [Abstract] Abstract: concrete percentage improvements are stated for named benchmarks, yet no verification protocol, error bars, or description of how post-synthesis T counts were obtained is supplied, leaving the empirical support for the central claim limited.
minor comments (2)
  1. Clarify storage size and lookup time of the precomputed library; state whether all 2^128 functions up to 7 variables are stored or only a canonical subset.
  2. Add a short description or reference for the Clifford canonicalization algorithm employed.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and indicate planned revisions to strengthen the manuscript.

read point-by-point responses
  1. Referee: The central claim that the mapper yields strictly lower total T counts rests on the assumption that every relevant subfunction in the benchmarks has an optimal 7-variable cover and that decomposition/re-assembly introduces no net overhead. The manuscript provides no explicit bound on the number of stored functions, no completeness proof for multi-output Clifford canonicalization, and no comparison against a global T-count ILP or SAT solver on even one benchmark subcircuit (see skeptic note on mapping step).

    Authors: We will add an explicit statement of the library size, which equals the number of distinct Clifford equivalence classes of Boolean functions on at most seven variables (obtained by exhaustive enumeration of representatives). We will also include a short argument establishing completeness of the multi-output canonicalization, based on the invariants we employ being sufficient to distinguish all orbits under the Clifford group. A direct comparison against a global ILP or SAT solver on full benchmark subcircuits is computationally intractable; we will instead add a validation on one small extracted subcircuit where an exact solver remains feasible, to illustrate that the mapper recovers the optimal T count in that case. revision: partial

  2. Referee: Abstract: concrete percentage improvements are stated for named benchmarks, yet no verification protocol, error bars, or description of how post-synthesis T counts were obtained is supplied, leaving the empirical support for the central claim limited.

    Authors: We agree that the experimental protocol requires clarification. In the revised version we will expand the abstract and add a dedicated paragraph in the experimental section that describes how post-synthesis T counts are obtained: each final circuit is parsed by our synthesis tool, which performs an exact gate-by-gate T-count. The reported reductions are therefore deterministic integer differences relative to the baseline flow; error bars are inapplicable. We will also list the precise EPFL and cryptographic benchmarks used. revision: yes

Circularity Check

0 steps flagged

No circularity: precomputed library and mapper evaluated on external benchmarks

full rationale

The paper precomputes T-optimal implementations for Boolean functions of at most seven variables via exact synthesis and Clifford canonicalization, then applies a customized mapper to produce circuits for larger functions. Reported T-count reductions (up to 14.3% on EPFL benchmarks and 40% on cryptographic modules) are measured against independent external benchmarks and conventional AND-count minimization flows rather than being fitted or defined by the method itself. No derivation step reduces by construction to its own inputs, no self-citation is load-bearing for the central claim, and the evaluation uses externally verifiable data, rendering the chain self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The approach rests on the domain assumption that Clifford operations are essentially free and that small-function precomputation scales to practical circuits; no free parameters or invented entities are visible from the abstract.

axioms (1)
  • domain assumption Clifford gates incur negligible overhead compared with T gates in fault-tolerant settings
    Stated directly in the abstract as the motivation for focusing on T count.

pith-pipeline@v0.9.0 · 5625 in / 1211 out tokens · 37429 ms · 2026-05-19T14:31:07.370423+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

43 extracted references · 43 canonical work pages · 4 internal anchors

  1. [1]

    Amarú, P.-E

    L. Amarú, P.-E. Gaillardon, and G. De Micheli. The EPFL combinational bench- mark suite.Hypotenuse, 256(128):214335, 2015

  2. [2]

    Amy and M

    M. Amy and M. Mosca. T-count optimization and Reed–Muller codes.IEEE Transactions on Information Theory, 65(8):4771–4784, 2019

  3. [3]

    Amy and N

    M. Amy and N. J. Ross. Phase-state duality in reversible circuit design.Physical Review A, 104(5):052602, 2021

  4. [4]

    Arute, K

    F. Arute, K. Arya, R. Babbush, D. Bacon, J. C. Bardin, R. Barends, R. Biswas, S. Boixo, F. G. Brandao, D. A. Buell, et al. Quantum supremacy using a pro- grammable superconducting processor.Nature, 574(7779):505–510, 2019

  5. [5]

    Babbush, C

    R. Babbush, C. Gidney, D. W. Berry, N. Wiebe, J. McClean, A. Paler, A. Fowler, and H. Neven. Encoding electronic spectra in quantum circuits with linear T complexity.Physical Review X, 8(4):041015, 2018

  6. [6]

    M. E. Beverland, P. Murali, M. Troyer, K. M. Svore, T. Hoefler, V. Kliuchnikov, G. H. Low, M. Soeken, A. Sundaram, and A. Vaschillo. Assessing requirements to scale to practical quantum advantage (2022).arXiv preprint arXiv:2211.07629, 2022

  7. [7]

    Bluvstein, S

    D. Bluvstein, S. J. Evered, A. A. Geim, S. H. Li, H. Zhou, T. Manovitz, S. Ebadi, M. Cain, M. Kalinowski, D. Hangleiter, et al. Logical quantum processor based on reconfigurable atom arrays.Nature, 626(7997):58–65, 2024

  8. [8]

    Brayton and A

    R. Brayton and A. Mishchenko. Abc: An academic industrial-strength verification tool. InComputer Aided Verification: 22nd International Conference, CA V 2010, Edinburgh, UK, July 15-19, 2010. Proceedings 22, pages 24–40. Springer, 2010

  9. [9]

    Chen and J

    D. Chen and J. Cong. DAOmap: A depth-optimal area optimization mapping algorithm for FPGA designs. InIEEE/ACM International Conference on Computer Aided Design, 2004. ICCAD-2004., pages 752–759. IEEE, 2004

  10. [10]

    J. Cong, C. Wu, and Y. Ding. Cut ranking and pruning: Enabling a general and efficient FPGA mapping solution. InProceedings of the 1999 ACM/SIGDA Seventh International Symposium on Field Programmable Gate Arrays, pages 29–35, 1999

  11. [11]

    Fawzi, M

    A. Fawzi, M. Balog, A. Huang, T. Hubert, B. Romera-Paredes, M. Barekatain, A. Novikov, F. J. R. Ruiz, J. Schrittwieser, G. Swirszcz, et al. Discovering faster ma- trix multiplication algorithms with reinforcement learning.Nature, 610(7930):47– 53, 2022

  12. [12]

    C. Gidney. Windowed quantum arithmetic.arXiv preprint arXiv:1905.07682, 2019

  13. [13]

    C. Gidney. How to factor 2048 bit RSA integers with less than a million noisy qubits.arXiv preprint arXiv:2505.15917, 2025

  14. [14]

    Magic state cultivation: growing T states as cheap as CNOT gates

    C. Gidney, N. Shutty, and C. Jones. Magic state cultivation: growing T states as cheap as CNOT gates.arXiv preprint arXiv:2409.17595, 2024

  15. [15]

    Suppressing quantum errors by scaling a surface code logical qubit

    Google. Suppressing quantum errors by scaling a surface code logical qubit. Nature, 614(7949):676–681, 2023

  16. [16]

    Quantum error correction below the surface code threshold.Nature, 638(8052):920–926, 2025

    Google. Quantum error correction below the surface code threshold.Nature, 638(8052):920–926, 2025

  17. [17]

    Haleček, P

    I. Haleček, P. Fišer, and J. Schmidt. Are XORs in logic synthesis really necessary? In2017 IEEE 20th International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS), pages 134–139. IEEE, 2017

  18. [18]

    Häner and M

    T. Häner and M. Soeken. Lowering the T-depth of quantum circuits by reducing the multiplicative depth of logic networks.arXiv preprint arXiv:2006.03845, 2020

  19. [19]

    M. P. Harrigan, T. Khattar, C. Yuan, A. Peduri, N. Yosri, F. D. Malone, R. Babbush, and N. C. Rubin. Expressing and analyzing quantum algorithms with Qualtran. arXiv preprint arXiv:2409.04643, 2024

  20. [20]

    L. E. Heyfron and E. T. Campbell. An efficient quantum compiler that reduces T count.Quantum Science and Technology, 4(1):015004, 2018

  21. [21]

    Ko and J.-C

    S.-B. Ko and J.-C. Lo. Efficient decomposition techniques for FPGAs. InInter- national Conference on High-Performance Computing, pages 630–639. Springer, 2002

  22. [22]

    Litinski

    D. Litinski. A game of surface codes: Large-scale quantum computing with lattice surgery.Quantum, 3:128, 2019

  23. [23]

    Liu, Y.-T

    H.-L. Liu, Y.-T. Li, Y.-C. Chen, and C.-Y. Wang. A don’t-care-based approach to reducing the multiplicative complexity in logic networks.IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41(11):4821–4825, 2022

  24. [24]

    G. H. Low, V. Kliuchnikov, and L. Schaeffer. Trading T gates for dirty qubits in state preparation and unitary synthesis.Quantum, 8:1375, 2024

  25. [25]

    Luo, G.-w

    Q.-b. Luo, G.-w. Yang, X.-y. Li, and Q. Li. Quantum reversible circuits for multi- plicative inverse.EPJ Quantum Technology, 9(1):24, 2022

  26. [26]

    Meuli, B

    G. Meuli, B. Schmitt, R. Ehlers, H. Riener, and G. De Micheli. Evaluating ESOP optimization methods in quantum compilation flows. InInternational Conference on Reversible Computation, pages 191–206. Springer, 2019

  27. [27]

    Meuli, M

    G. Meuli, M. Soeken, E. Campbell, M. Roetteler, and G. De Micheli. The role of multiplicative complexity in compiling low T-count oracle circuits. In2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pages 1–8. IEEE, 2019

  28. [28]

    Meuli, M

    G. Meuli, M. Soeken, M. Roetteler, and G. De Micheli. ROS: Resource-constrained oracle synthesis for quantum computers.arXiv preprint arXiv:2005.00211, 2020

  29. [29]

    D. E. Muller. Application of Boolean Algebra to Switching Circuit Design and to Error Detection.Transactions of the I.R.E. Professional Group on Electronic Computers, EC-3(3):6–12, 1954

  30. [30]

    Riener, R

    H. Riener, R. Ehlers, B. d. O. Schmitt, and G. De Micheli. Exact synthesis of ESOP forms. InAdvanced Boolean Techniques: Selected Papers from the 13th International Workshop on Boolean Problems, pages 177–194. Springer, 2019

  31. [31]

    F. J. Ruiz, T. Laakkonen, J. Bausch, M. Balog, M. Barekatain, F. J. Heras, A. Novikov, N. Fitzpatrick, B. Romera-Paredes, J. van de Wetering, et al. Quantum circuit optimization with AlphaTensor.Nature Machine Intelligence, pages 1–12, 2025

  32. [32]

    Sales Rodriguez, J

    P. Sales Rodriguez, J. M. Robinson, P. N. Jepsen, Z. He, C. Duckering, C. Zhao, K.-H. Wu, J. Campo, K. Bagnall, M. Kwon, et al. Experimental demonstration of logical magic state distillation.Nature, pages 1–3, 2025

  33. [33]

    D. B. Tan, M. Y. Niu, and C. Gidney. A SAT scalpel for lattice surgery: Repre- sentation and synthesis of subroutines for surface-code fault-tolerant quantum computing. In2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), pages 325–339. IEEE, 2024

  34. [34]

    Tempia Calvino and G

    A. Tempia Calvino and G. De Micheli. Technology mapping using multi-output library cells. In2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), pages 1–9. IEEE, 2023. Hanyu Wang, Mingfei Yu, Xinrui Wu, and Jason Cong

  35. [35]

    Testa, M

    E. Testa, M. Soeken, L. Amarù, and G. De Micheli. Reducing the multiplicative complexity in logic networks for cryptography and security applications. In Proceedings of the 56th Annual Design Automation Conference 2019, pages 1–6, 2019

  36. [36]

    Vandaele, Lower T-count with faster algorithms (2024), arXiv:2407.08695 [quant-ph]

    V. Vandaele. Lower T-count with faster algorithms.arXiv preprint arXiv:2407.08695, 2024

  37. [37]

    Vandaele, S

    V. Vandaele, S. Martiel, and T. Goubault de Brugière. Phase polynomials synthesis algorithms for NISQ architectures and beyond.Quantum Science and Technology, 7(4):045027, 2022

  38. [38]

    Venere, A

    M. Venere, A. Barenghi, and G. Pelosi. A Quantum Method to Match Vector Boolean Functions Using Simon’s Solver. In2024 IEEE 42nd International Confer- ence on Computer Design (ICCD), pages 542–549. IEEE, 2024

  39. [39]

    Wang, S.-Y

    H. Wang, S.-Y. Lee, and G. De Micheli. AnySyn: A Cost-Generic Logic Synthesis Framework with Customizable Cost Functions.arXiv preprint arXiv:2311.14721, 2023

  40. [40]

    M. Yu, M. Soeken, and G. De Micheli. A New Perspective of Constructing Resource-Efficient Data-Lookup Quantum Oracles. In2025 IEEE International Conference on Quantum Computing and Engineering (QCE), pages 511–519. IEEE,

  41. [41]

    DOI: 10.1109/QCE65121.2025.00063

  42. [42]

    M. Yu, A. Tempia Calvino, M. Soeken, and G. De Micheli. Back-end-aware Fault- tolerant Quantum Oracle Synthesis. InProceedings of the 30th Asia and South Pacific Design Automation Conference, pages 930–937, 2025

  43. [43]

    H. Zhou, C. Duckering, C. Zhao, D. Bluvstein, M. Cain, A. Kubica, S.-T. Wang, and M. D. Lukin. Resource Analysis of Low-Overhead Transversal Architectures for Reconfigurable Atom Arrays. InProceedings of the 52nd Annual International Symposium on Computer Architecture, pages 1432–1448, 2025