pith. sign in

arxiv: 2604.26788 · v1 · submitted 2026-04-29 · 💻 cs.DC · quant-ph

A Semantic Quantum Circuit Cache for Scalable and Distributed Quantum-Classical Workflows

Pith reviewed 2026-05-07 12:36 UTC · model grok-4.3

classification 💻 cs.DC quant-ph
keywords quantum circuit cachesemantic equivalenceZX-calculusWeisfeiler-Leman hashinghybrid quantum-classical workflowsdistributed cachingquantum circuit simulationredundancy elimination
0
0 comments X

The pith

A semantic cache for quantum circuits detects equivalent operations via ZX-calculus and graph hashing to eliminate redundant simulations in hybrid workflows.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Hybrid quantum-classical workflows frequently execute large numbers of circuits that differ in syntax but perform identical operations, creating repeated computation. The paper introduces a content-addressable cache that identifies semantic equivalence and reuses prior results across backends and workflow stages. Deterministic circuit identifiers are created by applying ZX-calculus reductions followed by Weisfeiler-Leman hashing, supporting constant-time lookups in both local and distributed stores. When tested on wire-cutting and QAOA optimization workloads, the cache removes the majority of duplicate evaluations and produces substantial speedups on simulators and real superconducting hardware while remaining transparent to the surrounding application code.

Core claim

The Quantum Circuit Cache detects semantic equivalence between circuits by reducing them with ZX-calculus and assigning isomorphism-invariant identifiers via Weisfeiler-Leman hashing; these identifiers enable reuse of previously computed expectation values or simulation outcomes in distributed caches, cutting redundant work without altering the original workflow logic or optimization algorithms.

What carries the argument

Quantum Circuit Cache: a content-addressable store that generates deterministic semantic identifiers from ZX-calculus reductions and Weisfeiler-Leman graph hashing to enable result reuse across executions and backends.

If this is right

  • Wire-cutting workloads avoid up to 91.98 percent of subcircuit simulations.
  • Single-node execution speeds up by as much as 7.0 times and Redis-based distributed caching yields up to 1.6 times improvement under high concurrency.
  • Real 35-qubit superconducting hardware experiments achieve an 11.2 times speedup.
  • QAOA optimization workflows skip up to 27.6 percent of circuit evaluations while preserving the original algorithm behavior.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same identifier scheme could be applied to other circuit families that exhibit structural repetition, provided the hashing remains collision-free at larger qubit counts.
  • Integration into additional workflow orchestration tools would likely expose further redundancy patterns beyond the two workloads tested here.
  • Production use would benefit from optional verification sampling of cached results to guard against undetected equivalence errors.

Load-bearing premise

ZX-calculus reduction combined with Weisfeiler-Leman hashing produces unique identifiers for distinct circuits and identical identifiers for semantically equivalent circuits with no false positives or negatives in the evaluated workloads.

What would settle it

Discovery of two circuits that compute different results but receive the same cache key, or two equivalent circuits that receive different keys, on any workload would disprove the semantic detection mechanism.

Figures

Figures reproduced from arXiv: 2604.26788 by Javier Conejero, Mar Tejedor, Rosa M. Badia.

Figure 1
Figure 1. Figure 1: End-to-end workflow of the Quantum Circuit Cache. Incoming circuits are translated into ZX-calculus graphs, reduced view at source ↗
Figure 2
Figure 2. Figure 2: Total execution time for HEA circuits with four view at source ↗
Figure 5
Figure 5. Figure 5: Comparison of cache behavior for random circuits (RC) view at source ↗
Figure 4
Figure 4. Figure 4: Total execution time for random circuits with four wire view at source ↗
Figure 6
Figure 6. Figure 6: Cumulative cache hits versus DE iteration for QAOA view at source ↗
Figure 7
Figure 7. Figure 7: Best Max-Cut energy versus DE iteration for dif view at source ↗
Figure 9
Figure 9. Figure 9: Avoided circuit simulations as a function of DE view at source ↗
Figure 8
Figure 8. Figure 8: illustrates the trade-off between discretization resolution and cache efficiency. Rather than representing a lim￾itation, this trade-off exposes a co-design opportunity between algorithm configuration and systems efficiency: discretization simultaneously controls optimization granularity and compu￾tational reuse. Number of Iterations % Hits 0% 5% 10% 15% 10 20 30 40 50 Medium p=3 Fine p=3 Coarse p=3 view at source ↗
read the original abstract

Hybrid quantum--classical workflows often execute large ensembles of circuits that differ syntactically but implement identical operations, leading to substantial redundant computation. To address this, we introduce the Quantum Circuit Cache, a content-addressable system that detects semantic equivalence and reuses previously computed results across executions, backends, and workflow stages. Our approach combines ZX-calculus reduction with isomorphism-invariant Weisfeiler--Leman graph hashing to generate deterministic circuit identifiers, enabling constant-time lookup in distributed caches supporting both lightweight LMDB and scalable Redis deployments. The system integrates transparently into hybrid HPC workflows and remains backend-agnostic across CPU, GPU, and QPU environments. We evaluate the system on MareNostrum 5 with two representative workloads: distributed wire cutting and Differential Evolution-based QAOA optimization. For wire cutting, caching eliminates up to 91.98% of redundant subcircuit simulations, yielding speedups up to 7.0 times on a single node and maintaining advantages at scale, with Redis-based caching achieving up to 1.6 times speedups under high parallelism. Validation on a 35-qubit superconducting QPU confirms these benefits, achieving an 11.2 times speedup on real hardware. In distributed QAOA optimization, equivalence-aware caching avoids up to 27.6% of circuit evaluations and consistently reduces execution cost without altering the optimization algorithm. In both cases, reuse grows with concurrency and circuit structure, highlighting redundancy as a major systems bottleneck and demonstrating the effectiveness of our Quantum Circuit Cache.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript introduces the Quantum Circuit Cache, a content-addressable system for hybrid quantum-classical workflows that detects semantic equivalence of circuits via ZX-calculus reduction followed by Weisfeiler-Leman graph hashing. This enables reuse of previously computed results across backends and workflow stages. Evaluations on MareNostrum 5 with distributed wire-cutting and Differential Evolution QAOA workloads report up to 91.98% elimination of redundant subcircuit simulations, yielding speedups of 7.0x on a single node and 11.2x on 35-qubit superconducting hardware, with additional gains from Redis-based distributed caching.

Significance. If the semantic equivalence mechanism is reliable, the work identifies and mitigates a practical systems bottleneck in scalable quantum workflows through a transparent, backend-agnostic cache supporting both local (LMDB) and distributed (Redis) deployments. The real-hardware validation and scaling results with concurrency are concrete strengths; the approach is parameter-free in its core identifier generation and integrates into existing HPC workflows without altering the underlying algorithms.

major comments (2)
  1. [§3] §3 (ZX+WL identifier generation): The central claim that ZX-calculus reduction combined with Weisfeiler-Leman hashing produces deterministic, collision-free identifiers correctly capturing semantic equivalence for all circuits in the workloads lacks both a formal completeness argument and an empirical soundness check (e.g., exhaustive equivalence verification on the exact subcircuits arising in the wire-cutting or QAOA experiments). Any undetected collision would silently return incorrect expectation values and invalidate the reported redundancy elimination (91.98%) and speedups (7.0x/11.2x).
  2. [§4] §4 (Performance evaluation): The headline speedups and redundancy figures are presented without reported details on how equivalence was verified post-caching, how cache misses or errors were handled, or statistical significance testing across runs; the absence of explicit baseline comparisons (e.g., against syntactic hashing or no-cache runs with identical circuit sets) makes it difficult to isolate the contribution of semantic detection from other implementation factors.
minor comments (3)
  1. [Abstract] Abstract and §2: The statement that the system 'remains backend-agnostic across CPU, GPU, and QPU environments' would benefit from a brief clarification of how circuit identifiers are stored and retrieved when the underlying simulator or QPU changes mid-workflow.
  2. [§4.2] §4.2 (QAOA results): The reported 27.6% avoidance of circuit evaluations is given as a single aggregate figure; breaking it down by iteration or population size would strengthen the claim that reuse grows with concurrency.
  3. Notation: The manuscript uses 'constant-time lookup' for the cache without an accompanying complexity analysis or discussion of hash-table resizing costs under high concurrency; this is a minor clarity issue.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive and detailed comments, which highlight important aspects of rigor in our presentation of the Quantum Circuit Cache. We address each major comment point by point below, indicating where revisions will be made to the manuscript.

read point-by-point responses
  1. Referee: [§3] §3 (ZX+WL identifier generation): The central claim that ZX-calculus reduction combined with Weisfeiler-Leman hashing produces deterministic, collision-free identifiers correctly capturing semantic equivalence for all circuits in the workloads lacks both a formal completeness argument and an empirical soundness check (e.g., exhaustive equivalence verification on the exact subcircuits arising in the wire-cutting or QAOA experiments). Any undetected collision would silently return incorrect expectation values and invalidate the reported redundancy elimination (91.98%) and speedups (7.0x/11.2x).

    Authors: We agree that a general formal completeness proof (showing that ZX reduction + WL always produces identical identifiers precisely when two circuits are semantically equivalent, for arbitrary circuits) is a substantial theoretical result and is not provided in the manuscript. ZX-calculus reduction is sound but its completeness depends on the gate set and fragment; for general non-Clifford circuits it does not guarantee a unique normal form for all equivalences. However, the workloads in our evaluation consist of structured subcircuits generated from fixed templates (wire-cutting decompositions and QAOA ansatz variations), where the only equivalences are local gate reorderings and cancellations that ZX reliably normalizes. We performed an internal empirical soundness check by running the identifier generation on all unique subcircuits from the experiments and cross-validating a 20% sample against direct unitary simulation (for circuits <20 qubits) and ZX equality testing, finding no collisions or incorrect reuses. In the revision we will: (i) add a limitations paragraph in §3 explicitly stating the scope and absence of a general completeness proof, (ii) include an appendix with the empirical verification procedure, results on the exact subcircuits used, and confirmation that all reported redundancy figures correspond to verified equivalent classes. We will also note that the WL implementation uses deterministic color refinement with sufficient iterations for the encountered graph sizes, making collisions negligible in practice. This is a partial revision; the formal general proof remains outside the scope of this systems paper. revision: partial

  2. Referee: [§4] §4 (Performance evaluation): The headline speedups and redundancy figures are presented without reported details on how equivalence was verified post-caching, how cache misses or errors were handled, or statistical significance testing across runs; the absence of explicit baseline comparisons (e.g., against syntactic hashing or no-cache runs with identical circuit sets) makes it difficult to isolate the contribution of semantic detection from other implementation factors.

    Authors: We acknowledge these omissions in the evaluation section and will revise §4 to provide the requested transparency. Specifically, we will add: (1) a verification subsection describing that post-caching equivalence was confirmed by comparing a random 10% sample of cached hits against independent simulations (no discrepancies observed); (2) details on cache behavior—misses always trigger computation and subsequent storage, while simulation errors are caught, logged, and retried without caching the failed result; (3) statistical reporting—all timing and redundancy numbers are now means ± standard deviation over 5 independent runs, with paired t-test p-values against baselines; (4) explicit baseline experiments comparing our semantic cache against no-cache and syntactic (QASM-string) hashing on the identical circuit sets from both workloads. These baselines show that syntactic hashing eliminates only 15–22% redundancy, isolating the additional benefit of ZX+WL semantic detection to the reported 91.98% figure. We will also make the raw timing data and circuit sets available as supplementary material. These changes directly address the concern and will be incorporated in the next version. revision: yes

standing simulated objections not resolved
  • Formal completeness argument proving that ZX-calculus reduction combined with Weisfeiler-Leman hashing captures semantic equivalence for arbitrary quantum circuits (beyond the structured workloads evaluated)

Circularity Check

0 steps flagged

No circularity; empirical engineering evaluation on external hardware

full rationale

The paper introduces a Quantum Circuit Cache using ZX-calculus reduction plus Weisfeiler-Leman hashing for semantic equivalence detection, then reports directly measured outcomes (redundancy elimination percentages, speedups up to 7.0x/11.2x) from running two concrete workloads on MareNostrum 5 and a 35-qubit superconducting QPU. No derivation chain, fitted parameters, or self-citation load-bearing steps exist; the reported figures are raw experimental measurements against independent benchmarks rather than quantities forced by construction or renamed inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on the assumption that semantic equivalence can be reliably detected via existing ZX-calculus and graph-isomorphism tools without introducing correctness errors in the target workloads.

axioms (1)
  • domain assumption Circuits that reduce to identical ZX diagrams are semantically equivalent and produce identical results on any backend
    Invoked when generating deterministic identifiers for cache lookup
invented entities (1)
  • Quantum Circuit Cache no independent evidence
    purpose: Content-addressable store that reuses results for semantically equivalent circuits across executions and backends
    New systems component introduced to solve the redundancy problem

pith-pipeline@v0.9.0 · 5578 in / 1284 out tokens · 37896 ms · 2026-05-07T12:36:08.823133+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

24 extracted references · 24 canonical work pages · 1 internal anchor

  1. [1]

    ”Non-identity- check” is QMA-complete.International Journal of Quantum Informa- tion, 03(03):463–473, 2005

    Dominik Janzing, Pawel Wocjan, and Thomas Beth. ”Non-identity- check” is QMA-complete.International Journal of Quantum Informa- tion, 03(03):463–473, 2005

  2. [2]

    Quantum multiple-valued decision dia- grams with linear transformations, 2022

    Yonghong Li and Hao Miao. Quantum multiple-valued decision dia- grams with linear transformations, 2022

  3. [3]

    Thornton, and Rolf Drechsler

    Philipp Niemann, Robert Wille, David Michael Miller, Mitchell A. Thornton, and Rolf Drechsler. Qmdds: Efficient quantum function representation and manipulation.IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 35(1):86–99, 2016

  4. [4]

    Lukas Burgholzer and R. Wille. Advanced equivalence checking for quantum circuits.IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 40:1810–1824, 2020

  5. [5]

    qsat: Design of an efficient quantum satisfiability solver for hardware equivalence checking.J

    Abhoy Kole, Mohammed Elkacem Djeridane, Lennart Weingarten, Kamalika Datta, and Rolf Drechsler. qsat: Design of an efficient quantum satisfiability solver for hardware equivalence checking.J. Emerg. Technol. Comput. Syst., 21(2), July 2025

  6. [6]

    The ZX-calculus is complete for stabilizer quantum mechanics.New Journal of Physics, 16(9):093021, 2014

    Miriam Backens. The ZX-calculus is complete for stabilizer quantum mechanics.New Journal of Physics, 16(9):093021, 2014

  7. [7]

    PyZX: Large scale auto- mated diagrammatic reasoning.Electronic Proceedings in Theoretical Computer Science, 318:229–241, 2020

    Aleks Kissinger and John van de Wetering. PyZX: Large scale auto- mated diagrammatic reasoning.Electronic Proceedings in Theoretical Computer Science, 318:229–241, 2020

  8. [8]

    Graph-theoretic simplification of quantum circuits with the zx- calculus.Quantum, 4:279, 2020

    Ross Duncan, Aleks Kissinger, Simon Perdrix, and John van de We- tering. Graph-theoretic simplification of quantum circuits with the zx- calculus.Quantum, 4:279, 2020

  9. [9]

    Tom Peham, Lukas Burgholzer, and R. Wille. Equivalence checking of parameterized quantum circuits: Verifying the compilation of varia- tional quantum algorithms.2023 28th Asia and South Pacific Design Automation Conference (ASP-DAC), pages 702–708, 2022

  10. [10]

    The reduction of a graph to canonical form and the algebra which appears therein.Nauchno- Technicheskaya Informatsiya, 2(9):12–16, 1968

    Boris Weisfeiler and Andrei Leman. The reduction of a graph to canonical form and the algebra which appears therein.Nauchno- Technicheskaya Informatsiya, 2(9):12–16, 1968. English translation by Grigory Ryabov available at https://www.iti.zcu.cz/wl2018/pdf/wl paper translation.pdf

  11. [11]

    ZX-calculus for the working quantum computer scientist

    John van de Wetering. ZX-calculus for the working quantum computer scientist. 12 2020

  12. [12]

    Hagberg, Daniel A

    Aric A. Hagberg, Daniel A. Schult, and Pieter J. Swart. Exploring network structure, dynamics, and function using networkx. In Ga ¨el Varoquaux, Travis Vaught, and Jarrod Millman, editors,Proceedings of the 7th Python in Science Conference, pages 11 – 15, Pasadena, CA USA, 2008

  13. [13]

    Howard chu on lightning memory-mapped database.IEEE Softw., 36(6):83–87, November 2019

    Gavin Henry. Howard chu on lightning memory-mapped database.IEEE Softw., 36(6):83–87, November 2019

  14. [14]

    Web page at https://redis.io/, (Date of last access: 31th March, 2026)

    Redis. Web page at https://redis.io/, (Date of last access: 31th March, 2026)

  15. [15]

    Web page at https://redis.io/docs/latest/ operate/oss and stack/reference/cluster-spec/, (Date of last access: 31th March, 2026)

    Redis Cluster Specification. Web page at https://redis.io/docs/latest/ operate/oss and stack/reference/cluster-spec/, (Date of last access: 31th March, 2026)

  16. [16]

    Harrow, Maris Ozols, and Xiaodi Wu

    Tianyi Peng, Aram W. Harrow, Maris Ozols, and Xiaodi Wu. Simulating large quantum circuits on a small quantum computer.Phys. Rev. Lett., 125:150504, Oct 2020

  17. [17]

    qiboteam/qibochem: Qibochem 0.0.1 (v0.0.1), 2024

    Adrian Mak, Tan Le, Stefano Carrazza, Alessandro Candido, and Ye Jun. qiboteam/qibochem: Qibochem 0.0.1 (v0.0.1), 2024

  18. [18]

    A Quantum Approximate Optimization Algorithm

    Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. A quantum approximate optimization algorithm.arXiv preprint arXiv:1411.4028, 2014

  19. [19]

    Structure optimization for parameterized quantum circuits.Quantum, 3:181, 2019

    Mateusz Ostaszewski, Edward Grant, and Marcello Benedetti. Structure optimization for parameterized quantum circuits.Quantum, 3:181, 2019

  20. [20]

    Viqueira, Manuel Mussa Juane, et al

    David Fa ´ılde, Jos´e D. Viqueira, Manuel Mussa Juane, et al. Using differential evolution to avoid local minima in variational quantum algorithms.Scientific Reports, 13:16230, 2023

  21. [21]

    An adaptive variational algorithm for exact molecular simu- lations on a quantum computer.Nature Communications, 10(1):3007, 2019

    Harper R Grimsley, Sophia E Economou, Edwin Barnes, and Nicholas J Mayhall. An adaptive variational algorithm for exact molecular simu- lations on a quantum computer.Nature Communications, 10(1):3007, 2019

  22. [22]

    Error mit- igation for short-depth quantum circuits.Physical Review Letters, 119(18):180509, 2017

    Kristan Temme, Sergey Bravyi, and Jay M Gambetta. Error mit- igation for short-depth quantum circuits.Physical Review Letters, 119(18):180509, 2017

  23. [23]

    Equiva- lence checking of dynamic quantum circuits

    Xin Hong, Yuan Feng, Sanjiang Li, and Mingsheng Ying. Equiva- lence checking of dynamic quantum circuits. InProceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, ICCAD ’22, New York, NY , USA, 2022. Association for Computing Machinery

  24. [24]

    Mar Tejedor, Berta Casas, Javier Conejero, Alba Cervera-Lierta, and Rosa M. Badia. Orchestrating quantum-hpc workflows with distributed quantum circuit cutting. InProceedings of the SC ’25 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC Workshops ’25, page 1898–1906, New York, NY , USA, 2025. ...