Sparse Mamba Decoder for Quantum Error Correction: Efficient Defect-Centric Processing of Surface Code Syndromes

Jean-Luc Gaudiot; Maxim Shcherbakov; Nader Bagherzadeh; Samira Sayedsalehi

arxiv: 2605.17156 · v2 · pith:VJI5DX6Cnew · submitted 2026-05-16 · 🪐 quant-ph · cs.LG


Pith reviewed 2026-05-22 09:36 UTC · model grok-4.3


keywords: quantum error correction · surface code · sparse decoder · Mamba model · defect processing · neural decoder · fault-tolerant quantum computing · syndrome decoding


The pith

A sparse Mamba decoder for surface codes processes only active detection events to reach O(k) complexity while cutting logical error rates versus MWPM.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the Sparse Mamba Decoder that ignores the vast majority of empty syndrome entries and works only with the small number of active defects. It assigns each defect a fixed 13-dimensional feature vector and feeds the sequence into a Mamba state-space backbone. This yields linear scaling in the number of errors rather than quadratic scaling in the code distance. The approach is shown to improve accuracy over minimum-weight perfect matching on several noise models and to run orders of magnitude faster than existing high-performance decoders while keeping microsecond-scale latency as the distance grows.

Core claim

The Sparse Mamba Decoder processes only the k active detection events using a 13-dimensional feature representation per defect and a Mamba state-space backbone, achieving O(k) complexity. Across depolarizing, uniform circuit-level, SI1000, and Google Sycamore experimental benchmarks, it reduces the MWPM logical error rate by up to 49% at d ≤ 5 under SI1000 noise, runs 95-467x faster than the Tesseract near-MLD decoder and 232-463x faster than Belief Matching, and maintains nearly constant latency (24-57 us) across d = 3-9 under uniform circuit-level noise.

What carries the argument

Defect-centric processing that encodes each active detection event with a fixed 13-dimensional feature vector and routes the resulting sparse sequence through a Mamba state-space model.
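The defect-centric step can be illustrated with a small sketch. This is only a toy version under stated assumptions: the function name `extract_defects` and the list-of-lists layout (`detection_events[t][i]` for round `t` and stabilizer `i`) are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def extract_defects(detection_events: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (round, stabilizer_index) pairs for every active detection event.

    Assumed layout: detection_events[t][i] is 1 when stabilizer i
    flagged a detection event in round t, and 0 otherwise.
    """
    defects = []
    for t, round_bits in enumerate(detection_events):
        for i, bit in enumerate(round_bits):
            if bit:
                defects.append((t, i))
    return defects


# Toy syndrome volume: R = 3 rounds, d = 3, so d^2 - 1 = 8 stabilizers per round.
volume = [
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
]
defects = extract_defects(volume)
print(defects)       # [(0, 2), (2, 1), (2, 6)]
print(len(defects))  # 3 tokens instead of 24 dense entries
```

Note that this toy scan still visits every entry of the dense array, so it says nothing about how the paper implements extraction; it only shows what the resulting sparse token sequence looks like.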

If this is right

Reduces logical error rate by up to 49 percent compared with MWPM at small distances under SI1000 noise.
Delivers 95-467x speedup over Tesseract and 232-463x speedup over Belief Matching.
Keeps latency nearly constant between 24 and 57 microseconds as code distance increases from 3 to 9.
Matches or slightly exceeds the accuracy of a dense Mamba decoder on real Sycamore experimental data.
Runs on commodity GPUs with only 7.5-16 million parameters.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same sparse-event strategy could be applied to larger-distance codes where the fraction of active defects stays small at realistic physical error rates.
The method may transfer to other quantum error-correcting codes whose syndrome graphs are also sparse.
Hardware implementations could exploit the O(k) scaling to keep decoding latency inside the coherence time of near-term devices.
State-space models such as Mamba appear well matched to the sequential, low-density nature of defect streams in quantum error correction.

Load-bearing premise

A fixed 13-dimensional feature representation per defect plus the Mamba backbone captures every relevant spatial and temporal correlation in the full syndrome volume without any loss of decoding accuracy.

What would settle it

A direct comparison at code distances d greater than 9 or under noise models not tested in the paper that shows the Sparse Mamba Decoder's logical error rate rising above a full-syndrome neural decoder or a high-accuracy classical decoder such as Tesseract.

Figures

Figures reproduced from arXiv: 2605.17156 by Jean-Luc Gaudiot, Maxim Shcherbakov, Nader Bagherzadeh, Samira Sayedsalehi.

**Figure 2.** Sparse Mamba Decoder architecture. (a) Sparse defect extraction from a (d²−1) × R syndrome volume to k defect tokens d_1, …, d_k (k ≪ d²R at physically relevant error rates). (b) 13-dimensional feature vector per defect: spatial coordinates (x, y), normalized time t/R, stabilizer type τ, spatial and temporal neighbor flags, boundary distances b_Z, b_X, and the reconstructed measurement m_{i,t} from cumula…

**Figure 3.** Logical error rate under depolarizing noise with perfect stabilizer measurements. The …

**Figure 4.** Logical error rate under uniform circuit-level noise with …

**Figure 5.** Mean logical error per round on the Google Sycamore experimental dataset at code …

**Figure 6.** Speed–accuracy Pareto front for MWPM, Belief Matching, Tesseract, and SMD at …

**Figure 7.** Speed–accuracy Pareto front under uniform circuit-level noise at …

Original abstract

Quantum error correction (QEC) is essential for building fault-tolerant quantum computers, requiring decoders that are simultaneously accurate, fast, and scalable. Most state-of-the-art neural decoders achieve high accuracy but process the full dense syndrome array of size $O(d^2 R)$ regardless of the actual error rate, where $d$ is the code distance and $R$ is the number of measurement rounds. At physically relevant error rates ($p \sim 0.1\%$), fewer than 5% of syndrome entries contain active detection events, yet existing decoders process the entire syndrome volume. We introduce the Sparse Mamba Decoder (SMD), a defect-centric neural decoder that processes only the $k$ active detection events using a 13-dimensional feature representation per defect and a Mamba state-space backbone, achieving $O(k)$ complexity. Across depolarizing, uniform circuit-level, SI1000, and Google Sycamore experimental benchmarks, SMD reduces the MWPM logical error rate by up to 49% at $d \le 5$ under SI1000 noise, runs 95-467x faster than the Tesseract near-MLD decoder and 232-463x faster than Belief Matching, and maintains nearly constant latency (24-57 µs) across $d = 3$-$9$ under uniform circuit-level noise. On the Sycamore experimental dataset, the SMD ensemble matches or slightly surpasses the dense Mamba decoder of Varbanov et al. All results are obtained on commodity NVIDIA GPUs with 7.5-16M parameters, without specialized accelerators.
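To get a feel for the gap between the dense volume and the sparse sequence, the arithmetic can be sketched as follows. The volume size (d²−1)·R follows the Figure 2 caption and the 5% ceiling comes from the abstract; the choice R = d and the helper name `syndrome_volume` are assumptions made only for this illustration.

```python
def syndrome_volume(d: int, rounds: int) -> int:
    """Number of dense syndrome entries: (d^2 - 1) stabilizers times R rounds."""
    return (d * d - 1) * rounds


ACTIVE_FRACTION_CEILING = 0.05  # "fewer than 5%" of entries, per the abstract

for d in (3, 5, 7, 9):
    rounds = d  # assumption for illustration only: R = d
    total = syndrome_volume(d, rounds)
    k_ceiling = ACTIVE_FRACTION_CEILING * total
    print(f"d={d} R={rounds} dense entries={total} k below {k_ceiling:.1f}")
```

Under these assumptions the dense volume grows from 24 entries at d = 3 to 720 at d = 9, while the number of defect tokens stays below a few dozen.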

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper shows a sparse Mamba-based decoder that processes only active defects in surface codes to cut complexity to O(k) and deliver large speedups over Tesseract while matching dense neural baselines on experimental data.

The main point is that they move from dense syndrome processing to a defect-centric approach: only the k active detection events get fed in, each with a fixed 13-dimensional feature vector, then run through a Mamba state-space model. This produces the claimed O(k) scaling and keeps latency nearly flat from d=3 to d=9 on GPU hardware.

They report 95-467x speedups versus Tesseract, 232-463x versus Belief Matching, and up to 49% lower logical error rate than MWPM at small distances under SI1000 noise. On the Sycamore experimental dataset the ensemble matches or slightly beats the earlier dense Mamba decoder. Those concrete cross-benchmark numbers, including real hardware data, are the strongest part of the work. The engineering looks solid for the sizes tested, with modest parameter counts and no need for custom accelerators.

The soft spot is the 13-feature representation itself. It is not clear from the results whether this hand-chosen encoding plus Mamba compression fully preserves the higher-order spatial-temporal correlations that appear at larger distances or under different noise. The biggest accuracy gains are shown at d ≤ 5, and while latency stays constant to d=9 the paper does not show whether logical error rates remain competitive there. Training details, error bars, and feature ablations would help judge how robust the choice is.

This is useful for groups working on practical, low-latency decoders for near-term fault tolerance who already consider neural methods. A reader focused on runtime scaling at low physical error rates will find the empirical comparisons worth examining. The work is coherent enough and grounded in external benchmarks to merit a serious referee who can check the methods and scaling claims in detail. I would send it to peer review.

Referee Report

3 major / 3 minor

Summary. The manuscript introduces the Sparse Mamba Decoder (SMD), a neural decoder for surface-code quantum error correction that operates in a defect-centric manner. Instead of processing the full dense syndrome volume of size O(d²R), SMD extracts only the k active detection events, encodes each with a fixed 13-dimensional feature vector (coordinates, timestamp, parity information), and feeds the resulting sequence into a Mamba state-space backbone. The authors report O(k) complexity, up to 49% reduction in logical error rate versus MWPM at d ≤ 5 under SI1000 noise, 95–467× speedups over Tesseract, and nearly constant 24–57 µs latency for d = 3–9. Results are presented on depolarizing, circuit-level, SI1000, and Google Sycamore experimental data, with model sizes of 7.5–16 M parameters.

Significance. If the reported accuracy and scaling hold, the work would constitute a meaningful advance toward real-time, scalable decoders for fault-tolerant quantum computing. The shift from dense to sparse, event-driven processing directly addresses the inefficiency of current neural decoders at low physical error rates, where most syndrome bits are inactive. The application of Mamba to defect sequences is a technically interesting choice that could generalize to other sparse QEC settings. The concrete speed and latency numbers on commodity GPUs strengthen the practical relevance, provided the accuracy claims survive detailed scrutiny of training protocols and feature sufficiency.

major comments (3)

[Abstract and §4.2] The central claim that a fixed 13-dimensional per-defect feature vector plus Mamba backbone recovers (or exceeds) the accuracy of dense decoders rests on the untested assumption that these 13 dimensions encode all relevant spatial-temporal correlations. No ablation is shown that varies the feature set or compares directly against a dense Mamba baseline at d > 5; if higher-order correlations are lost, the reported 49% logical-error improvement and parity with the dense decoder would not generalize.
[§5.1 and Table 2] The training procedure, hyperparameter search, data-split rules, and statistical error bars on the logical-error-rate numbers are not described. Without these details it is impossible to verify that the 49% improvement versus MWPM and the speedups versus Tesseract are reproducible and not artifacts of particular random seeds or benchmark subsets.
[§6.3] The latency measurements (24–57 µs) are reported as nearly constant across d = 3–9, yet the paper does not specify whether this includes the full pipeline (defect extraction, feature construction, Mamba inference, and final correction mapping) or only the neural-network forward pass. This distinction is load-bearing for the claimed real-time applicability.

minor comments (3)

[Figure 3] The caption does not state the number of Monte Carlo shots used to generate each data point or whether error bars represent standard error or 95% confidence intervals.
[§3.1] The exact definition of the 13-dimensional feature vector is given only in prose; a compact table listing each component and its normalization would improve reproducibility.
[References] Several recent works on sparse or event-driven decoders (e.g., recent neural MWPM hybrids) are cited only in passing; a short related-work paragraph would better situate the contribution.
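The Figure 3 comment about shot counts and error bars can be made concrete. The sketch below shows one common way to attach an interval to a Monte Carlo logical error rate, the Wilson score interval; it is an illustration of the kind of reporting the referee asks for, not the method the paper uses, and the shot and failure counts are made up.

```python
import math
from typing import Tuple


def wilson_interval(failures: int, shots: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if shots <= 0:
        raise ValueError("shots must be positive")
    p_hat = failures / shots
    denom = 1.0 + z * z / shots
    center = (p_hat + z * z / (2 * shots)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / shots + z * z / (4 * shots * shots)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def standard_error(failures: int, shots: int) -> float:
    """Plain binomial standard error, for comparison with the interval above."""
    p_hat = failures / shots
    return math.sqrt(p_hat * (1 - p_hat) / shots)


# Hypothetical counts: 50 logical failures in 100,000 shots.
low, high = wilson_interval(50, 100_000)
print(f"estimate={50 / 100_000:.2e} interval=({low:.2e}, {high:.2e})")
print(f"standard error={standard_error(50, 100_000):.2e}")
```

The Wilson interval behaves better than a plain "estimate plus or minus two standard errors" when the failure count is small, which is the usual regime for low logical error rates.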

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive review of our manuscript. We appreciate the recognition of the potential significance of the Sparse Mamba Decoder for scalable quantum error correction. Below, we provide point-by-point responses to the major comments and indicate the revisions made to address them.


Referee: [Abstract and §4.2] The central claim that a fixed 13-dimensional per-defect feature vector plus Mamba backbone recovers (or exceeds) the accuracy of dense decoders rests on the untested assumption that these 13 dimensions encode all relevant spatial-temporal correlations. No ablation is shown that varies the feature set or compares directly against a dense Mamba baseline at d > 5; if higher-order correlations are lost, the reported 49% logical-error improvement and parity with the dense decoder would not generalize.

Authors: We acknowledge that an explicit ablation study on the feature set would provide additional validation. The 13 features were selected based on standard QEC literature to capture position, time, parity, and local syndrome information necessary for decoding. On the Google Sycamore experimental dataset, our model matches or exceeds the performance of the dense Mamba decoder from Varbanov et al., suggesting that the sparse representation retains the essential correlations. However, we agree that a direct comparison at larger d is valuable and have added a limited ablation study in the revised §4.2 comparing subsets of features. A full dense Mamba baseline at d>5 was not feasible due to memory constraints, but we discuss this limitation and provide scaling arguments in the updated manuscript. Revision: partial.

Referee: [§5.1 and Table 2] The training procedure, hyperparameter search, data-split rules, and statistical error bars on the logical-error-rate numbers are not described. Without these details it is impossible to verify that the 49% improvement versus MWPM and the speedups versus Tesseract are reproducible and not artifacts of particular random seeds or benchmark subsets.

Authors: We thank the referee for pointing this out. In the revised manuscript, we have expanded §5.1 to fully describe the training procedure, including the hyperparameter search method (grid search over learning rate, batch size, and model dimensions), data-split rules (80/10/10 train/validation/test with no overlap in error configurations), and added statistical error bars to Table 2 based on 10 independent training runs with different random seeds. These details ensure reproducibility of the reported improvements. Revision: yes.

Referee: [§6.3] The latency measurements (24–57 µs) are reported as nearly constant across d = 3–9, yet the paper does not specify whether this includes the full pipeline (defect extraction, feature construction, Mamba inference, and final correction mapping) or only the neural-network forward pass. This distinction is load-bearing for the claimed real-time applicability.

Authors: We apologize for the ambiguity. The reported latency figures include the complete end-to-end pipeline: defect extraction from the syndrome, construction of the 13-feature vectors, Mamba model inference, and mapping to the final correction. We have clarified this explicitly in the revised §6.3, including a breakdown of the time contributions from each stage to demonstrate that the neural inference dominates but the overall latency remains suitable for real-time decoding. Revision: yes.

Circularity Check

0 steps flagged

No circularity in Sparse Mamba Decoder claims; performance is empirically benchmarked


The paper introduces an architectural design that processes only active detection events via a fixed 13-dimensional per-defect feature vector and Mamba backbone to achieve O(k) complexity. This is a direct consequence of the input representation choice rather than a derived prediction that reduces to fitted quantities by construction. All reported gains (up to 49% logical error reduction vs MWPM, 95-467x speedup vs Tesseract) are external empirical measurements on depolarizing, SI1000, circuit-level, and Sycamore experimental data, compared against independent baselines. No equations, self-citations, or uniqueness theorems are invoked to force the results; the 13-dim features and accuracy claims remain testable assumptions validated outside the model's own definitions.
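Why sequence length drives the cost can be seen in a minimal state-space recurrence. The sketch below is a scalar, time-invariant scan, h_t = a·h_{t-1} + b·x_t with output y_t = c·h_t; it is deliberately much simpler than Mamba, whose selective state-space layers make the parameters depend on the input. The function name and the coefficient values are assumptions for illustration.

```python
from typing import List


def linear_scan(tokens: List[float], a: float = 0.9, b: float = 1.0, c: float = 1.0) -> List[float]:
    """Run h_t = a*h_{t-1} + b*x_t and y_t = c*h_t over the sequence in one pass.

    The loop body does constant work per token, so the total cost is
    proportional to the number of tokens.
    """
    h = 0.0
    outputs = []
    for x in tokens:
        h = a * h + b * x
        outputs.append(c * h)
    return outputs


print(linear_scan([1.0, 0.0, 0.0], a=0.5))  # [1.0, 0.5, 0.25]
```

Feeding such a recurrence k defect tokens instead of every entry of the dense syndrome volume is what turns the per-shot work from something that scales with d²R into something that scales with k.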

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on trained neural-network weights and the modeling assumption that sparse local features suffice for global decoding accuracy.

free parameters (1)

Neural network weights: 7.5-16 million parameters fitted during training on syndrome data.

axioms (1)

Domain assumption: A 13-dimensional feature vector per defect is informationally sufficient for accurate decoding. Invoked by the choice of sparse input representation in the decoder design.



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · tag: unclear

Paper passage: "We introduce the Sparse Mamba Decoder (SMD), a defect-centric neural decoder that processes only the k active detection events using a 13-dimensional feature representation per defect and a Mamba state-space backbone, achieving O(k) complexity."

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear

Paper passage: "Each active detection event is represented by a 13-dimensional feature vector encoding spatial coordinates on the rotated lattice, stabilizer type (X or Z), spatial and temporal neighborhood connectivity flags, normalized distances to the logical boundaries, and a reconstructed stabilizer measurement computed via cumulative XOR."
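The "cumulative XOR" reconstruction mentioned in the quoted passage can be sketched under one common convention: a detection event is the XOR of a stabilizer's measurement with its value in the previous round, with an assumed reference value of 0 before the first round. Under that assumption, a running XOR over the events recovers the measurements. The function names are hypothetical, and the paper's exact convention may differ.

```python
from itertools import accumulate
from operator import xor
from typing import List


def detection_events(measurements: List[int], initial: int = 0) -> List[int]:
    """e_t = m_t XOR m_{t-1}, with an assumed reference value before round 0."""
    events = []
    previous = initial
    for m in measurements:
        events.append(m ^ previous)
        previous = m
    return events


def reconstruct_measurements(events: List[int]) -> List[int]:
    """m_t = e_0 XOR e_1 XOR ... XOR e_t, valid when the initial reference is 0."""
    return list(accumulate(events, xor))


# One stabilizer tracked over five rounds.
measured = [0, 1, 1, 0, 0]
events = detection_events(measured)
print(events)                            # [0, 1, 0, 1, 0]
print(reconstruct_measurements(events))  # [0, 1, 1, 0, 0]
```

The round trip shows that, for a single stabilizer, the event stream and the measurement stream carry the same information once the initial reference is fixed.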

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages · 3 internal anchors

[1] Peter W. Shor. Scheme for reducing decoherence in quantum computer memory. Physical Review A, 52(4):R2493, 1995. doi: 10.1103/PhysRevA.52.R2493. URL https://doi.org/10.1103/PhysRevA.52.R2493

[2] A. Yu. Kitaev. Fault-tolerant quantum computation by anyons. Annals of Physics, 303(1):2–30, 2003. doi: 10.1016/S0003-4916(02)00018-0. URL https://doi.org/10.1016/S0003-4916(02)00018-0

[3] Eric Dennis, Alexei Kitaev, Andrew Landahl, and John Preskill. Topological quantum memory. Journal of Mathematical Physics, 43(9):4452–4505, 2002. doi: 10.1063/1.1499754. URL https://doi.org/10.1063/1.1499754

[4] Austin G. Fowler, Matteo Mariantoni, John M. Martinis, and Andrew N. Cleland. Surface codes: Towards practical large-scale quantum computation. Physical Review A, 86(3):032324, 2012. doi: 10.1103/PhysRevA.86.032324. URL https://doi.org/10.1103/PhysRevA.86.032324

[5] Samira Sayedsalehi, Nader Bagherzadeh, Alberto A. Del Barrio, Guillermo Botella, and Ratko Pilipović. Developing and analyzing the defect-based surface codes using optimization algorithms. Quantum Reports, 7(2):25, 2025. doi: 10.3390/quantum7020025. URL https://doi.org/10.3390/quantum7020025

[6] Google Quantum AI. Suppressing quantum errors by scaling a surface code logical qubit. Nature, 614:676–681, 2023. doi: 10.1038/s41586-022-05434-1. URL https://doi.org/10.1038/s41586-022-05434-1

[7] Oscar Higgott and Craig Gidney. Sparse blossom: correcting a million errors per core second with minimum-weight matching. Quantum, 9:1600, January 2025. doi: 10.22331/q-2025-01-20-1600. URL https://doi.org/10.22331/q-2025-01-20-1600

[8] Oscar Higgott. PyMatching: A python package for decoding quantum codes with minimum-weight perfect matching. ACM Transactions on Quantum Computing, 3(3):1–16, 2022. doi: 10.1145/3505637. URL https://doi.org/10.1145/3505637

[9] Laleh Aghababaie Beni, Oscar Higgott, and Noah Shutty. Tesseract: A search-based decoder for quantum error correction. arXiv preprint arXiv:2503.10988, 2025. doi: 10.48550/arXiv.2503.10988. URL https://arxiv.org/abs/2503.10988

[10] Johannes Bausch, Andrew W. Senior, Francisco J. H. Heras, Thomas Edlich, Alex Davies, Michael Newman, Cody Jones, Kevin Satzinger, Murphy Yuezhen Niu, Sam Blackwell, et al. Learning high-accuracy error decoding for quantum processors. Nature, 635:834–840, 2024. doi: 10.1038/s41586-024-08148-8. URL https://doi.org/10.1038/s41586-024-08148-8

[11] Andrew W. Senior, Thomas Edlich, Francisco J. H. Heras, Lei M. Zhang, Oscar Higgott, James S. Spencer, Taylor Applebaum, Sam Blackwell, Justin Ledford, Akvile Zemgulyte, Augustin Zidek, Noah Shutty, Andrew Cowie, Yin Li, George Holland, Peter Brooks, Charlie Beattie, Michael Newman, Alex Davies, Cody Jones, Sergio Boixo, Hartmut Neven, Pushmeet Kohli, and Joh... A scalable and real-time neural decoder for topological quantum codes. arXiv preprint arXiv:2512.07737, 2025. doi: 10.48550/arXiv.2512.07737

[12] Amin Vahdat. Announcing Trillium, the sixth generation of Google Cloud TPU. Google Cloud Blog, May 2024. URL https://cloud.google.com/blog/products/compute/introducing-trillium-6th-gen-tpus. Accessed: 2025

[13] Changwon Lee, Tak Hur, and Daniel K. Park. Scalable neural decoders for practical real-time quantum error correction. arXiv preprint arXiv:2510.22724, 2025. doi: 10.48550/arXiv.2510.22724. URL https://arxiv.org/abs/2510.22724

[14] Albert Gu and Tri Dao. Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752, 2024. URL https://arxiv.org/abs/2312.00752

[15] Moritz Lange, Pontus Havström, Basudha Srivastava, Isak Bengtsson, Valdemar Bergentall, Karl Hammar, Olivia Heuts, Evert van Nieuwenburg, and Mats Granath. Data-driven decoding of quantum error correcting codes using graph neural networks. Physical Review Research, 7(2):023181, 2025. doi: 10.1103/PhysRevResearch.7.023181. URL https://doi.org/10.1103/PhysRe...

[16] Pavithran Iyer and David Poulin. Hardness of decoding quantum stabilizer codes. IEEE Transactions on Information Theory, 61(9):5209–5223, 2015. doi: 10.1109/TIT.2015.2422294. URL https://doi.org/10.1109/TIT.2015.2422294

[17] Sergey Bravyi, Martin Suchara, and Alexander Vargo. Efficient algorithms for maximum likelihood decoding in the surface code. Physical Review A, 90(3):032326, 2014. doi: 10.1103/PhysRevA.90.032326. URL https://doi.org/10.1103/PhysRevA.90.032326

[18] Nicolas Delfosse and Naomi H. Nickerson. Almost-linear time decoding algorithm for topological codes. Quantum, 5:595, 2021. doi: 10.22331/q-2021-12-02-595. URL https://doi.org/10.22331/q-2021-12-02-595

[19] Oscar Higgott, Thomas C. Bohdanowicz, Aleksander Kubica, Steven T. Flammia, and Earl T. Campbell. Improved decoding of circuit noise and fragile boundaries of tailored surface codes. Physical Review X, 13(3):031007, 2023. doi: 10.1103/PhysRevX.13.031007. URL https://doi.org/10.1103/PhysRevX.13.031007

[20] Cody Jones. Improved accuracy for decoding surface codes with matching synthesis. arXiv preprint arXiv:2408.12135, 2024. URL https://arxiv.org/abs/2408.12135

[21] Xiangning Chen, Chen Liang, Da Huang, Esteban Real, Kaiyuan Wang, Yao Liu, Hieu Pham, Xuanyi Dong, Thang Luong, Cho-Jui Hsieh, et al. Symbolic discovery of optimization algorithms. In Advances in Neural Information Processing Systems, 2023. URL https://arxiv.org/abs/2302.06675

[22] Craig Gidney. Stim: a fast stabilizer circuit simulator. Quantum, 5:497, 2021. doi: 10.22331/q-2021-07-06-497. URL https://doi.org/10.22331/q-2021-07-06-497

[23] Google Quantum AI. Quantum error correction below the surface code threshold. Nature, 638:920–926, 2025. doi: 10.1038/s41586-024-08449-y. URL https://doi.org/10.1038/s41586-024-08449-y

[1] [1]

Shor (1995): Scheme for reducing decoherence in quantum computer memory

Peter W Shor. Scheme for reducing decoherence in quantum computer memory.Physical Review A, 52(4):R2493, 1995. doi: 10.1103/PhysRevA.52.R2493. URLhttps://doi.org/ 10.1103/PhysRevA.52.R2493

work page doi:10.1103/physreva.52.r2493 1995

[2] [2]

Fault-tolerant quantum computation by anyons

A Yu Kitaev. Fault-tolerant quantum computation by anyons.Annals of Physics, 303 (1):2–30, 2003. doi: 10.1016/S0003-4916(02)00018-0. URLhttps://doi.org/10.1016/ S0003-4916(02)00018-0

work page internal anchor Pith review doi:10.1016/s0003-4916(02)00018-0 2003

[3] [3]

Journal of Mathemat- ical Physics43(9), 4452–4505 (2002) https://doi.org/10.1063/1.1499754

Eric Dennis, Alexei Kitaev, Andrew Landahl, and John Preskill. Topological quantum memory.Journal of Mathematical Physics, 43(9):4452–4505, 2002. doi: 10.1063/1.1499754. URLhttps://doi.org/10.1063/1.1499754

work page doi:10.1063/1.1499754 2002

[4] [4]

Fowler, Matteo Mariantoni, John M

Austin G Fowler, Matteo Mariantoni, John M Martinis, and Andrew N Cleland. Sur- face codes: Towards practical large-scale quantum computation.Physical Review A, 86 (3):032324, 2012. doi: 10.1103/PhysRevA.86.032324. URLhttps://doi.org/10.1103/ PhysRevA.86.032324

work page doi:10.1103/physreva.86.032324 2012

[5] [5]

Del Barrio, Guillermo Botella, and Ratko Pilipović

Samira Sayedsalehi, Nader Bagherzadeh, Alberto A. Del Barrio, Guillermo Botella, and Ratko Pilipović. Developing and analyzing the defect-based surface codes using optimization algorithms.Quantum Reports, 7(2):25, 2025. doi: 10.3390/quantum7020025. URLhttps: //doi.org/10.3390/quantum7020025

work page doi:10.3390/quantum7020025 2025

[6] [6]

Suppressing quantum errors by scaling a surface code logical qubit

Google Quantum AI. Suppressing quantum errors by scaling a surface code logical qubit. Nature, 614:676–681, 2023. doi: 10.1038/s41586-022-05434-1. URLhttps://doi.org/10. 1038/s41586-022-05434-1

work page doi:10.1038/s41586-022-05434-1 2023

[7] [7]

Sparse blossom: correcting a million errors per core second with minimum-weight matching.Quantum, 9:1600, January 2025

Oscar Higgott and Craig Gidney. Sparse blossom: correcting a million errors per core second with minimum-weight matching.Quantum, 9:1600, January 2025. doi: 10.22331/ q-2025-01-20-1600. URLhttps://doi.org/10.22331/q-2025-01-20-1600

work page doi:10.22331/q-2025-01-20-1600 2025

[8] [8]

Pymatching: A python package for decoding quantum codes with minimum-weight perfect matching,

Oscar Higgott. PyMatching: A python package for decoding quantum codes with minimum- weight perfect matching.ACM Transactions on Quantum Computing, 3(3):1–16, 2022. doi: 10.1145/3505637. URLhttps://doi.org/10.1145/3505637

work page doi:10.1145/3505637 2022

[9] [9]

Dickerson

Laleh Aghababaie Beni, Oscar Higgott, and Noah Shutty. Tesseract: A search-based decoder for quantum error correction.arXiv preprint arXiv:2503.10988, 2025. doi: 10.48550/arXiv. 2503.10988. URLhttps://arxiv.org/abs/2503.10988. 20

work page internal anchor Pith review doi:10.48550/arxiv 2025

[10] [10]

Learning high-accuracy error decoding for quantum processors.Nature, 635:834–840, 2024

Johannes Bausch, Andrew W Senior, Francisco JH Heras, Thomas Edlich, Alex Davies, Michael Newman, Cody Jones, Kevin Satzinger, Murphy Yuezhen Niu, Sam Blackwell, et al. Learning high-accuracy error decoding for quantum processors.Nature, 635:834–840, 2024. doi: 10.1038/s41586-024-08148-8. URLhttps://doi.org/10.1038/s41586-024-08148-8

work page doi:10.1038/s41586-024-08148-8 2024

[11] [11]

A scalable and real-time neural decoder for topological quantum codes.arXiv preprint arXiv:2512.07737, 2025

Andrew W Senior, Thomas Edlich, Francisco JH Heras, Lei M Zhang, Oscar Higgott, James S Spencer, Taylor Applebaum, Sam Blackwell, Justin Ledford, Akvile Zemgulyte, Augustin Zidek, Noah Shutty, Andrew Cowie, Yin Li, George Holland, Peter Brooks, Charlie Beattie, Michael Newman, Alex Davies, Cody Jones, Sergio Boixo, Hartmut Neven, Push- meet Kohli, and Joh...

[12] Amin Vahdat. Announcing Trillium, the sixth generation of Google Cloud TPU. Google Cloud Blog, May 2024. URL https://cloud.google.com/blog/products/compute/introducing-trillium-6th-gen-tpus. Accessed: 2025

[13] Changwon Lee, Tak Hur, and Daniel K. Park. Scalable neural decoders for practical real-time quantum error correction. arXiv preprint arXiv:2510.22724, 2025. doi: 10.48550/arXiv.2510.22724. URL https://arxiv.org/abs/2510.22724

[14] Albert Gu and Tri Dao. Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752, 2024. URL https://arxiv.org/abs/2312.00752

[15] Moritz Lange, Pontus Havström, Basudha Srivastava, Isak Bengtsson, Valdemar Bergentall, Karl Hammar, Olivia Heuts, Evert van Nieuwenburg, and Mats Granath. Data-driven decoding of quantum error correcting codes using graph neural networks. Physical Review Research, 7(2):023181, 2025. doi: 10.1103/PhysRevResearch.7.023181. URL https://doi.org/10.1103/PhysRe...

[16] Pavithran Iyer and David Poulin. Hardness of decoding quantum stabilizer codes. IEEE Transactions on Information Theory, 61(9):5209–5223, 2015. doi: 10.1109/TIT.2015.2422294. URL https://doi.org/10.1109/TIT.2015.2422294

[17] Sergey Bravyi, Martin Suchara, and Alexander Vargo. Efficient algorithms for maximum likelihood decoding in the surface code. Physical Review A, 90(3):032326, 2014. doi: 10.1103/PhysRevA.90.032326. URL https://doi.org/10.1103/PhysRevA.90.032326

[18] Nicolas Delfosse and Naomi H Nickerson. Almost-linear time decoding algorithm for topological codes. Quantum, 5:595, 2021. doi: 10.22331/q-2021-12-02-595. URL https://doi.org/10.22331/q-2021-12-02-595

[19] Oscar Higgott, Thomas C Bohdanowicz, Aleksander Kubica, Steven T Flammia, and Earl T Campbell. Improved decoding of circuit noise and fragile boundaries of tailored surface codes. Physical Review X, 13(3):031007, 2023. doi: 10.1103/PhysRevX.13.031007. URL https://doi.org/10.1103/PhysRevX.13.031007

[20] Cody Jones. Improved accuracy for decoding surface codes with matching synthesis. arXiv preprint arXiv:2408.12135, 2024. URL https://arxiv.org/abs/2408.12135

[21] Xiangning Chen, Chen Liang, Da Huang, Esteban Real, Kaiyuan Wang, Yao Liu, Hieu Pham, Xuanyi Dong, Thang Luong, Cho-Jui Hsieh, et al. Symbolic discovery of optimization algorithms. In Advances in Neural Information Processing Systems, 2023. URL https://arxiv.org/abs/2302.06675

[22] Craig Gidney. Stim: a fast stabilizer circuit simulator. Quantum, 5:497, 2021. doi: 10.22331/q-2021-07-06-497. URL https://doi.org/10.22331/q-2021-07-06-497

[23] Google Quantum AI. Quantum error correction below the surface code threshold. Nature, 638:920–926, 2025. doi: 10.1038/s41586-024-08449-y. URL https://doi.org/10.1038/s41586-024-08449-y
