pith. machine review for the scientific record.

arxiv: 2603.22149 · v2 · submitted 2026-03-23 · 🪐 quant-ph · cs.AR

Recognition: no theorem link

Low Latency GNN Accelerator for Quantum Error Correction

Authors on Pith · no claims yet

Pith reviewed 2026-05-15 00:43 UTC · model grok-4.3

classification 🪐 quant-ph cs.AR
keywords quantum error correction · surface code · graph neural network · FPGA accelerator · decoder latency · superconducting qubits · logical error rate · real-time decoding

The pith

An FPGA accelerator for a graph neural network decoder performs quantum error correction in under one microsecond with lower error rates than prior methods for surface codes up to distance 7.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that a high-accuracy graph neural network decoder for surface-code quantum error correction can be adapted with hardware-aware optimizations and mapped to an FPGA to satisfy the strict one-microsecond decoding deadline set by superconducting qubit coherence times. This yields both the required speed and a lower logical error rate than existing decoders for code distances up to seven. A reader would care because real-time decoding is the principal bottleneck preventing fault-tolerant operation of near-term quantum hardware; meeting the latency bound without accuracy loss removes one concrete barrier to scaling logical qubits. The work shows that neural-network decoders need not trade accuracy for speed when the implementation is co-designed with the target hardware.
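For scale, the syndrome volume the decoder must process each round follows from standard rotated-surface-code bookkeeping: a distance-d code uses d² data qubits and d² − 1 stabilizer ancillas. This is an editorial illustration, not a computation from the paper:

```python
# Rotated surface code bookkeeping (editorial sketch, not from the paper):
# a distance-d code has d*d data qubits and d*d - 1 stabilizer ancillas,
# so each measurement round produces d*d - 1 syndrome bits to decode.

def surface_code_counts(d: int) -> dict:
    """Return qubit and syndrome-bit counts for a rotated surface code."""
    if d < 3 or d % 2 == 0:
        raise ValueError("code distance must be an odd integer >= 3")
    return {"distance": d, "data_qubits": d * d, "syndrome_bits": d * d - 1}

for d in (3, 5, 7):
    print(surface_code_counts(d))
```

At d = 7 the decoder must digest 48 syndrome bits per round, every round, inside the same microsecond budget.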

Core claim

By applying hardware-aware optimizations to a high-accuracy GNN-based decoder and implementing several accelerator-level improvements on an FPGA, the system reaches a decoding latency smaller than one microsecond while producing a lower logical error rate than the state-of-the-art for surface codes of distance up to d=7.

What carries the argument

Hardware-aware GNN decoder mapped to an FPGA accelerator that enforces the one-microsecond latency bound while preserving decoding accuracy.
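The carrying mechanism can be illustrated with a toy message-passing layer over a syndrome graph. The function, shapes, and weights below are hypothetical; the paper's actual architecture (its Figure 3 shows three pipeline stages) differs in detail:

```python
import numpy as np

# Hypothetical single GNN message-passing layer over a syndrome graph.
# Illustrative only: the paper's architecture differs in detail.
def gnn_layer(node_feats, edges, w_self, w_msg):
    """node_feats: (N, F); edges: list of (src, dst) pairs; weights: (F, F)."""
    agg = np.zeros_like(node_feats)
    for src, dst in edges:            # sum incoming messages from neighbours
        agg[dst] += node_feats[src]
    # combine self and aggregated features, then apply a ReLU nonlinearity
    return np.maximum(0.0, node_feats @ w_self + agg @ w_msg)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))           # 8 syndrome-defect nodes, 4 features each
edges = [(0, 1), (1, 0), (2, 3), (3, 2)]
w1 = rng.normal(size=(4, 4))
w2 = rng.normal(size=(4, 4))
out = gnn_layer(x, edges, w1, w2)
print(out.shape)                      # (8, 4)
```

The FPGA mapping matters precisely because this per-edge aggregation is irregular and data-dependent, which general-purpose hardware handles poorly at sub-microsecond scale.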

If this is right

  • Decoding finishes inside the coherence window of current superconducting qubits, allowing error correction to keep pace with physical operations.
  • The same optimized GNN model delivers lower logical error rates than lookup-table or minimum-weight perfect-matching decoders for distances up to seven.
  • FPGA resource usage remains compatible with integration alongside qubit control electronics on the same board.
  • The approach removes the accuracy-latency trade-off that previously forced designers to accept higher logical error rates to meet the one-microsecond deadline.
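The first bullet's "keep pace" condition reduces to simple throughput arithmetic: if decoding takes longer than a QEC cycle, undecoded rounds accumulate without bound. The cycle and decode times below are assumed for illustration, not measurements from the paper:

```python
# Toy throughput check (assumed timings, not the paper's measurements):
# if decode latency exceeds the QEC cycle time, backlog grows linearly
# and real-time correction becomes impossible.

def backlog_after(rounds: int, cycle_ns: float, decode_ns: float) -> float:
    """Pending decode work, in rounds, after `rounds` QEC cycles."""
    growth_per_round = max(0.0, decode_ns / cycle_ns - 1.0)
    return rounds * growth_per_round

print(backlog_after(10_000, cycle_ns=1000, decode_ns=900))   # 0.0: keeps pace
print(backlog_after(10_000, cycle_ns=1000, decode_ns=1500))  # 5000.0 rounds behind
```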

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the same optimization pattern extends to distance nine or eleven, the latency margin could accommodate more complex decoding graphs without additional hardware.
  • Embedding the accelerator directly in the cryogenic control stack could eliminate the round-trip communication delay that currently adds to total correction time.
  • The technique may transfer to other neural-network decoders for color codes or heavy-hexagon codes once equivalent hardware-aware pruning rules are derived.

Load-bearing premise

The hardware-aware optimizations applied to the GNN decoder preserve its accuracy sufficiently to outperform prior decoders while meeting the one-microsecond timing constraint.
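One way to probe this premise is to bound the weight perturbation that fixed-point conversion introduces, since that perturbation is what ultimately must not degrade the decoder. The int8 scheme below is a generic sketch, not necessarily the paper's quantization format:

```python
import numpy as np

# Generic symmetric int8 quantization sketch (assumed scheme, not
# necessarily the paper's): round-to-nearest error is bounded by half
# a quantization step.
def quantize_int8(w: np.ndarray):
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(1)
w = rng.normal(size=(64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
err = np.max(np.abs(w - q.astype(np.float32) * scale))
print(err <= 0.5 * scale + 1e-6)      # per-weight error within half a step
```

A bounded per-weight error does not by itself guarantee a preserved logical error rate, which is why the premise is load-bearing and needs empirical support.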

What would settle it

Direct measurement on the target FPGA showing, for code distance seven, whether the logical error rate stays below that of the best competing decoder once latency is forced under one microsecond; a rise above the competing decoder would refute the central claim.

Figures

Figures reproduced from arXiv: 2603.22149 by Alessio Cicero, Luigi Altamura, Mats Granath, Moritz Lange, Pedro Trancoso.

Figure 1. The host computer sends a quantum program to the controller, which …
Figure 2. Surface code of distance 3, with qubits highlighted according to their …
Figure 3. Three pipeline stages architecture of the GNN.
Figure 5. Tail probabilities of the GNN input graph node count for code distance …
Figure 6. Effect of quantizing only the weights, the layer output features, or the biases …
Figure 7. Latency as a function of the number of input graph nodes.
Figure 8. Comparison of logical error rate and latency across different …
Original abstract

Quantum computers have the potential to solve certain complex problems in a much more efficient way than classical computers. Nevertheless, current quantum computer implementations are limited by high physical error rates. This issue is addressed by Quantum Error Correction (QEC) codes, which use multiple physical qubits to form a logical qubit to achieve a lower logical error rate, with the surface code being one of the most commonly used. The most time-critical step in this process is interpreting the measurements of the physical qubits to determine which errors have most likely occurred - a task called decoding. Consequently, the main challenge for QEC is to achieve error correction with high accuracy within the tight $1\mu s$ decoding time budget imposed by superconducting qubits. State-of-the-art QEC approaches trade accuracy for latency. In this work, we propose an FPGA accelerator for a Neural Network based decoder as a way to achieve a lower logical error rate than current methods within the tight time constraint, for code distance up to d=7. We achieved this goal by applying different hardware-aware optimizations to a high-accuracy GNN-based decoder. In addition, we propose several accelerator optimizations leading to the FPGA-based decoder achieving a latency smaller than $1\mu s$, with a lower error rate compared to the state-of-the-art.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes an FPGA accelerator for a Graph Neural Network (GNN)-based decoder for surface-code quantum error correction. Through hardware-aware optimizations including quantization and pruning, it claims to deliver end-to-end decoding latency below 1 μs while achieving lower logical error rates than state-of-the-art methods (MWPM and prior NN decoders) for code distances up to d=7.

Significance. If the accuracy-preservation and latency claims hold under identical noise models, the work would be significant for practical QEC: it directly targets the sub-1 μs coherence-time constraint of superconducting qubits and supplies a concrete, synthesizable FPGA implementation rather than an abstract algorithm. Reproducible hardware results and explicit baseline comparisons would strengthen its utility for near-term fault-tolerant experiments.
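A minimal sketch of the magnitude-based structured pruning alluded to in the summary, assuming a simple keep-ratio criterion the paper may not actually use:

```python
import numpy as np

# Hedged sketch of magnitude-based structured pruning (assumed criterion;
# the paper's pruning rule may differ): zero out the output channels
# whose weight rows have the smallest L2 norm.
def prune_rows(w: np.ndarray, keep_ratio: float) -> np.ndarray:
    norms = np.linalg.norm(w, axis=1)
    k = int(round(keep_ratio * w.shape[0]))
    keep = np.argsort(norms)[-k:]              # indices of the k largest rows
    mask = np.zeros(w.shape[0], dtype=bool)
    mask[keep] = True
    return w * mask[:, None]

rng = np.random.default_rng(2)
w = rng.normal(size=(16, 8))
pruned = prune_rows(w, keep_ratio=0.5)
print(int((np.linalg.norm(pruned, axis=1) > 0).sum()))  # 8 rows survive
```

Structured (row- or channel-level) sparsity is the FPGA-friendly variant: entire multiply-accumulate units can be removed from the datapath rather than scattered individual weights.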

major comments (2)
  1. [Results section] The post-optimization logical error rates for d=7 are stated to be lower than SOTA, yet no side-by-side table compares the optimized GNN against MWPM and the exact prior NN baselines under the same noise model, code distances, and measurement protocol; without this, the central outperformance claim cannot be verified.
  2. [Hardware-Aware Optimizations section] The manuscript does not report the logical error rate of the unoptimized GNN versus the quantized/pruned version for d=7, nor does it supply error bars or statistical details on how accuracy was measured after fixed-point conversion; this leaves the accuracy-preservation assumption untested and load-bearing for the latency-accuracy tradeoff claim.
minor comments (2)
  1. [Abstract] Quantitative latency and error-rate numbers are asserted but not supplied, reducing clarity for readers.
  2. [Figures] Several figure captions lack explicit axis units or legend definitions for the noise model parameters.
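Major comment 2 asks for error bars on the measured logical error rates; the binomial statistics involved can be sketched as follows, with illustrative counts rather than the paper's data:

```python
import math

# Normal-approximation confidence interval on a logical error rate
# estimated from Monte Carlo shots (illustrative numbers, not the
# paper's data).
def error_rate_ci(failures: int, shots: int, z: float = 1.96):
    """Point estimate and ~95% CI half-width for p = failures / shots."""
    p = failures / shots
    half = z * math.sqrt(p * (1.0 - p) / shots)
    return p, half

p, half = error_rate_ci(failures=230, shots=1_000_000)
print(f"p = {p:.2e} +/- {half:.2e}")
```

At error rates near 1e-4, distinguishing two decoders reliably requires on the order of millions of shots, which is why the referee's request for statistical detail is not pedantry.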

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help strengthen the clarity and verifiability of our claims. We address each major point below and have revised the manuscript to incorporate the requested comparisons and details.

Point-by-point responses
  1. Referee: [Results section] The post-optimization logical error rates for d=7 are stated to be lower than SOTA, yet no side-by-side table compares the optimized GNN against MWPM and the exact prior NN baselines under the same noise model, code distances, and measurement protocol; without this, the central outperformance claim cannot be verified.

    Authors: We agree that a direct side-by-side comparison table is necessary to substantiate the outperformance claim. In the revised manuscript, we have added a new table in the Results section that explicitly compares the logical error rates of the optimized GNN decoder against MWPM and the prior NN baselines. All entries use identical noise models, code distances up to d=7, and the same measurement protocol, confirming the lower error rates achieved by our approach. revision: yes

  2. Referee: [Hardware-Aware Optimizations section] The manuscript does not report the logical error rate of the unoptimized GNN versus the quantized/pruned version for d=7, nor does it supply error bars or statistical details on how accuracy was measured after fixed-point conversion; this leaves the accuracy-preservation assumption untested and load-bearing for the latency-accuracy tradeoff claim.

    Authors: We acknowledge the need for these details to validate accuracy preservation. The revised Hardware-Aware Optimizations section now reports the logical error rates for the unoptimized GNN versus the quantized/pruned version at d=7. We have also added error bars derived from multiple independent simulation runs and included a description of the statistical methodology and fixed-point conversion protocol used to measure post-optimization accuracy. revision: yes

Circularity Check

0 steps flagged

No circularity: engineering implementation of GNN decoder accelerator

full rationale

The paper presents an FPGA-based hardware accelerator for a pre-existing GNN decoder, applying standard optimizations such as quantization and pruning to meet latency constraints. No mathematical derivation chain, equations, or predictions are shown that reduce claimed performance metrics to parameters fitted from the same data or to self-citations. Claims rest on empirical benchmarking and hardware measurements rather than any self-definitional or fitted-input structure.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract does not enumerate free parameters or axioms; the central claim implicitly rests on the assumption that a pre-existing GNN decoder architecture can be ported to FPGA with accuracy-preserving optimizations, but no explicit free parameters, axioms, or invented entities are stated.

pith-pipeline@v0.9.0 · 5528 in / 1107 out tokens · 29278 ms · 2026-05-15T00:43:07.669982+00:00 · methodology

discussion (0)

