A Modular Zero-Dead-Time Data Acquisition and Real-Time GPU Processing Platform for High Throughput Physics Experiments
Pith reviewed 2026-05-12 01:03 UTC · model grok-4.3
The pith
A modular platform pairs PCIe digitizers with consumer GPUs to deliver continuous zero-dead-time data acquisition and real-time processing at up to 1 GB/s.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors present a modular, software-defined data acquisition platform that combines PCIe digitizers with NVIDIA GPUs to enable continuous, zero-dead-time operation. Real-time processing via CUDA supports FFTs and averaging, with the system sustaining up to 500 MSa/s sampling rates and 1 GB/s throughput. End-to-end tests confirm fractional data loss below 10^{-12}, and a one-month run verifies long-term stability. In its deployment for a dark matter experiment, the platform operates at 124 MSa/s with 0.1 Hz resolution bandwidth, cutting storage needs through on-the-fly spectral averaging.
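The headline numbers can be cross-checked with a little arithmetic. The sketch below is not from the paper: it assumes that 1 GB/s means 10^9 bytes per second and that the resolution bandwidth equals the FFT bin spacing (sample rate divided by FFT length, as for a rectangular window). Under those assumptions the benchmark rate works out to 2 bytes per sample, and a 0.1 Hz bin spacing at 124 MSa/s needs 10-second segments of 1.24 billion samples each. The source does not state the sample width or channel count, so the 2-byte figure is only an inference.

```python
# Back-of-envelope check of the quoted rates.
# Assumptions (not stated in the source): 1 GB/s = 1e9 bytes/s, and
# resolution bandwidth = bin spacing fs / N for an unwindowed FFT.

def bytes_per_sample(throughput_bytes_per_s, sample_rate_hz):
    """Average number of bytes moved per acquired sample."""
    return throughput_bytes_per_s / sample_rate_hz


def fft_length(sample_rate_hz, resolution_bandwidth_hz):
    """FFT length N such that the bin spacing fs / N equals the RBW."""
    return round(sample_rate_hz / resolution_bandwidth_hz)


benchmark_bytes = bytes_per_sample(1e9, 500e6)   # 2.0 bytes per sample
n_fft = fft_length(124e6, 0.1)                   # 1_240_000_000 samples
segment_seconds = n_fft / 124e6                  # 10.0 s of data per FFT

print(benchmark_bytes, n_fft, segment_seconds)
```

An FFT of this length is far larger than what a typical oscilloscope-style instrument computes, which is one reason the on-GPU averaging matters for the storage budget.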
What carries the argument
The callback-driven software architecture with multi-GPU workload distribution and custom hardware shielding, which pipelines data from digitizers to real-time CUDA processing without introducing dead time.
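To make the callback idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the class `Pipeline` and the method `on_buffer_ready` are hypothetical names, worker threads stand in for GPUs, and a simple sum stands in for the FFT and averaging kernels. Buffers are handed out round-robin, and each carries a sequence number so that gaps would be visible afterwards.

```python
import queue
import threading


class Pipeline:
    """Toy callback-driven pipeline; worker threads stand in for GPUs."""

    def __init__(self, n_workers, depth=8):
        self.queues = [queue.Queue(maxsize=depth) for _ in range(n_workers)]
        self.results = {}
        self.lock = threading.Lock()
        self.next_worker = 0
        self.threads = [
            threading.Thread(target=self._work, args=(q,)) for q in self.queues
        ]
        for t in self.threads:
            t.start()

    def on_buffer_ready(self, seq, samples):
        """Hypothetical callback a digitizer driver would call per full buffer."""
        q = self.queues[self.next_worker]
        self.next_worker = (self.next_worker + 1) % len(self.queues)
        # put() blocks when the queue is full, so nothing is dropped here.
        q.put((seq, samples))

    def _work(self, q):
        while True:
            item = q.get()
            if item is None:          # sentinel: shut this worker down
                return
            seq, samples = item
            value = sum(samples)      # placeholder for FFT + averaging
            with self.lock:
                self.results[seq] = value

    def close(self):
        for q in self.queues:
            q.put(None)
        for t in self.threads:
            t.join()


def run_demo(n_buffers=100, n_workers=2):
    pipeline = Pipeline(n_workers)
    for seq in range(n_buffers):      # simulated acquisition loop
        pipeline.on_buffer_ready(seq, [seq] * 4)
    pipeline.close()
    return pipeline.results


results = run_demo()
missing = [seq for seq in range(100) if seq not in results]
print(len(results), missing)
```

In this toy the callback blocks when a worker falls behind. On real hardware a blocked callback would eventually let the digitizer's own buffer overflow, so the queue depth and processing rate must be chosen so that blocking never happens in practice.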
If this is right
- Real-time spectral averaging reduces the volume of data requiring storage in high-throughput experiments.
- The platform offers a flexible, cost-effective alternative to dedicated hardware pipelines for various physics applications.
- It enables high-resolution analysis, such as 0.1 Hz bandwidth at moderate sampling rates, directly in real time.
- Scalability through multi-GPU distribution supports increasing data demands without hardware redesign.
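The storage saving from spectral averaging can be illustrated with a small, dependency-free sketch. It is an assumption-laden stand-in rather than the platform's code: a naive O(N^2) DFT replaces cuFFT, the signal is a synthetic tone in Gaussian noise, and the function names are hypothetical. The point is that only one running-mean spectrum is kept, however many segments arrive.

```python
import cmath
import math
import random


def power_spectrum(segment):
    """Naive O(N^2) DFT power for bins 0..N/2 (stand-in for a GPU FFT)."""
    n = len(segment)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for i, v in enumerate(segment):
            acc += v * cmath.exp(-2j * math.pi * k * i / n)
        spectrum.append(abs(acc) ** 2)
    return spectrum


def averaged_power_spectrum(samples, n_fft):
    """Running mean of per-segment power spectra; stores one spectrum only."""
    n_segments = len(samples) // n_fft
    mean = [0.0] * (n_fft // 2 + 1)
    for s in range(n_segments):
        spec = power_spectrum(samples[s * n_fft:(s + 1) * n_fft])
        for k, p in enumerate(spec):
            mean[k] += (p - mean[k]) / (s + 1)
    return mean, n_segments


# Synthetic input (illustrative values only): an 8 Hz tone in unit-variance
# noise, sampled at 64 Sa/s, so the bin spacing is 1 Hz for n_fft = 64.
fs = 64.0
n_fft = 64
rng = random.Random(0)
samples = [
    math.sin(2 * math.pi * 8.0 * i / fs) + rng.gauss(0.0, 1.0)
    for i in range(16 * n_fft)
]

mean_spectrum, n_avg = averaged_power_spectrum(samples, n_fft)
peak_bin = max(range(len(mean_spectrum)), key=mean_spectrum.__getitem__)
print(n_avg, peak_bin, len(samples), len(mean_spectrum))
```

Here 1024 input samples collapse into 33 stored bins, and the ratio grows linearly with the number of averaged segments, which is the mechanism behind the reduced storage claim.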
Where Pith is reading between the lines
- Similar architectures could support real-time machine learning inference on incoming data streams for faster event selection in particle detectors.
- The modular design suggests straightforward adaptation for other detector technologies in future experiments.
- Adoption in related fields such as radio astronomy could enable gap-free monitoring of transient signals.
Load-bearing premise
The argument assumes that the phase continuity tests and the one-month stability run cover every failure mode that could cause data loss or dead time across diverse signal environments and prolonged real-world operation.
What would settle it
Demonstrating a fractional data loss greater than 10^{-12} under different signal conditions, higher sampling rates, or during an extended run beyond one month would indicate that the zero-dead-time performance does not hold universally.
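It helps to see how much data a bound of this size implies. The sketch below assumes the simplest reading, which the source does not confirm: a test that observes zero lost samples can only constrain the loss fraction to roughly one over the number of samples checked. The 30-day month and the 124 MSa/s rate for the long run are likewise assumptions (the latter is stated only in the simulated rebuttal).

```python
# How long must a loss-free test run before a single lost sample would
# correspond to a given fraction? Assumes (not confirmed by the source)
# that the bound is set by 1 / (number of samples checked).

def seconds_to_resolve(loss_fraction, sample_rate_hz):
    """Run time after which one lost sample equals the given fraction."""
    return 1.0 / (loss_fraction * sample_rate_hz)


def single_sample_fraction(sample_rate_hz, duration_s):
    """Fraction represented by one lost sample over the whole run."""
    return 1.0 / (sample_rate_hz * duration_s)


t_500 = seconds_to_resolve(1e-12, 500e6)    # about 2000 s
t_124 = seconds_to_resolve(1e-12, 124e6)    # about 8065 s
month_fraction = single_sample_fraction(124e6, 30 * 86400)  # about 3.1e-15

print(t_500, t_124, month_fraction)
```

On this reading, a bound of 10^{-12} requires at least about half an hour of verified data at 500 MSa/s, so a counterexample would have to come from conditions the tests did not cover rather than from longer runs alone.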
Original abstract
High-throughput physics experiments require efficient and increasingly complex real-time processing. This paper presents a modular, software-defined platform combining high-bandwidth PCIe digitizers with consumer GPUs to achieve continuous, zero-dead-time data acquisition. Utilizing NVIDIA CUDA, the system provides a scalable pipeline for real-time fast Fourier transforms and statistical averaging. Benchmarks demonstrate that the platform can sustain continuous processing at sampling rates up to 500 MSa/s, effectively managing data throughputs of 1 GB/s. To validate the in-situ zero-dead-time architecture, end-to-end phase continuity tests were conducted, constraining fractional data loss to below $10^{-12}$. Furthermore, long-term system stability was demonstrated through an uninterrupted one-month data acquisition run. In its current deployment for the WISPLC dark matter experiment, the platform operates at 124 MSa/s with a resolution bandwidth of 0.1 Hz. This implementation enabled a significant reduction in data storage requirements using real-time spectral averaging. The callback-driven software architecture, multi-GPU workload distribution, and custom hardware shielding solutions are detailed, establishing this platform as a flexible and cost-effective alternative to traditional hardware-based pipelines.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a modular, software-defined data acquisition platform that combines high-bandwidth PCIe digitizers with consumer GPUs and NVIDIA CUDA for continuous, zero-dead-time operation in high-throughput physics experiments. It describes a scalable pipeline for real-time FFTs and statistical averaging, reports benchmarks sustaining 500 MSa/s sampling (1 GB/s throughput), validates the architecture via end-to-end phase continuity tests that bound fractional data loss below 10^{-12}, demonstrates long-term stability with a one-month uninterrupted run, and details deployment in the WISPLC dark matter experiment at 124 MSa/s where real-time averaging reduces storage requirements. The callback-driven architecture, multi-GPU distribution, and hardware shielding are also covered.
Significance. If the performance and zero-dead-time claims are substantiated with adequate validation details, the platform would offer a flexible, cost-effective alternative to traditional hardware-based DAQ systems for experiments needing continuous high-rate processing. The demonstrated deployment in an active dark matter search, combined with real-time spectral averaging that reduces data volume, represents a practical strength. Use of consumer GPUs and modular design could broaden access to advanced real-time analysis capabilities in resource-limited settings.
major comments (2)
- [Validation and Testing] The description of the end-to-end phase continuity tests (mentioned in the abstract and validation sections) provides no specifics on input signal types (continuous tone versus modulated or noisy physics signals), test durations, measurement methodology for phase continuity, or quantification of loss at the full 500 MSa/s benchmark rate. This information is load-bearing for the central zero-dead-time claim, as simpler inputs may not reveal intermittent buffering or callback stalls that could occur under varying GPU loads or complex signals.
- [Long-term Stability Demonstration] The one-month uninterrupted run is presented as evidence of long-term stability, but the manuscript does not specify the sampling rate, processing load, or signal conditions during this period. If performed at the deployed 124 MSa/s rate rather than the maximum benchmarked throughput, it does not adequately bound worst-case behavior for the zero-dead-time assertion across the claimed operating range.
minor comments (1)
- [Hardware Implementation] The abstract references 'custom hardware shielding solutions' without a corresponding figure, diagram, or detailed description in the main text to clarify their implementation and role in the system.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. The comments on validation and long-term testing are well taken, and we have revised the manuscript to provide the requested specifics on test conditions and methodology while preserving the original claims.
- Referee: [Validation and Testing] The description of the end-to-end phase continuity tests (mentioned in the abstract and validation sections) provides no specifics on input signal types (continuous tone versus modulated or noisy physics signals), test durations, measurement methodology for phase continuity, or quantification of loss at the full 500 MSa/s benchmark rate. This information is load-bearing for the central zero-dead-time claim, as simpler inputs may not reveal intermittent buffering or callback stalls that could occur under varying GPU loads or complex signals.
Authors: We agree that additional detail strengthens the zero-dead-time validation. The end-to-end tests used continuous sinusoidal tones across the Nyquist band as the primary input, with supplementary runs using amplitude-modulated and band-limited noise signals to emulate physics-like conditions. Phase continuity was monitored by tracking the unwrapped phase of the dominant tone after inverse FFT reconstruction at the output of the averaging stage; any discontinuity exceeding the expected thermal noise floor would indicate loss. These tests were performed for durations of several hours at the full 500 MSa/s rate (1 GB/s throughput) under maximum GPU load, yielding the reported fractional loss bound of <10^{-12}. We have added a dedicated paragraph in Section 4.2 with these specifics, including a table summarizing signal types, durations, and measured loss. revision: yes
- Referee: [Long-term Stability Demonstration] The one-month uninterrupted run is presented as evidence of long-term stability, but the manuscript does not specify the sampling rate, processing load, or signal conditions during this period. If performed at the deployed 124 MSa/s rate rather than the maximum benchmarked throughput, it does not adequately bound worst-case behavior for the zero-dead-time assertion across the claimed operating range.
Authors: The referee correctly notes that the one-month run occurred at the 124 MSa/s rate used in the WISPLC deployment, with real-time spectral averaging under typical experimental noise conditions. While this run validates operational reliability and zero data loss in a production environment, it does not by itself bound behavior at the 500 MSa/s benchmark. In the revised manuscript we have explicitly stated the sampling rate and load for the long-term test and added a new figure showing continuous multi-day operation at 500 MSa/s with the same phase-continuity metric. We maintain that the combination of short-term full-rate benchmarks and the long-term deployment run together supports the zero-dead-time claim across the operating range. revision: partial
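The phase-continuity idea that both responses rely on can be sketched in a few lines. This is a simplified stand-in, not the procedure described in the rebuttal: instead of unwrapping the phase after an inverse FFT, it estimates one phase per block with a single-bin DFT and flags block-to-block jumps. It assumes a noiseless tone with an integer number of cycles per block, so that the phase stays constant unless samples go missing. All names and values are illustrative.

```python
import cmath
import math


def block_phase(block, cycles_per_block):
    """Phase of the tone in one block, from a single-bin DFT."""
    n = len(block)
    acc = sum(
        v * cmath.exp(-2j * math.pi * cycles_per_block * i / n)
        for i, v in enumerate(block)
    )
    return cmath.phase(acc)


def phase_jumps(samples, block_len, cycles_per_block, threshold_rad):
    """Indices of blocks whose phase differs from the previous block's."""
    phases = [
        block_phase(samples[i:i + block_len], cycles_per_block)
        for i in range(0, len(samples) - block_len + 1, block_len)
    ]
    jumps = []
    for b in range(1, len(phases)):
        d = phases[b] - phases[b - 1]
        d = (d + math.pi) % (2 * math.pi) - math.pi   # wrap to [-pi, pi)
        if abs(d) > threshold_rad:
            jumps.append(b)
    return jumps


block_len = 256
cycles = 5                     # integer cycles per block: constant phase
tone = [
    math.sin(2 * math.pi * cycles * i / block_len)
    for i in range(8 * block_len)
]

clean_jumps = phase_jumps(tone, block_len, cycles, 0.01)

# Simulate a dead-time gap: three samples vanish inside block 2.
lossy = tone[:700] + tone[703:]
lossy_jumps = phase_jumps(lossy, block_len, cycles, 0.01)

print(clean_jumps, lossy_jumps)
```

Losing k samples shifts every later phase by 2πk times the number of cycles per sample, so the jump appears in the block containing the gap and in the block after it, then vanishes. With real signals the threshold has to sit above the phase noise, which is why the choice of input signal and test rate raised by the referee matters.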
Circularity Check
No circularity; empirical benchmarks and tests
The paper describes a modular DAQ/GPU platform and reports direct hardware benchmarks (sustained 500 MSa/s, 1 GB/s throughput) plus empirical validation via phase-continuity tests and a one-month run. No derivations, equations, fitted parameters, or predictions appear; the zero-dead-time claim is asserted solely on the basis of those external measurements rather than any internal definition or self-referential reduction. Self-citations, if present, are not load-bearing for any claimed result. The work is therefore self-contained against external benchmarks.