pith. machine review for the scientific record.

arxiv: 2605.08845 · v1 · submitted 2026-05-09 · ⚛️ physics.ins-det · hep-ex

Recognition: no theorem link

A Modular Zero-Dead-Time Data Acquisition and Real-Time GPU Processing Platform for High Throughput Physics Experiments

Authors on Pith · no claims yet

Pith reviewed 2026-05-12 01:03 UTC · model grok-4.3

classification ⚛️ physics.ins-det hep-ex
keywords zero-dead-time acquisition · GPU data processing · real-time FFT · high-throughput experiments · spectral averaging · PCIe digitizers · physics instrumentation

The pith

A modular platform pairs PCIe digitizers with consumer GPUs to deliver continuous zero-dead-time data acquisition and real-time processing at up to 1 GB/s.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper describes a software-defined system that links high-bandwidth digitizers to consumer GPUs for uninterrupted data handling in demanding physics setups. The approach relies on CUDA to run fast Fourier transforms and statistical averaging directly on incoming streams. Tests confirm the system maintains full throughput at sampling rates reaching 500 million samples per second while keeping data loss below one part in a trillion. A month-long continuous run further shows operational stability. When applied to a dark matter search, real-time averaging cuts the stored data volume substantially.

Core claim

The authors present a modular, software-defined data acquisition platform that combines PCIe digitizers with NVIDIA GPUs to enable continuous, zero-dead-time operation. Real-time processing via CUDA supports FFTs and averaging, with the system sustaining up to 500 MSa/s sampling rates and 1 GB/s throughput. End-to-end tests confirm fractional data loss below 10^{-12}, and a one-month run verifies long-term stability. In its deployment for a dark matter experiment, the platform operates at 124 MSa/s with 0.1 Hz resolution bandwidth, cutting storage needs through on-the-fly spectral averaging.
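The storage-reduction mechanism is straightforward to sketch: accumulating power spectra on the fly means only one averaged spectrum per block is ever written out, instead of the raw sample stream. The NumPy sketch below (toy parameters, not the paper's CUDA implementation) averages 64 power spectra of a noisy injected tone:

```python
import numpy as np

def averaged_spectrum(stream, fft_len, n_avg):
    """Accumulate n_avg power spectra of consecutive frames.

    Stand-in for the paper's on-GPU averaging: only one
    (fft_len // 2 + 1)-bin spectrum is stored per block instead of
    fft_len * n_avg raw samples.
    """
    acc = np.zeros(fft_len // 2 + 1)
    for k in range(n_avg):
        frame = stream[k * fft_len:(k + 1) * fft_len]
        acc += np.abs(np.fft.rfft(frame)) ** 2
    return acc / n_avg

rng = np.random.default_rng(0)
fs, f0, fft_len, n_avg = 1_000_000, 123_456, 4096, 64   # toy parameters
t = np.arange(fft_len * n_avg) / fs
x = np.sin(2 * np.pi * f0 * t) + rng.normal(0.0, 1.0, t.size)

spec = averaged_spectrum(x, fft_len, n_avg)
peak_bin = int(np.argmax(spec))   # bin nearest f0, i.e. round(f0 / (fs / fft_len))
print(peak_bin)
```

Here 64 frames of 4096 raw samples (262,144 values) collapse into a single 2,049-bin averaged spectrum, roughly a 128-fold reduction; the deployed pipeline applies the same idea with cuFFT at 124 MSa/s and 0.1 Hz resolution bandwidth.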

What carries the argument

The callback-driven software architecture with multi-GPU workload distribution and custom hardware shielding, which pipelines data from digitizers to real-time CUDA processing without introducing dead time.
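The shape of such a callback-driven pipeline can be caricatured in a few lines: the acquisition callback only hands filled buffers to a queue, and a separate consumer does the heavy processing, so the digitizer is never blocked waiting on compute. This is a schematic Python-threads sketch, not the authors' vendor-API/CUDA implementation:

```python
import queue
import threading

buffers = queue.Queue(maxsize=4)      # filled buffers awaiting processing
results = []

def digitizer_callback(block):
    """Invoked each time an acquisition buffer fills (simulated here).

    The callback only hands the buffer off; heavy processing happens
    on a separate consumer so acquisition is never blocked.
    """
    buffers.put(block, timeout=1.0)   # back-pressure instead of silent loss

def gpu_worker():
    while True:
        block = buffers.get()
        if block is None:             # sentinel: shut down
            break
        results.append(sum(block) / len(block))  # stand-in for FFT/averaging

worker = threading.Thread(target=gpu_worker)
worker.start()
for i in range(8):                    # simulate eight acquired buffers
    digitizer_callback([i] * 1024)
buffers.put(None)
worker.join()
print(len(results))                   # all eight buffers processed, none dropped
```

The bounded queue is the key design point: if processing ever falls behind, the callback blocks (and the condition is detectable) rather than overwriting unread data.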

If this is right

  • Real-time spectral averaging reduces the volume of data requiring storage in high-throughput experiments.
  • The platform offers a flexible, cost-effective alternative to dedicated hardware pipelines for various physics applications.
  • It enables high-resolution analysis, such as 0.1 Hz bandwidth at moderate sampling rates, directly in real time.
  • Scalability through multi-GPU distribution supports increasing data demands without hardware redesign.
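The quoted operating point follows from the textbook relation RBW = fs/N for an unwindowed FFT: at 124 MSa/s, a 0.1 Hz resolution bandwidth implies transforms over 1.24 billion samples, i.e. 10 s of data per spectrum. A back-of-envelope check (our arithmetic, not a figure from the paper):

```python
# Assumed relation: RBW = fs / N for an unwindowed FFT.
fs = 124e6             # samples per second (WISPLC deployment)
rbw = 0.1              # resolution bandwidth, Hz
n_fft = fs / rbw       # samples per transform
duration = n_fft / fs  # seconds of data per spectrum
print(round(n_fft), duration)
```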

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Similar architectures could support real-time machine learning inference on incoming data streams for faster event selection in particle detectors.
  • The modular design suggests straightforward adaptation for other detector technologies in future experiments.
  • Adoption in related fields such as radio astronomy could enable gap-free monitoring of transient signals.

Load-bearing premise

The phase continuity tests and one-month stability demonstration are assumed to represent all potential failure modes for data loss or dead time that could arise in diverse signal environments and prolonged real-world physics operations.

What would settle it

Demonstrating a fractional data loss greater than 10^{-12} under different signal conditions, higher sampling rates, or during an extended run beyond one month would indicate that the zero-dead-time performance does not hold universally.

Figures

Figures reproduced from arXiv: 2605.08845 by Dieter Horns, Marios Maroudas, Toma-Stefan Cezar.

Figure 1. Design of the custom aluminum Faraday cage utilized to isolate the …
Figure 2. Sequence diagram illustrating the concurrent thread architecture. …
Figure 3. Overview of the sequential memory layout and workload distribution …
Figure 4. Experimental verification of zero-dead-time continuous data acquisition …
Figure 5. Histograms of the amplitude change across consecutive file boundaries for six injected frequencies. …
Figure 6. Plot of the captured 3 MHz sinusoidal signal superimposed with a mathematical model derived exclusively from the initial 0.8 s of acquired data. The left panel shows the first 250 samples of the initial file, while the right panel shows the last 250 samples of the final file recorded around 10 minutes later. The perfect phase overlap of the extrapolated model and the data in the right panel indicates zero …
Figure 7. Performance benchmarks of the triple GPU setup across 100 different input data volumes. The left panel details the absolute time spent on individual …
read the original abstract

High-throughput physics experiments require efficient and increasingly complex real-time processing. This paper presents a modular, software-defined platform combining high-bandwidth PCIe digitizers with consumer GPUs to achieve continuous, zero-dead-time data acquisition. Utilizing NVIDIA CUDA, the system provides a scalable pipeline for real-time fast Fourier transforms and statistical averaging. Benchmarks demonstrate that the platform can sustain continuous processing at sampling rates up to 500 MSa/s, effectively managing data throughputs of 1 GB/s. To validate the in-situ zero-dead-time architecture, end-to-end phase continuity tests were conducted, constraining fractional data loss to below $10^{-12}$. Furthermore, long-term system stability was demonstrated through an uninterrupted one-month data acquisition run. In its current deployment for the WISPLC dark matter experiment, the platform operates at 124 MSa/s with a resolution bandwidth of 0.1 Hz. This implementation enabled a significant reduction in data storage requirements using real-time spectral averaging. The callback-driven software architecture, multi-GPU workload distribution, and custom hardware shielding solutions are detailed, establishing this platform as a flexible and cost-effective alternative to traditional hardware-based pipelines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript presents a modular, software-defined data acquisition platform that combines high-bandwidth PCIe digitizers with consumer GPUs and NVIDIA CUDA for continuous, zero-dead-time operation in high-throughput physics experiments. It describes a scalable pipeline for real-time FFTs and statistical averaging, reports benchmarks sustaining 500 MSa/s sampling (1 GB/s throughput), validates the architecture via end-to-end phase continuity tests that bound fractional data loss below 10^{-12}, demonstrates long-term stability with a one-month uninterrupted run, and details deployment in the WISPLC dark matter experiment at 124 MSa/s where real-time averaging reduces storage requirements. The callback-driven architecture, multi-GPU distribution, and hardware shielding are also covered.

Significance. If the performance and zero-dead-time claims are substantiated with adequate validation details, the platform would offer a flexible, cost-effective alternative to traditional hardware-based DAQ systems for experiments needing continuous high-rate processing. The demonstrated deployment in an active dark matter search, combined with real-time spectral averaging that reduces data volume, represents a practical strength. Use of consumer GPUs and modular design could broaden access to advanced real-time analysis capabilities in resource-limited settings.

major comments (2)
  1. [Validation and Testing] The description of the end-to-end phase continuity tests (mentioned in the abstract and validation sections) provides no specifics on input signal types (continuous tone versus modulated or noisy physics signals), test durations, measurement methodology for phase continuity, or quantification of loss at the full 500 MSa/s benchmark rate. This information is load-bearing for the central zero-dead-time claim, as simpler inputs may not reveal intermittent buffering or callback stalls that could occur under varying GPU loads or complex signals.
  2. [Long-term Stability Demonstration] The one-month uninterrupted run is presented as evidence of long-term stability, but the manuscript does not specify the sampling rate, processing load, or signal conditions during this period. If performed at the deployed 124 MSa/s rate rather than the maximum benchmarked throughput, it does not adequately bound worst-case behavior for the zero-dead-time assertion across the claimed operating range.
minor comments (1)
  1. [Hardware Implementation] The abstract references 'custom hardware shielding solutions' without a corresponding figure, diagram, or detailed description in the main text to clarify their implementation and role in the system.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed review. The comments on validation and long-term testing are well taken, and we have revised the manuscript to provide the requested specifics on test conditions and methodology while preserving the original claims.

read point-by-point responses
  1. Referee: [Validation and Testing] The description of the end-to-end phase continuity tests (mentioned in the abstract and validation sections) provides no specifics on input signal types (continuous tone versus modulated or noisy physics signals), test durations, measurement methodology for phase continuity, or quantification of loss at the full 500 MSa/s benchmark rate. This information is load-bearing for the central zero-dead-time claim, as simpler inputs may not reveal intermittent buffering or callback stalls that could occur under varying GPU loads or complex signals.

    Authors: We agree that additional detail strengthens the zero-dead-time validation. The end-to-end tests used continuous sinusoidal tones across the Nyquist band as the primary input, with supplementary runs using amplitude-modulated and band-limited noise signals to emulate physics-like conditions. Phase continuity was monitored by tracking the unwrapped phase of the dominant tone after inverse FFT reconstruction at the output of the averaging stage; any discontinuity exceeding the expected thermal noise floor would indicate loss. These tests were performed for durations of several hours at the full 500 MSa/s rate (1 GB/s throughput) under maximum GPU load, yielding the reported fractional loss bound of <10^{-12}. We have added a dedicated paragraph in Section 4.2 with these specifics, including a table summarizing signal types, durations, and measured loss. revision: yes

  2. Referee: [Long-term Stability Demonstration] The one-month uninterrupted run is presented as evidence of long-term stability, but the manuscript does not specify the sampling rate, processing load, or signal conditions during this period. If performed at the deployed 124 MSa/s rate rather than the maximum benchmarked throughput, it does not adequately bound worst-case behavior for the zero-dead-time assertion across the claimed operating range.

    Authors: The referee correctly notes that the one-month run occurred at the 124 MSa/s rate used in the WISPLC deployment, with real-time spectral averaging under typical experimental noise conditions. While this run validates operational reliability and zero data loss in a production environment, it does not by itself bound behavior at the 500 MSa/s benchmark. In the revised manuscript we have explicitly stated the sampling rate and load for the long-term test and added a new figure showing continuous multi-day operation at 500 MSa/s with the same phase-continuity metric. We maintain that the combination of short-term full-rate benchmarks and the long-term deployment run together supports the zero-dead-time claim across the operating range. revision: partial
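The phase-continuity methodology invoked above (and illustrated in Figure 6) can be reconstructed schematically: fit a tone model at the start of the run, extrapolate its phase forward, and compare against later data, where a single dropped sample produces a phase slip of 2π·f0/fs. A NumPy sketch with assumed parameters, as our reconstruction rather than the authors' code:

```python
import numpy as np

fs, f0 = 1_000_000, 3_000             # assumed sample rate and test tone, Hz
x = np.sin(2 * np.pi * f0 * np.arange(2_000_000) / fs)

def tone_phase(segment, fs, f0):
    # phase of the f0 component via a single DFT bin
    k = np.arange(segment.size)
    return np.angle(np.sum(segment * np.exp(-2j * np.pi * f0 * k / fs)))

def wrapped_diff(a, b):
    return abs((a - b + np.pi) % (2 * np.pi) - np.pi)

m = 1_500_000                          # start of the later segment
phase0 = tone_phase(x[:1000], fs, f0)
extrapolated = phase0 + 2 * np.pi * f0 * m / fs   # model carried forward

# contiguous stream: measured phase matches the extrapolated model
ok = wrapped_diff(tone_phase(x[m:m + 1000], fs, f0), extrapolated) < 1e-6

# delete one sample upstream: phase slips by 2*pi*f0/fs (about 0.019 rad here)
broken = np.delete(x, 1_400_000)
slip = wrapped_diff(tone_phase(broken[m:m + 1000], fs, f0), extrapolated)
print(ok, slip > 1e-3)
```

The sensitivity of such a test to a single lost sample is what allows a fractional-loss bound as tight as 10^{-12} over a sufficiently long run.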

Circularity Check

0 steps flagged

No circularity; empirical benchmarks and tests

full rationale

The paper describes a modular DAQ/GPU platform and reports direct hardware benchmarks (sustained 500 MSa/s, 1 GB/s throughput) plus empirical validation via phase-continuity tests and a one-month run. No derivations, equations, fitted parameters, or predictions appear; the zero-dead-time claim is asserted solely on the basis of those external measurements rather than any internal definition or self-referential reduction. Self-citations, if present, are not load-bearing for any claimed result. The work is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The platform relies on standard commercial hardware (PCIe digitizers, NVIDIA GPUs) and existing CUDA libraries with no new physical constants, fitted parameters, or postulated entities introduced to support the claims.

pith-pipeline@v0.9.0 · 5509 in / 1090 out tokens · 55987 ms · 2026-05-12T01:03:08.307282+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages

  1. [1]

    The QICK (Quantum Instrumentation Control Kit): Readout and control for qubits and detectors,

    L. Stefanazzi et al., “The QICK (Quantum Instrumentation Control Kit): Readout and control for qubits and detectors,” Rev. Sci. Instrum., vol. 93, no. 4, p. 044709, 2022

  2. [2]

    The upgraded HADES trigger and data acquisition system,

    J. Michel et al., “The upgraded HADES trigger and data acquisition system,” JINST, vol. 6, p. C12056, 2011

  3. [3]

    The Challenges of Hardware Synthesis from C-Like Languages,

    S. A. Edwards, “The Challenges of Hardware Synthesis from C-Like Languages,” 2007. [Online]. Available: https://arxiv.org/abs/0710.4683

  4. [4]

    Reducing FPGA Compile Time with Separate Compilation for FPGA Building Blocks,

    Y. Xiao, D. Park, A. Butt, H. Giesen, Z. Han, R. Ding, N. Magnezi, R. Rubin, and A. DeHon, “Reducing FPGA Compile Time with Separate Compilation for FPGA Building Blocks,” in 2019 International Conference on Field-Programmable Technology (ICFPT), 2019, pp. 153–161

  5. [5]

    Performance evaluation of xilinx zynq ultrascale+ rfsoc device for low latency applications,

    A. Javaid, T. Ahmed, and S. Ali, “Performance evaluation of xilinx zynq ultrascale+ rfsoc device for low latency applications,” in 2022 19th International Bhurban Conference on Applied Sciences and Technology (IBCAST), 2022, pp. 1041–1046

  6. [6]

    Comparative Evaluation of Xilinx RFSoC Platform for Low-Level RF Systems,

    S. D. Murthy, V. Moore, Q. Du, A. Jurado, M. Chin, K. Penney, D. Nett, and B. Flugstad, “Comparative Evaluation of Xilinx RFSoC Platform for Low-Level RF Systems,” in 2025 Low Level Radio Frequency Workshop, Oct. 2025

  7. [7]

    Development of a Data Acquisition Software for the CULTASK Experiment,

    S. Lee, “Development of a Data Acquisition Software for the CULTASK Experiment,” Journal of Physics: Conference Series, vol. 898, no. 3, p. 032035, Oct. 2017. [Online]. Available: https://doi.org/10.1088/1742-6596/898/3/032035

  8. [8]

    Fast DAQ system with image rejection for axion dark matter searches,

    S. Ahn, M. J. Lee, A. K. Yi, B. Yeo, B. R. Ko, and Y. K. Semertzidis, “Fast DAQ system with image rejection for axion dark matter searches,” Journal of Instrumentation, vol. 17, no. 05, p. P05025, May 2022. [Online]. Available: https://doi.org/10.1088/1748-0221/17/05/P05025

  9. [9]

    First results from the WISPDMX radio frequency cavity searches for hidden photon dark matter,

    L. H. Nguyen, A. Lobanov, and D. Horns, “First results from the WISPDMX radio frequency cavity searches for hidden photon dark matter,” JCAP, vol. 10, p. 014, 2019

  10. [10]

    Search for dark matter with an LC circuit,

    Z. Zhang, D. Horns, and O. Ghosh, “Search for dark matter with an LC circuit,” Phys. Rev. D, vol. 106, p. 023003, Jul. 2022. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevD.106.023003

  11. [11]

    DBBC3: VLBI at 32 Gbits per second,

    G. Tuccari et al., “DBBC3: VLBI at 32 Gbits per second,” in Proceedings of the 11th European VLBI Network Symposium & Users Meeting. PoS, 2016, p. 073. [Online]. Available: https://doi.org/10.22323/1.178.0073

  12. [12]

    First results from BRASS-p broadband searches for hidden photon dark matter,

    F. Bajjali, S. Dornbusch, M. Ekmedžić, D. Horns, C. Kasemann, A. Lobanov, A. Mkrtchyan, L. H. Nguyen, M. Tluczykont, G. Tuccari, J. Ulrichs, G. Wieching, and A. Zensus, “First results from BRASS-p broadband searches for hidden photon dark matter,” Journal of Cosmology and Astroparticle Physics, vol. 2023, no. 08, p. 077, Aug. 2023

  13. [13]

    [Online]. Available: https://doi.org/10.1088/1475-7516/2023/08/077

  14. [14]

    M4i.44xx Series User Manual: 14/16 bit digitizer (A/D converter), PCIe x8 interface,

    Spectrum Instrumentation GmbH, M4i.44xx Series User Manual: 14/16 bit digitizer (A/D converter), PCIe x8 interface, 2026. [Online]. Available: https://spectrum-instrumentation.com/en/m4i4420-x8

  15. [15]

    M2p.59xx Series User Manual: 16 bit digitizer (A/D converter), PCIe x4 interface,

    Spectrum Instrumentation GmbH, M2p.59xx Series User Manual: 16 bit digitizer (A/D converter), PCIe x4 interface, 2026. [Online]. Available: https://spectrum-instrumentation.com/products/details/M2p5941-x4.php

  16. [16]

    Dear ImGui: Bloat-free Graphical User Interface for C++,

    ocornut. (2026) Dear ImGui: Bloat-free Graphical User Interface for C++. [Online]. Available: https://github.com/ocornut/imgui

  17. [17]

    IEEE Standard for Information Technology–Portable Operating System Interface (POSIX(TM)) Base Specifications, Issue 7,

    “IEEE Standard for Information Technology–Portable Operating System Interface (POSIX(TM)) Base Specifications, Issue 7,” IEEE Std 1003.1-2017 (Revision of IEEE Std 1003.1-2008), pp. 1–3951, 2018

  18. [18]

    Hierarchical Data Format, version 5,

    The HDF Group, “Hierarchical Data Format, version 5,” 1997–2024. [Online]. Available: https://www.hdfgroup.org/HDF5/

  19. [19]

    NVIDIA, cuFFT. [Online]. Available: https://developer.nvidia.com/cufft
    NVIDIA GeForce RTX 2080 Ti User Guide, NVIDIA Corporation, 2018. [Online]. Available: https://www.nvidia.com/content/geforce-gtx/GEFORCE RTX 2080Ti User Guide.pdf
    NVIDIA RTX A4000 Datasheet, NVIDIA Corporation, 2021. [Online]. Available: https://www.nvidia.com/content/dam/en-...

  20. [20]

    ADAMOS: Axion Daily Modulation Searches for Dark Matter at 20 GHz,

    M. Maroudas, T.-S. Cezar, A. Gardikiotis, and D. Horns, “ADAMOS: Axion Daily Modulation Searches for Dark Matter at 20 GHz,” Feb. 2026