arxiv: 2605.16183 · v1 · pith:ZPQVW7ONnew · submitted 2026-05-15 · 🌌 astro-ph.IM · gr-qc

Rapid data quality investigations of gravitational-wave events with the Data Quality Report Builder toolkit

Pith reviewed 2026-05-19 18:23 UTC · model grok-4.3

classification 🌌 astro-ph.IM gr-qc
keywords: gravitational waves, data quality, LIGO, Virgo, KAGRA, observing run, automated vetting, event candidates

The pith

The Data Quality Report Builder toolkit identifies 96% of the data problems humans found in third observing run gravitational-wave candidates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces DQRbuild, a toolkit of automated tools and scientific tests built to vet data quality around gravitational-wave event candidates ahead of the fourth observing run. The authors apply these tools to every significant public alert from the third observing run and compare the output directly to the problems that human experts had flagged manually. This matters because the next run is expected to deliver far more candidates, making full manual review too slow for timely decisions. The toolkit aims to handle the bulk of the checks automatically while leaving room for human follow-up on uncertain cases.

Core claim

We present the Data Quality Report Builder toolkit, DQRbuild, a suite of data quality tools developed to vet gravitational-wave events. Running the toolkit on all significant candidates shared as public alerts in the third observing run shows that the automated tools identify 96% of the problems previously found by humans, with a 24% false alarm rate. The paper closes with a discussion of prospects and challenges for fully automating data quality vetting in future observing runs.
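The two headline figures can be illustrated with a minimal sketch. It assumes that recall is the fraction of human-flagged candidates that the automated tools also flag, and that the false alarm rate is the fraction of candidates with no human flag that the tools flag anyway; the paper's exact definitions may differ. The function name `vetting_metrics` and the candidate labels are hypothetical toy values, not O3 data.

```python
def vetting_metrics(all_candidates, human_flagged, auto_flagged):
    """Return (recall, false_alarm_rate), treating human flags as ground truth.

    Assumed definitions (not taken from the paper):
      recall           = auto-flagged human-flagged candidates / human-flagged candidates
      false_alarm_rate = auto-flagged unflagged-by-human candidates / unflagged-by-human candidates
    """
    all_candidates = set(all_candidates)
    human_flagged = set(human_flagged) & all_candidates
    auto_flagged = set(auto_flagged) & all_candidates

    true_pos = human_flagged & auto_flagged   # issues found by both
    clean = all_candidates - human_flagged    # no human-reported issue
    false_pos = auto_flagged & clean          # flagged only by the tools

    recall = len(true_pos) / len(human_flagged) if human_flagged else float("nan")
    far = len(false_pos) / len(clean) if clean else float("nan")
    return recall, far


# Toy example with made-up candidate labels.
candidates = [f"cand_{i:02d}" for i in range(10)]
human = {"cand_00", "cand_01", "cand_02", "cand_03"}
auto = {"cand_00", "cand_01", "cand_02", "cand_07"}

recall, far = vetting_metrics(candidates, human, auto)
print(f"recall={recall:.2f}, false alarm rate={far:.2f}")
```

In this toy set the tools recover three of the four human-flagged candidates and flag one of the six candidates that humans left unflagged.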

What carries the argument

The DQRbuild toolkit, which implements a collection of scientific tests to assess data quality around candidate gravitational-wave events.

If this is right

  • The majority of issues previously caught only by humans can now be flagged automatically.
  • Event validation for the fourth observing run can proceed much faster than in the third.
  • Human review will still be required for the candidates the toolkit flags, a set that includes false alarms at the reported 24% rate.
  • Full automation of the entire vetting process remains limited by challenges the authors identify.
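The semi-autonomous workflow described in the Figure 7 caption, where only candidates flagged by the toolkit go on to a human vetting stage, can be sketched roughly as follows. This is an illustration, not DQRbuild code: the `triage` function, the "Auto Pass" label for unflagged candidates, and the toy verdicts are all assumptions. Only the three human outcome categories come from the caption.

```python
from collections import Counter

# Human-stage categories named in the Figure 7 caption.
HUMAN_OUTCOMES = ("DQ Pass", "Retract", "DQ Warning")
# Hypothetical label for candidates that skip human vetting (an assumption).
AUTO_PASS = "Auto Pass"


def triage(candidates, auto_flagged, human_review):
    """Route candidates: only auto-flagged ones are sent to human review."""
    outcomes = {}
    for cand in candidates:
        if cand in auto_flagged:
            verdict = human_review(cand)
            if verdict not in HUMAN_OUTCOMES:
                raise ValueError(f"unexpected human verdict: {verdict!r}")
            outcomes[cand] = verdict
        else:
            outcomes[cand] = AUTO_PASS
    return outcomes


# Toy inputs with made-up candidate labels and verdicts.
candidates = ["cand_a", "cand_b", "cand_c", "cand_d", "cand_e"]
auto_flagged = {"cand_b", "cand_d", "cand_e"}
toy_verdicts = {"cand_b": "DQ Warning", "cand_d": "DQ Pass", "cand_e": "Retract"}

outcomes = triage(candidates, auto_flagged, toy_verdicts.__getitem__)
summary = Counter(outcomes.values())
human_load = sum(summary[label] for label in HUMAN_OUTCOMES) / len(candidates)
print(dict(summary))
print(f"fraction needing human review: {human_load:.2f}")
```

The `human_load` value is the quantity a workflow like this tries to shrink: the share of candidates that still needs a person to look at it.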

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar automated checks could be run in near real time to monitor detector data as new candidates appear.
  • Faster vetting may shorten the time between an alert and a confirmed detection announcement.
  • The same test framework could be extended to other gravitational-wave observatories or even non-GW transient searches.

Load-bearing premise

The data quality problems that appear in the fourth observing run will be similar enough to those in the third that the current set of tests will still catch most of them.

What would settle it

Applying DQRbuild to fourth observing run candidates and directly comparing its detections and false alarms against independent human reviews of the same events.

Figures

Figures reproduced from arXiv: 2605.16183 by Adrian Helmling-Cornell, Airene Ahuja, Annudesh Liyanage, Benjamin Mannix, Beverly Berger, Caitlin Rawcliffe, Chayan Chatterjee, Christiano Palomba, Derek Davis, Dimitrios Pesios, Francesco Di Renzo, Franz Herbst, Hirotaka Yuzurihara, Jess McIver, Joseph Areeda, Julian Ding, Man Leong Chan, Marissa Walker, Max Trevor, Nicolas Arnaud, Olivia Godwin, Paolina Doliva, Philippe Nguyen, Rachael Huxford, Raymond Frey, Raymond Ng, Robert Schofield, Ronaldas Macas, Sofia Alvarez-Lopez, Sophie Perry, Viola Sordini, Yannick Lecoeuche, Zach Yarbrough.

Figure 1: A visualization of the inter-connected components of the …
Figure 2: An example HTML results page for the candidate S200129m. This example highlights the …
Figure 3: An example result page for the HVeto task. The displayed candidate is S191011af, which …
Figure 4: An example result page for the GSpyNetTree task. The displayed candidate is S191225aq, …
Figure 5: The receiver operating characteristic curve for all tasks in the …
Figure 6: Results from O3 for each individual statistical task considered in this analysis. Two separate …
Figure 7: Sankey diagram [91] of how the DQRbuild toolkit could have been used in a semi-autonomous fashion to reduce the amount of human vetting required to process low latency candidates. In this diagram, all candidates with DQ issues identified by the DQRbuild toolkit are passed to an additional human vetting stage. The candidate is then manually sorted into the “DQ Pass,” “Retract,” or “DQ Warning” categories. C…
Original abstract

We present the Data Quality Report Builder toolkit, DQRbuild, a suite of data quality tools that have been developed to vet gravitational-wave events in preparation for the fourth LIGO-Virgo-KAGRA observing run. We explain the main functionality and the many scientific tests that we support. To validate the performance of the tools included in the toolkit, we run a series of tests on all significant candidates shared as public alerts in the third observing run to compare against what was manually reported using human intervention. We find that these automated tools can now identify 96% of the problems identified by humans during this previous observing run, with a 24% false alarm rate. We conclude with a commentary on the prospects and potential challenges for fully automating the process of vetting the data quality for gravitational-wave events identified in future observing runs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper presents the Data Quality Report Builder (DQRbuild) toolkit, a suite of automated tools and scientific tests for rapid vetting of gravitational-wave event data quality ahead of O4. It describes the toolkit's functionality, validates performance by running the tests on all significant O3 public-alert candidates and comparing results to prior human-generated reports, reports 96% recall of human-identified issues with a 24% false-alarm rate, and comments on prospects and challenges for full automation in future runs.

Significance. If the central performance claim holds, the work supplies a practical, deployable toolkit that can materially speed up data-quality investigations for the higher event rates expected in O4. The concrete O3 validation numbers and the explicit comparison against independent human reports constitute a reproducible benchmark that strengthens the manuscript's utility for the LIGO-Virgo-KAGRA collaboration.

major comments (1)
  1. [Validation results] Validation results (the section reporting the 96 % recall / 24 % false-alarm figures): the metrics treat the set of human-flagged problems on O3 alerts as complete ground truth. Any automated flag not present in the human logs is counted as a false alarm. If the human process itself had incomplete coverage of subtle non-stationarities or auxiliary-channel features, a non-negligible fraction of the reported 24 % false alarms may be genuine issues; this directly affects the claimed readiness metric and should be quantified or bounded.
minor comments (2)
  1. [Abstract] Abstract: the statement that the tools 'identify 96 % of the problems' would be clearer if it briefly indicated how test thresholds were chosen and whether the comparison accounts for changes in detector configuration between O3 and O4.
  2. [Description of scientific tests] The manuscript would benefit from an explicit statement of which scientific tests are new versus re-implementations of existing checks, to help readers assess incremental novelty.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive comment on the validation results. We address the point below and will update the manuscript accordingly.

Point-by-point responses
  1. Referee: Validation results (the section reporting the 96 % recall / 24 % false-alarm figures): the metrics treat the set of human-flagged problems on O3 alerts as complete ground truth. Any automated flag not present in the human logs is counted as a false alarm. If the human process itself had incomplete coverage of subtle non-stationarities or auxiliary-channel features, a non-negligible fraction of the reported 24 % false alarms may be genuine issues; this directly affects the claimed readiness metric and should be quantified or bounded.

    Authors: We agree that the human-flagged issues serve as our reference standard and that some automated flags absent from the human logs may correspond to genuine data-quality issues missed during the original manual review. This implies that the reported 24% false-alarm rate is an upper bound on the true rate at which the toolkit raises alerts on events that are in fact clean. In the revised manuscript we will add a short paragraph in the validation section that explicitly states this interpretation and notes that deriving a tighter numerical bound would require an independent, exhaustive audit of the full O3 dataset, an effort beyond the scope of the present work. The core performance numbers remain unchanged.

    revision: yes
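The upper-bound reading in the response can be checked with a short calculation. This is a sketch under assumptions not spelled out in the text: $N$ is the number of candidates with no human-flagged issue, $F$ of them were flagged by the toolkit, and $k$ of those $F$ flags point to genuine issues that the manual review missed. The symbols, and the definition of the false-alarm rate as $F/N$, are illustrative choices.

```latex
\[
\mathrm{FAR}_{\text{reported}} = \frac{F}{N},
\qquad
\mathrm{FAR}_{\text{true}} = \frac{F-k}{N-k},
\qquad 0 \le k \le F \le N,\quad k < N .
\]
\[
\frac{F-k}{N-k} \le \frac{F}{N}
\;\iff\; N(F-k) \le F(N-k)
\;\iff\; kF \le kN ,
\]
```

The last inequality holds because $F \le N$, so the true rate cannot exceed the reported one. If the rate is instead defined as the fraction of automated flags that are false, $F/(T+F)$ with $T$ true detections, reclassifying $k$ flags as genuine gives $(F-k)/(T+F)$, which is again no larger than the reported value.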

Circularity Check

0 steps flagged

No circularity: validation uses external human reports as independent benchmark

full rationale

The paper validates DQRbuild by running its automated tests on O3 public-alert candidates and directly comparing outputs to the set of issues previously flagged by human analysts. The reported 96% recall and 24% false-alarm figures are computed from this external comparison rather than from any internal fit, self-defined quantity, or self-citation chain. No equations, parameters, or uniqueness claims are present that reduce to the toolkit's own definitions or prior author work; the central performance claim therefore rests on an independent ground-truth source and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central performance claim rests on the assumption that the implemented tests are representative of all relevant data quality issues and that O3 human reports constitute a reliable ground truth; no new physical entities or fitted parameters are introduced.

axioms (1)
  • domain assumption A fixed set of scientific tests can reliably flag the data quality problems that affect gravitational-wave event validation.
    The toolkit's design and the 96% recovery figure both depend on this assumption.

pith-pipeline@v0.9.0 · 5807 in / 1201 out tokens · 53007 ms · 2026-05-19T18:23:23.660069+00:00 · methodology


Reference graph

Works this paper leans on

96 extracted references · 96 canonical work pages · 24 internal anchors

  [1] Capote E et al. 2025 Phys. Rev. D 111 062002 (Preprint 2411.14607)
  [2] Acernese F et al. (VIRGO) 2023 J. Phys. Conf. Ser. 2429 012040
  [3] Abe H et al. (KAGRA) 2023 PTEP 2023 10A101 (Preprint 2203.07011)
  [4] Abbott B P, Abbott R, Abbott T D, Abernathy M R et al. 2016 Physical Review Letters 116 061102 (Preprint 1602.03837)
  [5] Abbott B P et al. (LIGO Scientific, Virgo) 2017 GW170817: Observation of Gravitational Waves from a Binary Neutron Star Inspiral Phys. Rev. Lett. 119 161101 (Preprint 1710.05832)
  [6] Abbott B P et al. (LIGO Scientific, Virgo) 2019 Phys. Rev. X 031040 (Preprint 1811.12907)
  [7] Abbott R et al. (LIGO Scientific, Virgo) 2021 Phys. Rev. X 11 021053 (Preprint 2010.14527)
  [8] Abbott R et al. (LIGO Scientific, Virgo) 2021 ArXiv:2108.01045 Preprint 2108.01045
  [9] Abbott R et al. (KAGRA, Virgo, LIGO Scientific) 2023 Phys. Rev. X 13 041039 (Preprint 2111.03606)
  [10] Abac A G et al. (LIGO Scientific, VIRGO, KAGRA) 2025 Preprint 2508.18082
  [11] Davis D and Walker M 2022 Galaxies 10 12
  [12] Capote E, Dartez L and Davis D 2024 Class. Quant. Grav. 41 185001 (Preprint 2404.04761)
  [13] Abbott B P et al. (LIGO Scientific, Virgo) 2018 Class. Quant. Grav. 35 065010 (Preprint 1710.02185)
  [14] Davis D et al. (LIGO Scientific) 2021 Class. Quant. Grav. 38 135014 (Preprint 2101.11673)
  [15] Abbott B P et al. (LIGO Scientific, Virgo) 2020 Class. Quant. Grav. 37 055002 (Preprint 1908.11170)
  [16] Powell J 2018 Class. Quant. Grav. 35 155017 (Preprint 1803.11346)
  [17] Kwok J Y L, Lo R K L, Weinstein A J and Li T G F 2022 Phys. Rev. D 105 024066 (Preprint 2109.07642)
  [18] Mozzon S, Ashton G, Nuttall L K and Williamson A R 2022 Phys. Rev. D 106 043504 (Preprint 2110.11731)
  [19] Macas R, Pooley J, Nuttall L K, Davis D et al. 2022 Phys. Rev. D 105 103021 (Preprint 2202.00344)
  [20] Hourihane S, Chatziioannou K, Wijngaarden M, Davis D et al. 2022 Phys. Rev. D 106 042006 (Preprint 2205.13580)
  [21] Ghonge S, Brandt J, Sullivan J M, Millhouse M et al. 2023 Preprint 2311.09159
  [22] Davis D, Littenberg T B, Romero-Shaw I M, Millhouse M et al. 2022 Class. Quant. Grav. 39 245013 (Preprint 2207.03429)
  [23] Acernese F et al. (Virgo) 2023 Class. Quant. Grav. 40 185006 (Preprint 2210.15633)
  [24] Akutsu T et al. (KAGRA) 2021 PTEP 2021 05A102 (Preprint 2009.09305)
  [25] Gouaty R and the LIGO Scientific Collaboration 2008 Class. Quant. Grav. 25 184006 (Preprint arXiv:0805.2412)
  [26] Abadie J et al. (LIGO Scientific, Virgo) 2010 Phys. Rev. D 82 102001 (Preprint 1005.4655)
  [27] Abadie J et al. (LIGO Collaboration, Virgo Collaboration) 2012 Phys. Rev. D 85 082002 (Preprint 1111.7314)
  [28] Abbott B P et al. (LIGO Scientific, Virgo) 2016 Class. Quant. Grav. 33 134001 (Preprint 1602.03844)
  [29] Acernese F et al. (Virgo) 2023 Class. Quant. Grav. 40 185005 (Preprint 2210.15634)
  [30] LIGO Scientific Collaboration and Virgo Collaboration 2018 Data quality report user documentation docs.ligo.org/detchar/data-quality-report/
  [31] Chaudhary S S et al. 2024 Proc. Nat. Acad. Sci. 121 e2316474121 (Preprint 2308.04545)
  [32] Di Renzo F 2022 PoS ICHEP2022 110
  [33] Arnaud N 2023 Nucl. Instrum. Meth. A 1048 167945
  [34] Di Renzo F (VIRGO) 2025 PoS ICHEP2024 677
  [35] Soni S et al. (LIGO) 2025 Class. Quant. Grav. 42 085016 (Preprint 2409.02831)
  [36] Abac A G et al. (LIGO Scientific, VIRGO, KAGRA) 2025 Preprint 2508.18081
  [37] Hourihane S and Chatziioannou K 2025 Phys. Rev. D 112 084006 (Preprint 2506.21869)
  [38] Vazsonyi L and Davis D 2023 Class. Quant. Grav. 40 035008 (Preprint 2208.12338)
  [39] Mozzon S, Nuttall L K, Lundgren A, Dent T et al. 2020 Class. Quant. Grav. 37 215014 (Preprint 2002.09407)
  [40] Zevin M et al. 2017 Class. Quant. Grav. 34 064003 (Preprint 1611.04596)
  [41] Alvarez-Lopez S, Liyanage A, Ding J, Ng R and McIver J 2023 Preprint 2304.09977
  [42] Cornish N J and Littenberg T B 2015 Class. Quant. Grav. 32 135012 (Preprint 1410.3835)
  [43] Yamamura S, Yuzurihara H, Yamamoto T and Uchiyama T 2024 Class. Quant. Grav. 41 205008 (Preprint 2403.12731)
  [44] Chatterji S et al. 2004 Class. Quant. Grav. 21 S1809
  [45] Payne E, Hourihane S, Golomb J, Udall R et al. 2022 Phys. Rev. D 106 104017 (Preprint 2206.11932)
  [46] Macas R, Lundgren A and Ashton G 2023 Preprint 2311.09921
  [47] Udall R, Bini S, Chatziioannou K, Davis D et al. 2025 Preprint 2510.05029
  [48] Ray A, Banagiri S, Thrane E and Lasky P D 2025 Preprint 2510.07228
  [49] Nguyen P et al. (AdvLIGO) 2021 Class. Quant. Grav. 38 145001 (Preprint 2101.09935)
  [50] Helmling-Cornell A, Nguyen P, Schofield R and Frey R 2023 Preprint 2312.00735
  [51] Nitz A H, Dal Canton T, Davis D and Reyes S 2018 Phys. Rev. D 98 024050 (Preprint 1805.11174)
  [52] Messick C et al. 2017 Phys. Rev. D 95 042001 (Preprint 1604.04324)
  [53] Adams T, Buskulic D, Germain V, Guidi G M et al. 2016 Class. Quant. Grav. 33 175012 (Preprint 1512.02864)
  [54] Chu Q et al. 2022 Phys. Rev. D 105 024023 (Preprint 2011.06787)
  [55] Heinzel J, Talbot C, Ashton G and Vitale S 2023 Mon. Not. Roy. Astron. Soc. 523 5972–84 (Preprint 2304.02665)
  [56] Ruiz-Rocha K, Yelikar A B, Lange J, Gabella W et al. 2025 Astrophys. J. Lett. 985 L37 (Preprint 2502.17681)
  [57] Acernese F et al. (Virgo) 2022 Class. Quant. Grav. 39 235009 (Preprint 2203.04014)
  [58] Smith J R, Abbott T, Hirose E, Leroy N et al. 2011 Class. Quant. Grav. 28 235005 (Preprint 1107.2948)
  [59] Essick R, Godwin P, Hanna C, Blackburn L and Katsavounidis E 2020 Machine Learning: Science and Technology 2 015004 https://doi.org/10.1088/2632-2153/abab5f
  [60] Kawabe K et al. 2020 O3 RRT, lessons learned Tech. Rep. G2001971-v2 LIGO Laboratory and LIGO Scientific Collaboration https://dcc.ligo.org/LIGO-G2001971-v2
  [61] Sachdev S et al. 2020 Astrophys. J. Lett. 905 L25 (Preprint 2008.04288)
  [62] Magee R et al. 2021 Astrophys. J. Lett. 910 L21 (Preprint 2102.04555)
  [63] Developers G GraceDB https://www.lsc-group.phys.uwm.edu/daswg/projects/gracedb.html
  [64] Coughlin M W et al. 2019 Astrophys. J. Lett. 885 L19 (Preprint 1907.12645)
  [65] Herner K et al. (DECam) 2020 Astron. Comput. 33 100425 (Preprint 2001.06551)
  [66] Ahumada T et al. 2024 Publ. Astron. Soc. Pac. 136 114201 (Preprint 2405.12403)
  [67] Andreoni I et al. 2022 Astrophys. J. Supp. 260 18 (Preprint 2111.01945)
  [68] Coughlin M W et al. 2020 Mon. Not. Roy. Astron. Soc. 497 1181–96 (Preprint 2006.14756)
  [69] Morgan R et al. (DES) 2020 Astrophys. J. 901 83 (Preprint 2006.07385)
  [70] Drout M R et al. 2017 Science 358 1570–4 (Preprint 1710.05443)
  [71] Kasliwal M M et al. 2020 Astrophys. J. 905 145 (Preprint 2006.11306)
  [72] Amiri M et al. (CHIME/FRB) 2018 The Astrophysical Journal 863 48 (Preprint 1803.11235) https://doi.org/10.3847/1538-4357/aad188
  [73] James C W, Anderson G E, Wen L, Bosveld J et al. 2019 Mon. Not. Roy. Astron. Soc. 489 L75–9 (Preprint 1908.08688)
  [74] Tohuvavohu A et al. 2024 Astrophys. J. Lett. 975 L19 (Preprint 2410.05720)
  [75] Litzkow M, Livny M and Mutka M 1988 Condor - a hunter of idle workstations Proceedings of the 8th International Conference of Distributed Computing Systems
  [76] Couvares P, Kosar T, Roy A, Weber J and Wenger K 2007 Workflow Management in Condor (London: Springer London) pp 357–75 ISBN 978-1-84628-757-2 https://doi.org/10.1007/978-1-84628-757-2_22
  [77] LIGO Scientific Collaboration, Virgo Collaboration, KAGRA Collaboration 2025 igwn-alert Documentation https://igwn-alert.readthedocs.io/
  [78] Abac A G et al. (LIGO Scientific, VIRGO, KAGRA) 2026 In Preparation
  [79] Macleod D, Goetz E, Davis D, Bidler J et al. 2025 gwdetchar/gwdetchar: 2.3.1 https://doi.org/10.5281/zenodo.15530809
  [80] Robinet F, Arnaud N, Leroy N, Lundgren A et al. 2020 SoftwareX 12 100620 (Preprint 2007.11374)

Showing first 80 references.