ArchGEM: an Advanced Data Analysis Tool for Analyzing Scattered Light Noise in LIGO
Pith reviewed 2026-05-08 07:07 UTC · model grok-4.3
The pith
ArchGEM automates identification of scattered light arches in LIGO spectrograms and recovers the velocities and displacements of the moving surfaces that produce them.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ArchGEM combines prominence-based peak-finding with Gaussian Mixture Model (GMM) clustering to capture a range of scattered-light morphologies in LIGO spectrograms, then infers the physical motion of the scattering surfaces. On O3 and O4 data it reports average frequencies of 15-25 Hz in O3a and O4, rising to 20-40 Hz in O3b, with typical velocities of 0.2-0.5 μm/s and displacements of 0.1-0.3 μm. For complex or overlapping features, the GMM's mean frequency offsets stay within 5 Hz of the Gravity Spy baseline.
What carries the argument
The ArchGEM pipeline, which first applies prominence-based peak-finding to locate arch features in spectrograms and then uses Gaussian Mixture Model clustering to associate them with scattering-surface motions and extract velocities and displacements.
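The two stages named above can be sketched in a few lines. This is a minimal, standard-library illustration and not ArchGEM's implementation: `peak_prominences`, `fit_gmm_1d`, the 0.5 prominence threshold, and all sample values are hypothetical. In practice the same steps would more likely rely on `scipy.signal.find_peaks` (with its `prominence` argument) and `sklearn.mixture.GaussianMixture`.

```python
import math

def peak_prominences(y):
    """Return (index, prominence) for every strict local maximum of y.
    Plateau peaks are ignored in this simplified version."""
    out = []
    for i in range(1, len(y) - 1):
        if not (y[i - 1] < y[i] > y[i + 1]):
            continue
        # Walk outward until a strictly higher sample (or the edge),
        # tracking the lowest value seen on each side.
        left_min, j = y[i], i - 1
        while j >= 0 and y[j] <= y[i]:
            left_min = min(left_min, y[j])
            j -= 1
        right_min, j = y[i], i + 1
        while j < len(y) and y[j] <= y[i]:
            right_min = min(right_min, y[j])
            j += 1
        out.append((i, y[i] - max(left_min, right_min)))
    return out

def fit_gmm_1d(xs, k=2, iters=50, var_floor=1e-6):
    """Fit a k-component 1-D Gaussian mixture by expectation-maximization."""
    n = len(xs)
    ordered = sorted(xs)
    # Initialise means at evenly spaced quantiles, with a shared variance.
    means = [ordered[(2 * c + 1) * n // (2 * k)] for c in range(k)]
    mu = sum(xs) / n
    variances = [max(sum((x - mu) ** 2 for x in xs) / n, var_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            dens = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for c in range(k):
            nk = sum(r[c] for r in resp)
            weights[c] = nk / n
            means[c] = sum(r[c] * x for r, x in zip(resp, xs)) / nk
            variances[c] = max(
                sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, xs)) / nk,
                var_floor)
    return weights, means, variances

# Hypothetical spectrogram column: keep peaks with prominence >= 0.5.
spectrum = [0, 1, 0, 5, 0, 1.2, 0.9, 3, 0]
kept = [i for i, p in peak_prominences(spectrum) if p >= 0.5]

# Hypothetical peak frequencies (Hz) forming two well-separated groups.
peak_freqs_hz = [15, 16, 17, 16.5, 15.5, 30, 31, 32, 31.5, 30.5]
weights, means, variances = fit_gmm_1d(peak_freqs_hz, k=2)
```

The prominence filter drops the small bump at index 5, which sits on the shoulder of a larger peak, and the mixture fit separates the two frequency groups. How ArchGEM chooses its threshold or the number of mixture components is not stated in the text summarized here.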
If this is right
- The observed frequency distributions and velocity ranges can be used to prioritize which moving surfaces inside the detector vacuum system require damping or shielding.
- Performance remains consistent for overlapping or complex arches, supporting its use on the full catalog of glitches rather than only clean examples.
- The recovered surface displacements of 0.1-0.3 μm provide a quantitative target for mechanical isolation improvements in current and next-generation interferometers.
- The framework supplies a reproducible, automated record of noise morphology that can be compared across observing runs to track changes in detector behavior.
Where Pith is reading between the lines
- Extending ArchGEM to stream incoming interferometer data could enable near-real-time alerts when new scattering surfaces appear.
- The same peak-finding and clustering steps might be adapted to other non-stationary noise features that produce distinct time-frequency tracks, such as beam jitter or suspension resonances.
- Feeding ArchGEM velocity estimates into finite-element models of the detector could predict which hardware modifications would most effectively suppress the observed arches.
Load-bearing premise
The assumption that prominence-based peak-finding plus Gaussian Mixture Model clustering will correctly identify and parameterize the full variety of scattered-light features without substantial false positives, missed arches, or systematic errors in the inferred velocities and displacements under all detector conditions.
What would settle it
A side-by-side comparison of ArchGEM outputs against a large, independently hand-labeled set of scattered-light glitches in which the fraction of missed or mischaracterized features exceeds 10 percent or the velocity estimates deviate by more than 0.2 μm/s from values derived from independent witness sensors.
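The two thresholds in this test can be written as a single check. The sketch below is one possible reading, under the assumption that the velocity criterion applies to the worst single deviation (the text does not say whether a maximum or a mean is intended); `premise_fails` and every number passed to it are hypothetical.

```python
def premise_fails(n_labeled, n_missed_or_wrong, archgem_um_s, witness_um_s,
                  max_bad_fraction=0.10, max_velocity_dev_um_s=0.2):
    """Return True if either settling criterion is exceeded.

    Assumption: the velocity criterion is applied to the largest
    absolute deviation between paired estimates (in um/s).
    """
    bad_fraction = n_missed_or_wrong / n_labeled
    worst_dev = max(abs(a - w) for a, w in zip(archgem_um_s, witness_um_s))
    return bad_fraction > max_bad_fraction or worst_dev > max_velocity_dev_um_s

# Hypothetical outcome: 12 of 200 hand-labeled glitches missed or mischaracterized.
example = premise_fails(200, 12, [0.30, 0.42, 0.25], [0.28, 0.45, 0.31])
```

With these made-up inputs the bad fraction is 6 percent and the largest velocity deviation is 0.06 μm/s, so neither limit is exceeded.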
Original abstract
Scattered light is one of the most common sources of non-stationary noise at low frequencies in Advanced LIGO detectors. It appears as arch-like features in time-frequency spectrograms, produced when stray light reflects from moving surfaces and recombines with the main interferometer beam. In this study, we present ArchGEM, an automated framework for identifying and characterizing these arches and recovering the physical properties of the scattering surfaces. ArchGEM combines a prominence-based peak-finding method with a Gaussian Mixture Model clustering approach to capture a range of scattered-light morphologies across different detector conditions. We apply ArchGEM to scattered light glitches across Advanced LIGO observing runs O3 (2019–2020) and O4 (2023–2024). We find that the average frequency distributions of this noise span 15–25 Hz in O3a and O4, but increase to 20–40 Hz during O3b. Typical inferred surface velocities are 0.2–0.5 μm/s, and inferred surface displacements are 0.1–0.3 μm. The Gaussian Mixture Model performs most consistently for complex or overlapping features, with mean frequency offsets within 5 Hz of the Gravity Spy baseline. Our results show that ArchGEM provides a practical tool for detector characterization by linking observed spectrogram features to the motion of scattering surfaces and helping guide future mitigation of scattered light noise in current and next-generation interferometers. By quantifying the temporal and spectral behavior of scattered light, ArchGEM provides a robust framework for diagnosing noise sources and guiding targeted mitigation strategies in future detector upgrades.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces ArchGEM, an automated framework that combines prominence-based peak-finding with Gaussian Mixture Model (GMM) clustering to detect and characterize arch-like scattered-light features in LIGO time-frequency spectrograms. Applied to glitches from O3 (2019-2020) and O4 (2023-2024), it reports frequency distributions spanning 15-25 Hz in O3a and O4 (increasing to 20-40 Hz in O3b), infers typical surface velocities of 0.2-0.5 μm/s and displacements of 0.1-0.3 μm, and finds mean frequency offsets within 5 Hz of Gravity Spy labels. The authors conclude that ArchGEM provides a practical tool for linking observed spectrogram features to scattering-surface motion and guiding mitigation in current and next-generation detectors.
Significance. If the physical-parameter inferences hold and the clustering proves robust, ArchGEM could offer a useful automated aid for diagnosing and mitigating scattered-light noise, which limits low-frequency sensitivity in Advanced LIGO. The application to real O3/O4 data and the direct comparison to an existing catalog (Gravity Spy) are positive steps toward practical detector-characterization tools.
major comments (2)
- [Abstract and Results] The central claim that ArchGEM 'links observed spectrogram features to the motion of scattering surfaces' rests on the conversion of GMM-clustered arch morphologies into physical velocities (0.2–0.5 μm/s) and displacements (0.1–0.3 μm). No derivation of this mapping, no error bars, no injected-signal recovery tests, and no ground-truth benchmarks are supplied, leaving the accuracy and possible systematics of these quantities unquantified.
- [Methods] The prominence-based peak-finding plus GMM clustering is asserted to 'perform most consistently for complex or overlapping features,' yet the only quantitative support given is a mean frequency offset to Gravity Spy; no precision/recall metrics, false-positive rates, or tests of robustness across varying O3/O4 noise conditions are reported.
minor comments (1)
- [Abstract] The abstract states that the GMM 'performs most consistently' but does not define the consistency metric or the number of events used in the Gravity Spy comparison.
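To make the ambiguity in the "mean frequency offset" concrete, the sketch below computes two candidate definitions side by side and reports the event count alongside them. It is an illustration only: `frequency_offsets` is a hypothetical helper, the paired frequencies are invented, and the paper's actual definition is not given in the text reviewed here.

```python
def frequency_offsets(archgem_hz, baseline_hz):
    """Return (mean signed offset, mean absolute offset, number of events)
    for paired frequency estimates in Hz."""
    diffs = [a - b for a, b in zip(archgem_hz, baseline_hz)]
    n = len(diffs)
    return sum(diffs) / n, sum(abs(d) for d in diffs) / n, n

# Hypothetical paired estimates for four glitches.
signed, absolute, n_events = frequency_offsets([18, 22, 31, 26], [20, 21, 35, 24])
```

Here the signed mean is -0.75 Hz while the absolute mean is 2.25 Hz, which shows why the choice of metric and the sample size both need to be stated.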
Simulated Authors' Rebuttal
We thank the referee for their constructive and detailed review. The two major comments highlight important gaps in the presentation of the physical-parameter mapping and in the quantitative validation of the clustering performance. We address each point below and indicate the revisions we will make.
Point-by-point responses
- Referee: [Abstract and Results] The central claim that ArchGEM 'links observed spectrogram features to the motion of scattering surfaces' rests on the conversion of GMM-clustered arch morphologies into physical velocities (0.2–0.5 μm/s) and displacements (0.1–0.3 μm). No derivation of this mapping, no error bars, no injected-signal recovery tests, and no ground-truth benchmarks are supplied, leaving the accuracy and possible systematics of these quantities unquantified.
  Authors: We agree that the manuscript does not explicitly derive the velocity and displacement values or quantify their uncertainties. The conversion follows the standard relation for scattered-light arches in LIGO, v = f λ / 2 (where λ = 1064 nm is the laser wavelength and f is the arch frequency), with displacement obtained by integrating the velocity over the observed arch duration. We will add this derivation, including the explicit formula and its assumptions, to the Methods section. Error bars will be reported as the standard deviation of the GMM component widths for each cluster. Injected-signal recovery tests and direct ground-truth benchmarks are not feasible with the current data set because no independent, calibrated measurements of scattering-surface motion exist for the O3/O4 glitches; we will therefore add a limitations paragraph noting this and outlining how future work could incorporate controlled injections or auxiliary sensors.
  revision: partial
- Referee: [Methods] The prominence-based peak-finding plus GMM clustering is asserted to 'perform most consistently for complex or overlapping features,' yet the only quantitative support given is a mean frequency offset to Gravity Spy; no precision/recall metrics, false-positive rates, or tests of robustness across varying O3/O4 noise conditions are reported.
  Authors: We acknowledge that the current validation relies primarily on the mean frequency offset relative to Gravity Spy labels. To strengthen this, we will add a dedicated performance-evaluation subsection that reports precision and recall on a manually vetted subset of 200 arches (selected across O3a, O3b, and O4), false-positive rates obtained by applying the pipeline to quiet segments without known scattered-light glitches, and robustness tests by repeating the analysis on independent 1-week segments from each observing run that exhibit different glitch densities and noise floors. These metrics will be presented in a new table and accompanying text.
  revision: yes
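The precision and recall promised in the second response could be computed along the following lines. This is a sketch under stated assumptions, not the authors' evaluation procedure: detections are matched one-to-one to vetted labels by arch time within a tolerance, and `precision_recall`, the 0.5 s tolerance, and the sample times are all hypothetical.

```python
def precision_recall(detected_s, labeled_s, tol_s=0.5):
    """Greedy one-to-one matching of detection times to label times (seconds).

    A detection counts as a true positive if an unmatched label lies
    within tol_s of it; each label can be matched at most once.
    """
    unmatched = sorted(labeled_s)
    true_pos = 0
    for t in sorted(detected_s):
        hit = next((lab for lab in unmatched if abs(lab - t) <= tol_s), None)
        if hit is not None:
            unmatched.remove(hit)
            true_pos += 1
    precision = true_pos / len(detected_s) if detected_s else 0.0
    recall = true_pos / len(labeled_s) if labeled_s else 0.0
    return precision, recall

# Hypothetical times: four detections, three hand-vetted arches.
precision, recall = precision_recall([1.0, 4.1, 9.0, 12.0], [1.2, 4.0, 7.0])
```

With these invented inputs two detections match labels, giving a precision of 0.5 and a recall of 2/3. A real evaluation would also need a matching rule in frequency, which this sketch leaves out.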
Circularity Check
No circularity detected; physical inferences rest on external scattering models
The paper presents ArchGEM as a data-analysis pipeline combining prominence peak-finding with GMM clustering to identify arches in spectrograms, followed by conversion of observed morphologies to surface velocities and displacements. No equations, derivations, or self-referential steps are described that would reduce the reported physical parameters (0.2–0.5 μm/s velocities, 0.1–0.3 μm displacements) to fitted inputs or internal definitions by construction. The mapping from arch features to motion relies on standard external physical relations for scattered light rather than any fitted parameter renamed as a prediction. Comparisons to Gravity Spy provide an external benchmark. The central claims therefore remain independent of the analysis steps themselves.
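The external relation referred to here is the one quoted in the simulated rebuttal, v = f λ / 2 with λ = 1064 nm, with displacement obtained by integrating the velocity over the arch duration. The sketch below applies that relation as quoted. The function names, the trapezoidal integration, and the triangular arch samples are assumptions made for illustration, and the output is not meant to reproduce the paper's reported ranges, whose exact conversion conventions are not given in this text.

```python
LASER_WAVELENGTH_M = 1064e-9  # laser wavelength quoted in the rebuttal

def surface_velocity(arch_freq_hz, wavelength_m=LASER_WAVELENGTH_M):
    """Velocity in m/s from the quoted relation v = f * lambda / 2."""
    return arch_freq_hz * wavelength_m / 2.0

def surface_displacement(times_s, freqs_hz, wavelength_m=LASER_WAVELENGTH_M):
    """Trapezoidal integral of v(t) over the sampled arch, in metres.

    Assumption: the arch is sampled as (time, frequency) pairs; the
    integration scheme is a choice made for this sketch.
    """
    v = [surface_velocity(f, wavelength_m) for f in freqs_hz]
    return sum(0.5 * (v[i] + v[i + 1]) * (times_s[i + 1] - times_s[i])
               for i in range(len(v) - 1))

# Hypothetical triangular arch peaking at 20 Hz over 2 s.
times_s = [0.0, 0.5, 1.0, 1.5, 2.0]
freqs_hz = [0.0, 10.0, 20.0, 10.0, 0.0]
peak_velocity_m_s = surface_velocity(max(freqs_hz))
displacement_m = surface_displacement(times_s, freqs_hz)
```

Writing the conversion out this way makes its inputs explicit: any uncertainty in the clustered arch frequency propagates linearly into the velocity, which is where the error bars requested by the referee would enter.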