
arxiv: 2606.17214 · v1 · pith:JEL6UUDAnew · submitted 2026-06-15 · 🌀 gr-qc · astro-ph.IM

CASPER: Interpretable ResNet based Classifier with FastShap Explainer for Gravitational Wave Detection

Pith reviewed 2026-06-27 02:45 UTC · model grok-4.3

classification 🌀 gr-qc astro-ph.IM
keywords gravitational wave detection · ResNet classifier · FastSHAP explainer · LIGO data · chirp morphology · interpretable machine learning · real detector noise

The pith

CASPER pairs a ResNet classifier with FastSHAP to detect gravitational waves at 91% AUC from 260 real LIGO events without synthetic data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CASPER as an end-to-end pipeline that trains a residual neural network on 260 real gravitational wave events from the LIGO H1 and L1 detectors. It reports an AUC of 91 percent along with low false alarm rates after applying focal loss and Platt calibration. FastSHAP attribution maps are shown to recover the full chirp morphology and supply visual explanations for each classification decision. The pipeline is presented as using fewer parameters than typical deep learning models and running on standard CPUs. A sympathetic reader would care because the work aims to address generalization problems in detector noise while adding interpretability that matched filtering lacks.

Core claim

CASPER is an end-to-end pipeline that combines a residual convolutional neural network classifier with a FastSHAP explainer. Trained on 260 distinct real events fetched from the Gravitational Wave Open Science Centre, spanning an SNR range of 7-42 from both the H1 and L1 detectors with no synthetic augmentation, the classifier reaches an AUC of 91 percent with a low false alarm rate. Focal loss and Platt calibration are used to sharpen the decision boundary. The FastSHAP component produces attribution maps that recover the complete chirp morphology and supply detailed visual maps for interpreting the model's decisions. The full pipeline contains fewer parameters than standard deep learning models and requires no hardware beyond a standard CPU.

What carries the argument

The residual convolutional neural network classifier paired with the FastSHAP explainer that produces attribution maps highlighting signal features driving each classification.
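A minimal sketch of the prediction-consistency idea behind Shapley-style attribution, assuming a toy linear model rather than CASPER's ResNet. The names `consistency_loss`, `f`, and `phi` are hypothetical; masks are drawn uniformly here instead of from the Shapley kernel that FastSHAP uses, and masked features are simply set to zero. None of this is taken from the paper's code.

```python
import numpy as np

def consistency_loss(f, x, phi, n_masks=256, seed=0):
    """Mean squared gap between the masked prediction change and the
    attribution mass that the mask keeps (toy, uniform-mask version)."""
    rng = np.random.default_rng(seed)
    # Each row is a binary mask: 1 keeps a feature, 0 replaces it with zero.
    masks = rng.integers(0, 2, size=(n_masks, x.size)).astype(float)
    baseline = f(np.zeros_like(x))
    residuals = np.array([f(x * m) - baseline - phi @ m for m in masks])
    return float(np.mean(residuals ** 2))

# Toy linear model: with a zero baseline its exact Shapley values are w_j * x_j.
w = np.array([1.0, -2.0, 3.0])
x = np.array([0.5, 1.0, 2.0])
f = lambda z: float(w @ z)

exact = consistency_loss(f, x, w * x)         # attributions that match the model
wrong = consistency_loss(f, x, np.zeros(3))   # attributions that ignore the model
```

For the linear toy model the exact attributions drive the loss to numerical zero, while the all-zero attributions do not. That gap is the property an amortized explainer is trained to reproduce.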

If this is right

  • The classifier achieves 91 percent AUC and low false alarm rates on real detector data without synthetic augmentation.
  • FastSHAP attribution maps recover the complete chirp morphology and supply visual interpretations of decisions.
  • Focal loss and Platt calibration improve the decision boundary and generalization.
  • The pipeline uses fewer parameters than standard deep learning models and runs on standard CPUs.
  • It supplies an interpretable alternative to matched filtering that does not require pre-computed waveform templates.
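The focal loss and Platt calibration named in the list above can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the paper's implementation: the function names, the default `gamma` and `alpha`, and the plain gradient-descent fit are all hypothetical choices, since the source does not give those values.

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss (Lin et al. 2017): down-weights easy examples."""
    p = np.clip(np.asarray(p, float), eps, 1 - eps)
    y = np.asarray(y)
    p_t = np.where(y == 1, p, 1 - p)               # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)   # class-balancing weight
    return float(np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t)))

def fit_platt(scores, y, lr=0.1, steps=2000):
    """Platt scaling: fit sigmoid(a * score + b) to held-out labels by
    gradient descent on the negative log-likelihood."""
    scores = np.asarray(scores, float)
    y = np.asarray(y, float)
    a, b = 1.0, 0.0
    for _ in range(steps):
        q = 1.0 / (1.0 + np.exp(-(a * scores + b)))
        a -= lr * np.mean((q - y) * scores)
        b -= lr * np.mean(q - y)
    return a, b

def apply_platt(scores, a, b):
    """Map raw classifier scores to calibrated probabilities."""
    return 1.0 / (1.0 + np.exp(-(a * np.asarray(scores, float) + b)))
```

With `gamma=0` and `alpha=0.5` the focal loss reduces to half the ordinary binary cross-entropy, which is a convenient sanity check. The calibration parameters should be fitted on data the classifier was not trained on.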

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the attribution maps reliably isolate physical signal features, the same pipeline could be tested on other transient signals in noisy time-series data.
  • The CPU-only design opens the possibility of running the detector on modest hardware at multiple observatory sites.
  • Retraining on events from additional detectors would test whether the model has learned detector-specific noise patterns.

Load-bearing premise

The 260 real events from H1 and L1 across the given SNR range are sufficient to produce a model that generalizes to new real detector noise and avoids class overlap or train-test mismatch.

What would settle it

A held-out set of additional real LIGO events in the same SNR range where the model drops well below 91 percent AUC or shows elevated false alarms would falsify the generalization claim.
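One way to run such a check is to compute the AUC on the held-out events together with a bootstrap interval, so that a drop can be judged against sampling noise. The sketch below is an assumption-laden illustration: `auc_score` and `bootstrap_auc` are hypothetical helper names, and the percentile bootstrap is one of several reasonable interval choices.

```python
import numpy as np

def auc_score(scores, labels):
    """AUC as the probability that a positive outscores a negative
    (ties count one half)."""
    scores = np.asarray(scores, float)
    labels = np.asarray(labels, bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))

def bootstrap_auc(scores, labels, n_boot=1000, seed=0):
    """95% percentile-bootstrap interval for the AUC."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, float)
    labels = np.asarray(labels, bool)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, labels.size, labels.size)
        # Skip resamples that contain only one class; AUC is undefined there.
        if labels[idx].all() or not labels[idx].any():
            continue
        stats.append(auc_score(scores[idx], labels[idx]))
    low, high = np.percentile(stats, [2.5, 97.5])
    return float(low), float(high)
```

With only a few dozen held-out events the interval will be wide, which is itself useful information when comparing against the reported 91 percent.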

Figures

Figures reproduced from arXiv: 2606.17214 by R. Rai, R. Verma, Somya.

Figure 1
Figure 1: Data Pipeline Architecture. Data cleaning and Spectrogram creation. … the performance along with the use of FastSHAP [16] as an explainer in the said order. 2. Methods. 2.1. Data Acquisition and Conditioning. The strain data required for the model training and testing were automatically fetched from the LIGO open science center (GWOSC) site [17]. The data downloaded was sampled at 4096 Hz at 16s using GWpy li…
Figure 2
Figure 2: Training dataset compilation output (terminal log). A total of 194 …
Figure 3
Figure 3: Test dataset compilation output (terminal log). All 66 requested …
Figure 4
Figure 4: CASPER pipeline architecture. Left (blue): Data ingestion and preprocessing. Centre (green): ResNet CNN classifier. Right (orange): U-Net FastSHAP explainer (Eq. 1). Bottom (purple): Shared evaluation outputs. … prediction consistency objective: $L = \mathbb{E}_{m}\left[\, \sum_j \phi_j(x)\, m_j - \cdots \right]$
Figure 5
Figure 5: Training history across all epochs. Early stopping (patience 8) terminates training at the epoch of maximum validation AUC, with the best-checkpoint …
Figure 6
Figure 6: Mean per-event ROC curve (±1σ shaded band). Each event contributes an individual per-event AUC; the mean and standard deviation across events are reported in the legend. … in classification we can assess attribution quality without using models [16]. We see that the average confidence of the classifier drops by about 0.05 when the top 1% of most attributed pixels are masked; it also drops to about 0.19 when…
Figure 8
Figure 8: FastSHAP perturbation fidelity on the test set, titled “Interpretability …
Figure 9
Figure 9: CASPER inference on GW150914 (H1 detector). The panels share a common time axis (±0.5 s around merger). A portrait orientation places the three temporally aligned panels (Q-transform at the top, classifier confidence in the middle, Shapley attribution map at the bottom) one above the other so that a feature at a given time coordinate can be read vertically without horizontal eye movement. The frequency axis (30–4…
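The perturbation test described in the Figure 6 excerpt (masking the most strongly attributed pixels and watching the classifier confidence fall) can be sketched as follows. This is a hedged illustration: `mask_top_attributed`, `confidence_drop`, the zero fill value, and the stand-in `predict` callable are assumptions, and the paper may mask or fill differently.

```python
import numpy as np

def mask_top_attributed(image, attribution, frac, fill=0.0):
    """Return a copy of `image` with the top `frac` most attributed pixels
    replaced by `fill`."""
    k = max(1, int(round(frac * image.size)))
    # Flat indices of the k largest attribution values.
    top = np.argsort(attribution, axis=None)[::-1][:k]
    masked = image.astype(float)   # astype makes a fresh copy
    np.put(masked, top, fill)      # np.put indexes the flattened array
    return masked

def confidence_drop(predict, image, attribution, frac):
    """Confidence on the original input minus confidence after masking."""
    return predict(image) - predict(mask_top_attributed(image, attribution, frac))

# Stand-in classifier for illustration only: confidence is the mean pixel value.
toy_predict = lambda img: float(np.mean(img))
toy_image = np.arange(100, dtype=float).reshape(10, 10)
toy_drop = confidence_drop(toy_predict, toy_image, toy_image, frac=0.01)
```

Comparing the drop against the drop obtained by masking the same number of randomly chosen pixels gives a simple baseline for whether the attributions carry information.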
Original abstract

Traditional matched filtering has been the standard for Gravitational waves (GW) detection ever since LIGO was established, even though it requires pre-computed waveform templates and provides no accounts of information about which signal drove the decision of classification. Deep-learning alternatives showed competitive sensitivity, but system biasesincluding class overlap, imbalanced class weighting, limited sample variation, and traintest mismatchcontinue to cause problems with generalisation in real detector noise. We introduce CASPER-Classification with Attribution via ShaPlEy in Residual neural networks, an end-to-end pipeline combining residual convolutional neural network (CNN) classifier with a FastSHAP explainer. 260 distinct events from the Gravitational Wave open Science Centre were fetched across SNR range of 7-42 from both H1 and L1 detectors with no synthetic augmentation. The classifier achieves AUC (Area Under Curve) of 91% across the model with a low false alarm rate. Focal Loss and Platt Calibration were used to improve decision boundary and generalisation. FastSHAP attribution maps recover the complete chirp morphology and provides detailed maps for a visual interpretation of the decision. The complete pipeline contains fewer parameters than standard deep learning models and requires no hardware except a standard CPU making our model an effective lightweight pipeline for Gravitational Wave Detection under real life conditions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces CASPER, an end-to-end pipeline that combines a residual CNN classifier with a FastSHAP explainer for gravitational wave detection. It trains on 260 distinct real events fetched from GWOSC (SNR range 7-42, H1 and L1 detectors, no synthetic augmentation), reports an AUC of 91% with low false alarm rate using Focal Loss and Platt Calibration, claims that FastSHAP attribution maps recover complete chirp morphology for visual interpretability, and asserts that the pipeline has fewer parameters than standard deep learning models and runs on standard CPU hardware.

Significance. If the performance and generalization claims hold under rigorous validation, the work could offer a lightweight, interpretable alternative to matched filtering that provides visual explanations of classification decisions. The focus on real (non-augmented) data and CPU-only execution addresses practical constraints in gravitational wave analysis pipelines.

major comments (3)
  1. [Abstract / Methods] Abstract and Methods (dataset description): Training relies on exactly 260 distinct real events with no synthetic augmentation or large accompanying noise corpus. This scale is insufficient to substantiate generalizability claims in the presence of the very biases the abstract enumerates (class overlap, imbalance, limited sample variation, train-test mismatch), as standard GW deep-learning pipelines require 10^4–10^6 samples to achieve robustness against non-stationary detector noise.
  2. [Results] Results section: The 91% AUC, low false-alarm-rate, and chirp-recovery claims are presented without baseline comparisons, cross-validation procedure, error bars, statistical significance tests, or explicit handling of the listed biases. No quantitative metrics accompany the FastSHAP attribution maps beyond visual inspection.
  3. [Abstract] Abstract: The statement that the pipeline 'contains fewer parameters than standard deep learning models' is unsupported by any parameter counts, architecture table, or direct comparison.
minor comments (2)
  1. [Abstract] Abstract contains typographical errors: 'system biasesincluding' (missing space), 'traintest mismatch' (hyphenation), and subject-verb disagreement in 'FastSHAP attribution maps recover ... and provides'.
  2. [Abstract] The abstract asserts 'low false alarm rate' without defining the operating threshold or providing the corresponding precision-recall or ROC operating point.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments identify key gaps in validation, statistical rigor, and support for claims. We address each major comment below and indicate planned revisions.

  1. Referee: [Abstract / Methods] Abstract and Methods (dataset description): Training relies on exactly 260 distinct real events with no synthetic augmentation or large accompanying noise corpus. This scale is insufficient to substantiate generalizability claims in the presence of the very biases the abstract enumerates (class overlap, imbalance, limited sample variation, train-test mismatch), as standard GW deep-learning pipelines require 10^4–10^6 samples to achieve robustness against non-stationary detector noise.

    Authors: We agree that 260 real events without augmentation is a small scale that weakens generalizability claims, especially given the biases listed in the abstract. The design choice prioritized unaugmented real detector data over synthetic augmentation, but this limitation must be stated more explicitly. We will revise the abstract and methods to temper generalizability language, add a dedicated limitations paragraph discussing sample size and bias mitigation via focal loss and Platt scaling, and outline plans for larger datasets in future work. revision: yes

  2. Referee: [Results] Results section: The 91% AUC, low false-alarm-rate, and chirp-recovery claims are presented without baseline comparisons, cross-validation procedure, error bars, statistical significance tests, or explicit handling of the listed biases. No quantitative metrics accompany the FastSHAP attribution maps beyond visual inspection.

    Authors: These omissions are valid concerns. The revised manuscript will detail the cross-validation procedure, include error bars and statistical tests where feasible, add baseline comparisons (e.g., to simple matched-filtering thresholds or other lightweight CNNs), and introduce quantitative metrics for attribution maps such as the percentage of positive attribution overlapping expected chirp time-frequency support. revision: yes

  3. Referee: [Abstract] Abstract: The statement that the pipeline 'contains fewer parameters than standard deep learning models' is unsupported by any parameter counts, architecture table, or direct comparison.

    Authors: We will add an architecture table with exact parameter counts for CASPER and direct comparisons to standard models (e.g., ResNet-18/50 variants and other GW detection CNNs) in the methods or results section, and update the abstract accordingly. revision: yes
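The attribution metric proposed in the second response, the share of positive attribution that falls inside the expected chirp time-frequency support, could be computed along these lines. The function name and the boolean `support_mask` are hypothetical; how the support region would be derived from a waveform model is not specified in the source and is left out here.

```python
import numpy as np

def positive_attribution_overlap(attribution, support_mask):
    """Fraction of total positive attribution lying inside `support_mask`."""
    positive = np.clip(np.asarray(attribution, float), 0.0, None)
    total = positive.sum()
    if total == 0.0:
        return 0.0  # no positive attribution anywhere
    return float(positive[np.asarray(support_mask, bool)].sum() / total)

# Tiny worked example: 1 of the 4 units of positive attribution is in support.
toy_attr = np.array([[1.0, -1.0], [3.0, 0.0]])
toy_support = np.array([[True, False], [False, False]])
toy_overlap = positive_attribution_overlap(toy_attr, toy_support)
```

A value near 1 would indicate that the explainer concentrates on the chirp track, while a value near the support's area fraction would indicate no preference.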

Circularity Check

0 steps flagged

No circularity: empirical ML performance on fetched events, no derivations or self-referential reductions

full rationale

The paper presents an end-to-end ML pipeline (ResNet classifier + FastSHAP) trained on 260 real GWOSC events and reports an empirical AUC of 91%. No equations, first-principles derivations, or predictions appear that reduce the reported metrics to fitted inputs by construction. No self-citation chains or uniqueness theorems are invoked as load-bearing. The performance claim is a direct empirical result on the described dataset, consistent with the reader's assessment of score 2 (or lower). Standard data-scale concerns are separate from circularity analysis.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no model architecture details, hyperparameter values, or explicit assumptions beyond the listed data source are provided, preventing enumeration of free parameters or axioms.

pith-pipeline@v0.9.1-grok · 5767 in / 1288 out tokens · 40073 ms · 2026-06-27T02:45:55.959199+00:00 · methodology


Reference graph

Works this paper leans on

37 extracted references · 3 canonical work pages · 2 internal anchors

  1. [1] Abbott, B. P., et al. 2016, Phys. Rev. Lett., 116, 061102
  2. [2] Abbott, B. P., et al. 2017, Phys. Rev. Lett., 119, 161101
  3. [3] Abbott, R., et al. 2021, Phys. Rev. X, 11, 021053
  4. [4] Abbott, R., et al. 2023, Phys. Rev. X, 13, 041039
  5. [5] Allen, B., Anderson, W. G., Brady, P. R., Brown, D. A., & Creighton, J. D. E. 2012, Phys. Rev. D, 85, 122006
  6. [6] Dhurandhar, S. V. & Sathyaprakash, B. S. 1994, Phys. Rev. D, 49, 1707
  7. [7] Owen, B. J. & Sathyaprakash, B. S. 1999, Phys. Rev. D, 60, 022002
  8. [8] George, D. & Huerta, E. A. 2018, Phys. Rev. D, 97, 044039
  9. [9] Gabbard, H., Williams, M., Hayes, F., & Messenger, C. 2018, Phys. Rev. Lett., 120, 141103
  10. [10] Gebhard, T. D., Kilbertus, N., Harry, I., & Schölkopf, B. 2019, Phys. Rev. D, 100, 063015
  11. [11] Corizzo, R., Ceci, M., Zdravevski, E., & Japkowicz, N. 2020, Sci. Rep., 10, 20681
  12. [12] Yan, Z., et al. 2022, Phys. Rev. D, 105, 043006
  13. [13] Nagarajan, N. & Messenger, C. 2025, arXiv:2501.13846 [gr-qc]
  14. [14] Marx, E., Benoit, W., Gunny, A., Omer, R., Chatterjee, D., Venterea, R. C., Wills, L., Saleem, M., Moreno, E., et al. 2023, Phys. Rev. D, 108, 102004
  15. [15] Beheshtipour, B., & Papa, M. A. 2021, Phys. Rev. D, 103, 064027
  16. [16] Jethani, N., Sudarshan, M., Covert, I., Lee, S.-I., & Ranganath, R. 2022, Proc. AISTATS, PMLR 151
  17. [17] Vallisneri, M., et al. 2015, J. Phys.: Conf. Ser., 610, 012021
  18. [18] Macleod, D. M., et al. 2021, SoftwareX, 13, 100657
  19. [19] Butterworth, S. 1930, Exp. Wireless Wireless Eng., 7, 536
  20. [20] Park, D. S., et al. 2019, Proc. Interspeech, 2613
  21. [21] Lin, T.-Y., Goyal, P., Girshick, R., He, K., & Dollár, P. 2017, Proc. ICCV, 2999
  22. [22] Davis, D. & Walker, M. 2021, Galaxies, 10, 12
  23. [23] Cabero, M., et al. 2019, Class. Quant. Grav., 36, 155010
  24. [24] Pedregosa, F., et al. 2011, J. Mach. Learn. Res., 12, 2825
  25. [25] He, K., Zhang, X., Ren, S., & Sun, J. 2016, Proc. CVPR, 770
  26. [26] Ioffe, S. & Szegedy, C. 2015, Proc. ICML, PMLR 37, 448
  27. [27] Tompson, J., Goroshin, R., Jain, A., LeCun, Y., & Bregler, C. 2015, Proc. CVPR, 4060
  28. [28] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. 2014, J. Mach. Learn. Res., 15, 1929
  29. [29] Kingma, D. P. & Ba, J. 2015, Proc. ICLR 2015, arXiv:1412.6980
  30. [30] Ronneberger, O., Fischer, P., & Brox, T. 2015, Proc. MICCAI, LNCS 9351, 234
  31. [31] Platt, J. C. 1999, in Advances in Large Margin Classifiers, MIT Press, 61
  32. [32] Niculescu-Mizil, A. & Caruana, R. 2005, Proc. ICML, 625
  33. [33] Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. 2017, Proc. ICML, PMLR 70, 1321
  34. [34] Evans, M., et al. 2021, A Horizon Study for Cosmic Explorer: Science, Observatories, and Community, arXiv:2109.09882 [astro-ph.IM]
  35. [35] Schäfer, M. B., Ohme, F., & Nitz, A. H. 2020, Phys. Rev. D, 102, 063015
  36. [36] Messick, C., et al. 2017, Phys. Rev. D, 95, 042001

  37. [37] Nousi, P., et al. 2023, Phys. Rev. D, 108, 024022

Appendix A. Training Hyperparameters

Table A.4: Training hyperparameters and model resource summary for CASPER.

    Parameter               Value
    ResNet CNN Classifier
    Optimizer               Adam [29]
    Initial learning rate   10^-4
    Batch size              64
    Max epochs              30
    Early stopping          Patience 8, monitor: val_auc
    LR reduction            Patience 4, factor 0.5
    ...
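The early-stopping and learning-rate-reduction rows of Table A.4 describe a common training schedule. The framework-free sketch below shows how the two patience counters interact, assuming both monitor validation AUC and that the LR counter resets after each cut. The function name `run_schedule` and these reset rules are assumptions, as the source does not name the training framework or its exact callback semantics.

```python
def run_schedule(val_aucs, lr=1e-4, stop_patience=8, lr_patience=4, factor=0.5):
    """Replay a list of per-epoch validation AUCs and return
    (best_epoch, best_auc, final_lr) under the two patience rules."""
    best, best_epoch = float("-inf"), -1
    since_best = 0   # epochs since the last improvement (early stopping)
    since_cut = 0    # epochs since the last improvement or LR cut
    for epoch, auc in enumerate(val_aucs):
        if auc > best:
            best, best_epoch = auc, epoch
            since_best = since_cut = 0
            continue
        since_best += 1
        since_cut += 1
        if since_cut >= lr_patience:
            lr *= factor          # halve the learning rate on a plateau
            since_cut = 0
        if since_best >= stop_patience:
            break                 # stop; the best checkpoint is kept
    return best_epoch, best, lr

# Illustrative history: improvement for three epochs, then a plateau.
history = [0.60, 0.70, 0.80] + [0.75] * 10
best_epoch, best_auc, final_lr = run_schedule(history)
```

On this made-up history the learning rate is cut twice before the stopping rule fires, and the checkpoint from the third epoch is the one retained.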