Recovering Sub-threshold S-wave Arrivals in Deep Learning Phase Pickers via Shape-Aware Loss
Pith reviewed 2026-05-18 00:22 UTC · model grok-4.3
The pith
Deep learning seismic pickers miss clear S-waves because pointwise losses create an optimization trap; a shape-then-align strategy, implemented as a conditional GAN, recovers them and increases effective S-phase detections by 64%.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Phase arrival labels are structured shapes rather than independent probability estimates, requiring training objectives that preserve coherence. Temporal uncertainty in S-wave arrivals, CNN bias toward amplitude boundaries, and pointwise loss limitations interact to trap predictions below the detection threshold. The shape-then-align strategy, implemented through a conditional GAN, recovers previously sub-threshold signals and achieves a 64% increase in effective S-phase detections. Loss landscape visualization and numerical simulation provide a general methodology for analyzing how label designs and loss functions interact with temporal uncertainty.
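The amplitude suppression described above can be reproduced in a few lines under simplifying assumptions that are not taken from the paper: a Gaussian label of width label_sigma, Gaussian pick-time jitter of width jitter_sigma, and a predictor that cannot resolve the jitter from its input. Under a pointwise squared-error or cross-entropy loss, the best prediction at each sample is then the mean label, whose peak is label_sigma / sqrt(label_sigma^2 + jitter_sigma^2). The sketch below checks that closed form numerically; all names and parameter values are illustrative.

```python
import math

def gaussian_label(n, center, sigma):
    """Gaussian-shaped label with unit peak at `center` (assumed label design)."""
    return [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n)]

def mean_label_under_jitter(n, center, label_sigma, jitter_sigma, max_shift):
    """Average the label over integer pick-time shifts weighted by a Gaussian.

    For a pointwise loss, the per-sample minimiser is this mean label when the
    input carries no information about the shift (an assumption of this sketch).
    """
    shifts = range(-max_shift, max_shift + 1)
    weights = [math.exp(-0.5 * (s / jitter_sigma) ** 2) for s in shifts]
    total = sum(weights)
    mean = [0.0] * n
    for s, w in zip(shifts, weights):
        shifted = gaussian_label(n, center + s, label_sigma)
        for i in range(n):
            mean[i] += (w / total) * shifted[i]
    return mean

# Illustrative values only: label width 10 samples, jitter width 20 samples.
n, center = 401, 200
label_sigma, jitter_sigma = 10.0, 20.0

mean_label = mean_label_under_jitter(n, center, label_sigma, jitter_sigma, max_shift=100)
numeric_peak = max(mean_label)
closed_form_peak = label_sigma / math.sqrt(label_sigma ** 2 + jitter_sigma ** 2)

print(f"numeric peak     = {numeric_peak:.4f}")
print(f"closed-form peak = {closed_form_peak:.4f}")
```

With these illustrative values the pointwise-optimal peak is about 0.45, so it would sit below a detection threshold of 0.5 (itself an assumed value) even though every individual label peaks at 1.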
What carries the argument
The shape-then-align strategy, implemented as a conditional GAN: it first enforces coherence between the shape of the predicted probability curve and the label shape, then refines temporal alignment.
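For orientation, the textbook conditional GAN objective can be written as below, with x the input waveform, y the labeled probability curve, G the picker acting as generator, and D a discriminator that judges (waveform, curve) pairs. This mapping of symbols is an assumption for illustration; the paper's exact adversarial loss and any auxiliary terms are not stated on this page.

```latex
\min_{G}\,\max_{D}\;
\mathbb{E}_{(x,y)}\big[\log D(x, y)\big]
+ \mathbb{E}_{x}\big[\log\big(1 - D(x, G(x))\big)\big]
```

Because D scores the whole curve at once, its feedback depends on the overall shape rather than on independent per-sample errors, which appears to be the property the shape-then-align argument relies on.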
Load-bearing premise
The three diagnosed factors are the dominant causes of the sub-threshold failure mode and the conditional GAN version of shape-then-align generalizes beyond the tested datasets and models.
What would settle it
Retrain a standard phase picker on the same data using the shape-aware loss and measure effective S-phase detections on an independent test set against the pointwise-loss baseline; if the roughly 64% gain disappears, the claim is falsified.
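A test like this needs an explicit counting rule. The page does not define "effective S-phase detection", so the sketch below assumes one: a trace counts when its peak probability reaches a threshold and the peak falls within a tolerance of the labeled arrival. The function names, threshold, tolerance, and toy traces are all hypothetical.

```python
def peak_index(prob):
    """Index of the largest value in a probability sequence."""
    return max(range(len(prob)), key=lambda i: prob[i])

def is_effective_detection(prob, true_idx, threshold=0.5, tolerance=5):
    """Assumed rule: peak is at or above threshold and near the labeled arrival."""
    idx = peak_index(prob)
    return prob[idx] >= threshold and abs(idx - true_idx) <= tolerance

def count_effective(traces, true_indices, **kwargs):
    """Number of traces that satisfy the assumed detection rule."""
    return sum(is_effective_detection(p, t, **kwargs)
               for p, t in zip(traces, true_indices))

def relative_gain(baseline_hits, new_hits):
    """Relative change in detections, e.g. 0.64 for a 64% increase."""
    if baseline_hits == 0:
        raise ValueError("relative gain is undefined for a zero baseline")
    return (new_hits - baseline_hits) / baseline_hits

# Toy traces, not data from the paper: the labeled arrival is at index 3.
true_indices = [3, 3, 3]
baseline = [
    [0.0, 0.1, 0.2, 0.30, 0.2, 0.1, 0.0],  # sub-threshold peak
    [0.0, 0.1, 0.4, 0.60, 0.4, 0.1, 0.0],  # detected
    [0.0, 0.1, 0.2, 0.40, 0.2, 0.1, 0.0],  # sub-threshold peak
]
retrained = [
    [0.0, 0.2, 0.5, 0.70, 0.5, 0.2, 0.0],  # recovered
    [0.0, 0.1, 0.4, 0.65, 0.4, 0.1, 0.0],  # still detected
    [0.0, 0.1, 0.3, 0.45, 0.3, 0.1, 0.0],  # still sub-threshold
]

baseline_hits = count_effective(baseline, true_indices, tolerance=1)
retrained_hits = count_effective(retrained, true_indices, tolerance=1)
gain = relative_gain(baseline_hits, retrained_hits)
```

Reporting the same gain under several thresholds and tolerances would show whether a headline figure depends on one particular counting rule.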
Original abstract
Deep learning has transformed seismic phase picking, but a systematic failure mode persists: for some S-wave arrivals that appear unambiguous to human analysts, the model produces only a distorted peak trapped below the detection threshold, even as the P-wave prediction on the same record appears flawless. By examining training dynamics and loss landscape geometry, we diagnose this amplitude suppression as an optimization trap arising from three interacting factors. Temporal uncertainty in S-wave arrivals, CNN bias toward amplitude boundaries, and the inability of pointwise loss to provide lateral corrective forces combine to create the trap. The diagnosis reveals that phase arrival labels are structured shapes rather than independent probability estimates, requiring training objectives that preserve coherence. We formalize this as the shape-then-align strategy and validate it through a conditional GAN proof of concept, recovering previously sub-threshold signals and achieving a 64% increase in effective S-phase detections. Beyond this implementation, the loss landscape visualization and numerical simulation techniques we introduce provide a general methodology for analyzing how label designs and loss functions interact with temporal uncertainty, transforming these choices from trial-and-error into principled analysis.
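The missing lateral corrective force can be illustrated with a small numerical experiment. The sketch assumes a Gaussian label and uses mean-squared error as a stand-in for the pointwise loss; neither is confirmed as the paper's exact setup, and all names and values are illustrative. It compares the loss of a full-amplitude but misaligned peak with the loss of an all-zero prediction, and checks how the loss changes with shift once the two peaks no longer overlap.

```python
import math

def gaussian(n, center, sigma, amplitude=1.0):
    """Gaussian-shaped curve sampled on n points."""
    return [amplitude * math.exp(-0.5 * ((i - center) / sigma) ** 2)
            for i in range(n)]

def mse(pred, target):
    """Pointwise mean-squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

n, sigma, label_center = 1001, 10.0, 500
label = gaussian(n, label_center, sigma)

# Loss of predicting nothing at all.
loss_zero = mse([0.0] * n, label)

# Loss of a correctly shaped, full-amplitude peak displaced by `shift` samples.
loss_by_shift = {
    shift: mse(gaussian(n, label_center + shift, sigma), label)
    for shift in (0, 10, 50, 100, 150)
}

# Change in loss between two far-displaced positions: the "lateral force"
# available to pull a distant peak toward the label.
lateral_change_far = loss_by_shift[150] - loss_by_shift[100]

for shift, loss in loss_by_shift.items():
    print(f"shift={shift:4d}  mse={loss:.6f}  (zero prediction: {loss_zero:.6f})")
```

In this toy setting a well-shaped peak that is displaced by several label widths costs roughly twice as much as predicting zero, and moving it sideways barely changes the loss, so lowering the amplitude is the cheaper route for the optimizer.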
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper diagnoses a systematic failure mode in deep learning seismic phase pickers: unambiguous S-wave arrivals produce only distorted sub-threshold peaks despite flawless P-wave predictions on the same record. It attributes this to three interacting factors (temporal uncertainty in S arrivals, CNN amplitude bias, and pointwise loss limitations) that create an optimization trap, then formalizes a shape-then-align strategy and validates it via a conditional GAN proof-of-concept that recovers signals and yields a 64% increase in effective S-phase detections. The work also introduces loss-landscape visualization and numerical simulation techniques for analyzing label-loss interactions under temporal uncertainty.
Significance. If the central claim holds after proper isolation of the mechanism, the result would be significant for improving automated S-phase detection in seismic monitoring, with potential carry-over to other time-series tasks involving structured labels and uncertainty. Credit is given for the empirical cGAN demonstration, the formalization of shape-then-align, and the methodological tools for principled loss-function analysis.
major comments (2)
- [Abstract] Abstract: the reported 64% gain in effective S-phase detections is presented without data-split details, baseline model specification, error bars, or tests on alternative architectures, leaving the magnitude and robustness of the improvement unverified.
- [Results] Results / proof-of-concept section: the attribution of recovery specifically to the shape-then-align component lacks an isolating ablation (e.g., replacing the conditional GAN with a direct shape-preserving regularizer such as trace-wise cross-correlation or SSIM on the probability sequence while holding model and data fixed); without this control the observed gain cannot be confidently distinguished from generic adversarial regularization.
minor comments (2)
- [Abstract] Abstract: define 'effective S-phase detections' explicitly and state how the 64% figure is computed relative to the baseline.
- [Methods] Methods: clarify the precise architecture of the conditional GAN generator and discriminator and the form of the shape-aware loss term.
Simulated Author's Rebuttal
We are grateful to the referee for the positive assessment of our work's significance and for the constructive major comments. We address each point below, proposing revisions where appropriate to enhance the clarity and robustness of our claims.
-
Referee: [Abstract] Abstract: the reported 64% gain in effective S-phase detections is presented without data-split details, baseline model specification, error bars, or tests on alternative architectures, leaving the magnitude and robustness of the improvement unverified.
Authors: We acknowledge that the abstract prioritizes conciseness and omits specific experimental details. The manuscript details the data splits and the baseline model (a standard phase picker with pointwise loss), and it includes error bars in the results section. To address this, we will revise the abstract to include a brief statement on the evaluation setup and baseline for better context. Tests on alternative architectures are noted as future work given the proof-of-concept nature, but we will add a discussion on potential generalizability.
revision: partial
-
Referee: [Results] Results / proof-of-concept section: the attribution of recovery specifically to the shape-then-align component lacks an isolating ablation (e.g., replacing the conditional GAN with a direct shape-preserving regularizer such as trace-wise cross-correlation or SSIM on the probability sequence while holding model and data fixed); without this control the observed gain cannot be confidently distinguished from generic adversarial regularization.
Authors: This is a valid point regarding the need for stronger isolation of the mechanism. Our cGAN implementation serves as a proof-of-concept for the shape-then-align strategy, where the adversarial component encourages shape coherence. To better distinguish from generic adversarial effects, we agree to include an ablation study using a direct shape-preserving regularizer such as SSIM on the probability sequence in the revised version, while keeping the model and data fixed. This will help confirm the specific contribution of the shape-aware approach.
revision: yes
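A direct shape-preserving term of the kind discussed in this exchange could look like the sketch below. It scores shape by the maximum cosine similarity over a range of lags (a simple form of trace-wise cross-correlation), so a correctly shaped but displaced or scaled curve is not penalized, and it reports the best lag separately for an alignment step. This is a swapped-in illustration, not the authors' ablation; the function names and parameters are hypothetical, and a training version would need to be rewritten in an autodiff framework.

```python
import math

def shifted_cosine(pred, label, lag):
    """Cosine similarity between label[i] and pred[i + lag], zero-padded at the ends."""
    n = len(label)
    dot = sum(pred[i + lag] * label[i] for i in range(n) if 0 <= i + lag < n)
    norm = math.sqrt(sum(p * p for p in pred) * sum(v * v for v in label))
    return dot / norm if norm > 0 else 0.0

def shape_loss(pred, label, max_lag):
    """Return (1 - best similarity, best lag) over lags in [-max_lag, max_lag]."""
    best_lag, best_score = max(
        ((lag, shifted_cosine(pred, label, lag))
         for lag in range(-max_lag, max_lag + 1)),
        key=lambda pair: pair[1],
    )
    return 1.0 - best_score, best_lag

def gaussian(n, center, sigma, amplitude=1.0):
    """Gaussian-shaped curve sampled on n points (assumed label design)."""
    return [amplitude * math.exp(-0.5 * ((i - center) / sigma) ** 2)
            for i in range(n)]

n = 300
label = gaussian(n, center=100, sigma=10.0)

# Same shape, but delayed by 30 samples and suppressed to 30% amplitude.
coherent = gaussian(n, center=130, sigma=10.0, amplitude=0.3)
# Same delay and amplitude, but smeared to three times the label width.
distorted = gaussian(n, center=130, sigma=30.0, amplitude=0.3)

coherent_loss, coherent_lag = shape_loss(coherent, label, max_lag=60)
distorted_loss, distorted_lag = shape_loss(distorted, label, max_lag=60)
```

Because cosine similarity ignores overall scale, this term alone would not push a suppressed peak above threshold; in an ablation it would presumably be combined with the existing pointwise loss while the model and data stay fixed.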
Circularity Check
No significant circularity; empirical validation stands independent of inputs
The paper diagnoses three interacting factors (temporal uncertainty, CNN amplitude bias, pointwise loss limitations) from training dynamics and loss landscape geometry, then formalizes a shape-then-align strategy validated empirically via conditional GAN on seismic data, reporting a 64% increase in S-phase detections. No equations, derivations, or self-citations are shown that reduce the claimed recovery or performance gain to a fitted parameter, self-referential quantity, or ansatz by construction. The central result is presented as an outcome of the proposed loss strategy on held-out records rather than a closed loop equivalent to the input diagnoses. The derivation chain remains self-contained as a methodological proposal with external empirical support.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Phase arrival labels are structured shapes rather than independent probability estimates.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "We formalize this as the shape-then-align strategy and validate it through a conditional GAN proof of concept... holistic Gaussian constraint remains robustly anchored"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.