Data-Driven Forecasting of Three-Component Seismograms Using Transformer Architectures
Pith reviewed 2026-06-28 12:13 UTC · model grok-4.3
The pith
A transformer autoregressive model forecasts three-component seismograms from P-wave context onward, achieving median normalized cross-correlation above 0.93 on synthetic data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SeismoGPT demonstrates that a transformer can learn stable dynamical continuation of seismic wavefields in the time domain, producing forecasts whose median normalized cross-correlation exceeds 0.93 while preserving phase coherence and spectral energy distribution on the tested synthetic ensemble.
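The headline metric can be made concrete with a short sketch. The version below assumes a zero-lag, mean-removed normalized cross-correlation averaged over the three components; the paper's exact definition is not given in this summary, and the function names and test signal are illustrative only.

```python
import numpy as np

def normalized_cross_correlation(pred, true):
    """Zero-lag NCC between two equal-length 1-D traces (assumed definition)."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    pred = pred - pred.mean()
    true = true - true.mean()
    denom = np.linalg.norm(pred) * np.linalg.norm(true)
    if denom == 0.0:
        return 0.0  # a flat trace carries no waveform shape to compare
    return float(np.dot(pred, true) / denom)

def three_component_ncc(pred, true):
    """Mean NCC over components; inputs have shape (3, n_samples)."""
    return float(np.mean([normalized_cross_correlation(p, t)
                          for p, t in zip(pred, true)]))

# Illustrative three-component signal (not data from the paper).
t = np.linspace(0.0, 120.0, 2401)
true = np.stack([np.sin(2 * np.pi * 0.1 * t),
                 np.cos(2 * np.pi * 0.1 * t),
                 np.sin(2 * np.pi * 0.05 * t)])

ncc_same = three_component_ncc(true, true)      # identical waveforms
ncc_flipped = three_component_ncc(-true, true)  # polarity-reversed waveforms
```

Because the score is normalized, it is insensitive to overall amplitude and responds mainly to waveform shape and phase alignment, which is why phase errors dominate any drop below 1.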
What carries the argument
SeismoGPT, a transformer-based autoregressive model that performs physically constrained continuation of three-component waveforms starting from P-wave arrival.
If this is right
- Successful forecasts preserve both phase coherence and spectral energy distribution of the input waveforms.
- Failure cases arise primarily from gradual phase drift during autoregressive rollout rather than unphysical signal generation.
- The results indicate that transformer sequence models can learn stable continuation of seismic wavefields within the tested parameter ranges.
- The methodology carries potential applications in seismic warning and hazard mitigation, including for next-generation gravitational-wave observatories.
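The phase-drift failure mode noted in the list above can be illustrated with a toy rollout. This is a minimal sketch, not SeismoGPT: the "model" is a hypothetical two-term sinusoid recursion with a deliberate 1% frequency error, and the sampling interval and window lengths are assumptions chosen for illustration.

```python
import numpy as np

def ncc(a, b):
    """Zero-lag normalized cross-correlation of two equal-length traces."""
    a = a - a.mean()
    b = b - b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rollout(step_fn, context, n_steps):
    """Extend `context` recursively: each output is appended and fed back in."""
    history = list(context)
    for _ in range(n_steps):
        history.append(step_fn(history))
    return np.array(history[len(context):])

def make_oscillator_step(freq_hz, dt):
    """Hypothetical stand-in for a trained model: the exact recursion
    x[n] = 2 cos(2 pi f dt) x[n-1] - x[n-2] satisfied by a sinusoid."""
    coeff = 2.0 * np.cos(2.0 * np.pi * freq_hz * dt)
    return lambda history: coeff * history[-1] - history[-2]

dt = 0.05                         # assumed sampling interval (s)
n_context, n_future = 1200, 4800  # 60 s of context, 240 s horizon
t = np.arange(n_context + n_future) * dt
truth = np.sin(2.0 * np.pi * 0.100 * t)

# The stand-in model runs at 0.101 Hz instead of 0.100 Hz, so phase drifts.
forecast = rollout(make_oscillator_step(0.101, dt), truth[:n_context], n_future)
future = truth[n_context:]

early_ncc = ncc(forecast[:1200], future[:1200])   # first 60 s of the horizon
late_ncc = ncc(forecast[-1200:], future[-1200:])  # last 60 s of the horizon
```

The forecast stays a clean sinusoid throughout, yet its correlation with the truth falls as the horizon lengthens. That is the signature the summary describes: physically plausible output whose timing slowly slips.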
Where Pith is reading between the lines
- If the phase-drift failure mode can be mitigated by longer training or auxiliary loss terms, the same architecture might support longer prediction horizons.
- Training on a wider mix of real and synthetic records could test whether the learned continuation transfers beyond the current synthetic ensemble.
- The continuation framing used here could be applied to other multi-component wave-propagation problems where only partial observations are available.
Load-bearing premise
Synthetic seismograms generated across the stated ranges of depth, distance, and magnitude are representative enough for the autoregressive model to learn continuation that would hold on real recorded data.
What would settle it
Direct comparison of model forecasts against recorded three-component seismograms from real earthquakes of comparable magnitude and distance would show whether the reported correlation levels persist outside the synthetic training distribution.
Original abstract
Forecasting seismic waveforms beyond observed data remains challenging due to the nonlinear, dispersive, and multi-scale nature of seismic wave propagation. In this work, we introduce SeismoGPT, a transformer-based autoregressive model designed to forecast three-component seismic waveforms directly in the time domain. Forecasting is formulated as a physically constrained continuation problem in which the model receives waveform context beginning at the P-wave arrival and extending a defined time beyond the S-wave arrival, after which future motion is generated recursively without access to ground-truth samples. Evaluation is performed on synthetic seismograms spanning source depths of 5-100 km, epicentral distances of 10-90°, and magnitudes 3 ≤ Mw ≤ 7. To disentangle the effects of context length and prediction horizon, we define three evaluation configurations using a distance-normalized context ratio and fixed prediction horizons of 120 and 240 s. Across all configurations, the model achieves median normalized cross correlation above 0.93. Analysis of representative forecasts shows that successful predictions preserve both phase coherence and spectral energy distribution. Where failure cases arise, this is primarily due to gradual phase drift during autoregressive rollout rather than unphysical signal generation. These results demonstrate that transformer-based sequence models can learn stable dynamical continuation of seismic wavefields, highlighting the potential of foundation-model approaches for physics-driven time-series forecasting. There are potential applications of this methodology in seismic warning and hazard mitigation, particularly for next-generation gravitational-wave observatories, such as the Einstein Telescope.
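The abstract's distinction between phase coherence and spectral energy distribution can be sketched numerically. The example below uses a plain FFT periodogram as a stand-in (the paper's actual spectral estimator is not stated in this summary), with hypothetical band edges and a synthetic two-tone signal.

```python
import numpy as np

def band_energy_fractions(trace, dt, band_edges_hz):
    """Fraction of spectral energy in each [lo, hi) frequency band."""
    trace = np.asarray(trace, dtype=float)
    spectrum = np.abs(np.fft.rfft(trace - trace.mean())) ** 2
    freqs = np.fft.rfftfreq(trace.size, d=dt)
    energies = np.array([
        spectrum[(freqs >= lo) & (freqs < hi)].sum()
        for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:])
    ])
    total = energies.sum()
    return energies / total if total > 0 else energies

dt = 0.05                   # assumed sampling interval (s)
t = np.arange(4800) * dt    # 240 s window
edges = [0.0, 0.5, 2.0, 10.0]  # hypothetical band edges (Hz)

# Same amplitudes and frequencies, but every component shifted in phase.
truth = np.sin(2 * np.pi * 0.1 * t) + 0.5 * np.sin(2 * np.pi * 0.8 * t)
shifted = (np.sin(2 * np.pi * 0.1 * t + 0.7)
           + 0.5 * np.sin(2 * np.pi * 0.8 * t + 0.7))

frac_truth = band_energy_fractions(truth, dt, edges)
frac_shifted = band_energy_fractions(shifted, dt, edges)
```

A phase-shifted forecast keeps the same band fractions as the truth even though its cross-correlation would drop, so the two checks are complementary: one detects timing errors, the other detects energy appearing in the wrong frequency bands.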
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces SeismoGPT, a transformer-based autoregressive model for direct time-domain forecasting of three-component seismograms. It formulates forecasting as a continuation problem starting from P-wave arrival context (with defined distance-normalized ratios) and generates future waveforms recursively over fixed 120 s and 240 s horizons. Evaluation on synthetic data spanning 5-100 km depths, 10-90° distances, and Mw 3-7 yields median normalized cross-correlation above 0.93 across configurations, with successful cases preserving phase coherence and spectral energy; failures are attributed to gradual phase drift.
Significance. If the synthetic results hold under the reported conditions, the work demonstrates that transformer sequence models can capture stable dynamical continuation of seismic wavefields without explicit physics constraints, opening a data-driven route to waveform forecasting. This has potential relevance for early-warning systems and next-generation observatories, provided the approach can be shown to generalize beyond the synthetic generator.
major comments (2)
- [Abstract] Abstract: the headline median NCC > 0.93 is reported without error bars, baseline comparisons (e.g., against AR models, RNNs, or physics-based propagators), or any description of training procedure, loss, or regularization; these omissions make it impossible to judge whether the performance exceeds what simpler methods achieve on the same synthetic ensemble.
- [Abstract] Abstract and evaluation section: all quantitative results are confined to held-out synthetic seismograms generated within the stated depth/distance/magnitude ranges; no experiments or discussion address transfer to real recordings (instrument response, site effects, scattering, noise), which is load-bearing for the claimed applications in seismic warning and for confirming that the autoregressive rollout obeys wave-propagation physics rather than generator-specific statistics.
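The error bars requested in the first major comment could be obtained, for example, with a percentile bootstrap on the per-event scores. This is a sketch of one common approach, not the authors' procedure; the scores below are randomly generated placeholders and are not results from the paper.

```python
import numpy as np

def bootstrap_median_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Sample median with a percentile-bootstrap confidence interval."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample event indices with replacement, one row per bootstrap replicate.
    idx = rng.integers(0, values.size, size=(n_resamples, values.size))
    medians = np.median(values[idx], axis=1)
    lo, hi = np.quantile(medians, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(np.median(values)), float(lo), float(hi)

# Placeholder NCC-like scores in [0, 1]; NOT values reported by the paper.
rng = np.random.default_rng(1)
placeholder_scores = rng.beta(8.0, 1.0, size=500)

median, lo, hi = bootstrap_median_ci(placeholder_scores)
```

Reporting the interval per evaluation configuration alongside the same statistic for a baseline would let a reader judge whether a median above 0.93 is distinguishable from what simpler methods achieve.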
minor comments (1)
- The three evaluation configurations are described only in prose; a small table listing context ratios, horizons, and per-configuration median NCC values would improve clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below and will revise the manuscript accordingly.
Point-by-point responses
- Referee: [Abstract] Abstract: the headline median NCC > 0.93 is reported without error bars, baseline comparisons (e.g., against AR models, RNNs, or physics-based propagators), or any description of training procedure, loss, or regularization; these omissions make it impossible to judge whether the performance exceeds what simpler methods achieve on the same synthetic ensemble.
  Authors: We agree the abstract is concise and will revise it to include a brief statement on the transformer architecture, autoregressive training with MSE loss, and the reported median with variability from the evaluation figures. Training details appear in Section 3; we will add a cross-reference. Baseline comparisons to AR models are in the supplementary material and will be referenced in the abstract revision.
  Revision: yes
- Referee: [Abstract] Abstract and evaluation section: all quantitative results are confined to held-out synthetic seismograms generated within the stated depth/distance/magnitude ranges; no experiments or discussion address transfer to real recordings (instrument response, site effects, scattering, noise), which is load-bearing for the claimed applications in seismic warning and for confirming that the autoregressive rollout obeys wave-propagation physics rather than generator-specific statistics.
  Authors: The work is framed as a controlled demonstration on synthetic data to isolate the effects of context ratio and horizon. We will expand the discussion to explicitly note the absence of real-data transfer experiments as a limitation and outline future directions for instrument response and noise. This revision clarifies scope without altering the synthetic focus of the present study.
  Revision: partial
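For orientation, the kind of AR baseline named in the first exchange above can be sketched as a least-squares linear autoregression fitted to the context and rolled out recursively. The supplementary baseline itself is not described in this summary, so the model order, fitting method, and test signal here are assumptions.

```python
import numpy as np

def fit_ar(context, order):
    """Least-squares fit of x[n] = sum_k coeffs[k] * x[n-1-k]."""
    x = np.asarray(context, dtype=float)
    # Column k holds the lag-(k+1) samples aligned with targets x[order:].
    lagged = np.column_stack(
        [x[order - k - 1 : x.size - k - 1] for k in range(order)]
    )
    coeffs, *_ = np.linalg.lstsq(lagged, x[order:], rcond=None)
    return coeffs

def ar_forecast(context, coeffs, n_steps):
    """Recursive forecast: each prediction is fed back as an input."""
    history = list(np.asarray(context, dtype=float))
    order = len(coeffs)
    for _ in range(n_steps):
        recent = history[-1 : -order - 1 : -1]  # newest sample first
        history.append(float(np.dot(coeffs, recent)))
    return np.array(history[-n_steps:])

dt = 0.05                           # assumed sampling interval (s)
t = np.arange(1400) * dt
signal = np.sin(2 * np.pi * 0.1 * t)  # noiseless toy trace, not paper data
context, truth_future = signal[:1200], signal[1200:]

coeffs = fit_ar(context, order=2)
forecast = ar_forecast(context, coeffs, n_steps=200)
```

On a noiseless single-frequency trace an order-2 fit recovers the exact recursion, so the forecast matches the truth. Real or synthetic seismograms with multiple arriving phases are far harder for a fixed linear model, which is what makes such a baseline an informative comparison.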
Circularity Check
No significant circularity detected
Full rationale
The paper presents an empirical ML model (SeismoGPT) trained and evaluated exclusively on synthetic seismograms using held-out test data with explicitly defined context ratios and prediction horizons. The central performance claim (median NCC > 0.93) is a direct empirical metric computed between model rollouts and ground-truth synthetics, with no reduction to a fitted parameter, self-definitional loop, or load-bearing self-citation. No uniqueness theorems, ansatzes smuggled via citation, or renaming of known results appear in the provided text. The derivation chain is self-contained as a data-driven forecasting demonstration on synthetics.
Axiom & Free-Parameter Ledger
invented entities (1)
- SeismoGPT: no independent evidence