Earthquake magnitudes depend on seismic history, as revealed by a neural network analysis

Neri Berman; Oleg Zlydenko; Oren Gilon; Yohai Bar-Sinai; Yossi Matias

arxiv: 2408.02129 · v2 · submitted 2024-08-04 · ⚛️ physics.geo-ph

Pith reviewed 2026-05-23 22:14 UTC · model grok-4.3

classification ⚛️ physics.geo-ph

keywords: earthquake magnitudes · seismic history · neural network forecasting · Gutenberg-Richter distribution · information gain · catalog analysis

The pith

Earthquake magnitudes carry information from past seismic activity, allowing better predictions than the standard Gutenberg-Richter model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that hypocenter catalogs contain extractable signals about the sizes of future earthquakes. A neural network called MAGNET processes locations, times, and past magnitudes to produce probabilistic forecasts that gain about 0.07 bits of information per event over the Gutenberg-Richter benchmark. This gain holds after controls for detection limits in catalogs from Southern California, Japan, and New Zealand. The finding directly challenges the assumption that magnitudes are independent of seismic history and can be treated as drawn from a fixed distribution. If the result stands, forecasting systems could move beyond treating magnitude as separable from the timing and location of events.

Core claim

MAGNET, a multi-encoder neural network with LSTM units, ingests spatiotemporal patterns from seismic catalogs and outputs magnitude distributions that outperform the time-independent Gutenberg-Richter model by an average of 0.07 bits per earthquake. The advantage persists across three regional catalogs after explicit controls for detection artifacts. These outcomes establish that standard hypocenter data carry measurable information about future magnitudes, contradicting the separability assumption that underpins most operational earthquake forecasts.
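To give the 0.07-bit figure some scale, here is a minimal Python sketch of a per-event information gain. It assumes the metric is the mean log2 likelihood ratio between a model density and the GR density at each observed magnitude; the paper's exact formula is not reproduced in this summary. The function names, the b-values, and the completeness magnitude `m_c` are illustrative choices, not values from the paper.

```python
import numpy as np

def gr_log_pdf(m, b, m_c):
    """Natural-log density of the GR (exponential) magnitude law above m_c."""
    beta = b * np.log(10.0)
    return np.log(beta) - beta * (m - m_c)

def information_gain_bits(model_log_pdf, baseline_log_pdf):
    """Mean log2 likelihood ratio per event (inputs are natural-log densities)."""
    return float(np.mean(model_log_pdf - baseline_log_pdf) / np.log(2.0))

# Hypothetical catalog: magnitudes drawn from a GR law with b = 1 above m_c = 2.
rng = np.random.default_rng(0)
b_true, m_c = 1.0, 2.0
mags = m_c + rng.exponential(1.0 / (b_true * np.log(10.0)), size=50_000)

baseline = gr_log_pdf(mags, b=1.0, m_c=m_c)   # correct GR benchmark
wrong_b = gr_log_pdf(mags, b=1.2, m_c=m_c)    # GR with a mis-specified b-value

gain_same = information_gain_bits(baseline, baseline)   # identical models
gain_wrong = information_gain_bits(wrong_b, baseline)   # worse than benchmark
print(gain_same, gain_wrong)
```

In this toy setting, mis-specifying b by 20% costs a few hundredths of a bit per event, which is the same order of magnitude as the gain reported for MAGNET.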

What carries the argument

MAGNET, a multi-encoder neural network with LSTM units that ingests hypocenter locations, occurrence times, and past magnitudes to produce history-dependent magnitude probability distributions.

If this is right

Magnitude forecasts can be improved by conditioning on the preceding sequence of events rather than treating magnitudes as independent draws.
The separability assumption between occurrence times, locations, and magnitudes does not hold in the examined catalogs.
Seismic hazard models can incorporate magnitude predictions that vary with recent seismic history.
The information gain remains detectable after standard controls for catalog artifacts in multiple independent regions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Point-process models of seismicity could be extended to include explicit magnitude-history coupling terms.
The same neural architecture might be tested on other catalogs or on laboratory earthquake data to check generality.
If the dependence is physical, it would constrain possible mechanisms that link stress history to rupture size.

Load-bearing premise

The measured information gain arises from real physical dependence on seismic history rather than from residual catalog incompleteness, detection thresholds, or model overfitting.

What would settle it

Applying the same model to a complete synthetic catalog generated strictly under the time-independent Gutenberg-Richter assumption should yield zero information gain. A gain comparable to 0.07 bits on such a catalog would show that the method produces spurious signal and would undermine the central claim.
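The logic of this null experiment can be sketched on a toy scale. The Python example below draws magnitudes independently from a GR law and scores a deliberately simple history-conditioned predictor (a separate b-value depending on whether the previous event was above or below the median magnitude) against a single global GR fit on held-out events. The predictor is a stand-in chosen for brevity, not MAGNET, and every parameter value is a hypothetical choice.

```python
import numpy as np

def fit_b(mags, m_c):
    """Standard maximum-likelihood b-value for continuous magnitudes above m_c."""
    return 1.0 / (np.log(10.0) * (np.mean(mags) - m_c))

def gr_log2_pdf(m, b, m_c):
    """Log2 density of the GR (exponential) magnitude law above m_c."""
    beta = b * np.log(10.0)
    return (np.log(beta) - beta * (m - m_c)) / np.log(2.0)

def null_gain(n_events=200_000, b=1.0, m_c=2.0, seed=0):
    """Held-out gain (bits/event) of a history-conditioned model on i.i.d. GR data."""
    rng = np.random.default_rng(seed)
    mags = m_c + rng.exponential(1.0 / (b * np.log(10.0)), size=n_events)
    prev, cur = mags[:-1], mags[1:]

    # First half of the sequence is used for fitting, second half for scoring.
    split = len(cur) // 2
    tr_prev, tr_cur = prev[:split], cur[:split]
    te_prev, te_cur = prev[split:], cur[split:]

    # Benchmark: one b-value. Conditional model: b depends on the previous magnitude.
    b_global = fit_b(tr_cur, m_c)
    thresh = np.median(tr_prev)
    b_lo = fit_b(tr_cur[tr_prev <= thresh], m_c)
    b_hi = fit_b(tr_cur[tr_prev > thresh], m_c)

    b_cond = np.where(te_prev > thresh, b_hi, b_lo)
    gain = gr_log2_pdf(te_cur, b_cond, m_c) - gr_log2_pdf(te_cur, b_global, m_c)
    return float(np.mean(gain))

print(null_gain())
```

On history-independent data the measured gain should be statistically indistinguishable from zero. A clearly positive value would point to a flaw in the evaluation pipeline rather than to physics.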

Figures

Figures reproduced from arXiv: 2408.02129 by Neri Berman, Oleg Zlydenko, Oren Gilon, Yohai Bar-Sinai, Yossi Matias.

**Figure 2.** Panel a shows the predicted PDFs for 100 randomly sampled events from the Southern California test set. The PDFs exhibit a clear trend: those that were calculated for higher-magnitude events are skewed towards larger magnitudes (warmer colors) compared to lower-magnitude events (cooler colors). This aligns with the expected behavior for an earthquake magnitude predictor. For comparison, the stationary GR distr…

Original abstract

Earthquake occurrence is notoriously difficult to predict. While some aspects of their spatiotemporal statistics can be relatively well captured by point-process models, very little is known regarding the magnitude of future events, and it is deeply debated whether it is possible to predict the magnitude of an earthquake before it starts. Most operational forecasting models assume that earthquake magnitudes follow a time-independent Gutenberg-Richter (GR) distribution, effectively treating magnitudes as independent of seismic history. We address this fundamental question by demonstrating that standard hypocenter catalogs carry information about future earthquake magnitudes, making them more predictable than previously considered. We present MAGNET (MAGnitude Neural EsTimation model), which uses a multi-encoder neural network architecture with LSTM units to process spatiotemporal patterns in seismic history. By analyzing hypocenter locations, occurrence times, and magnitudes of past events, MAGNET generates probabilistic magnitude forecasts that demonstrate information gains in predicting magnitudes of future events over GR-based models, after controlling for detection artifacts. Our model achieves an information gain of approximately 0.07 bit per earthquake on average over the GR benchmark in Southern California, Japan, and New Zealand catalogs, with this advantage persisting. These results demonstrate that hypocentral earthquake catalogs contain extractable information about future magnitudes, challenging the conventional separability assumption in earthquake forecasting and offering new approaches for seismic hazard assessment.
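For reference, the two quantities the abstract compares can be written compactly. The first line is the standard Gutenberg-Richter law and the magnitude density it implies above a completeness magnitude $m_c$. The second line is the usual mean log-likelihood-ratio form of an information gain and is an assumption about how the reported metric is defined; the symbols $m_c$, $\mathcal{H}_i$ (the seismic history before event $i$), and $p_{\mathrm{model}}$ are notation introduced here, not taken from the paper.

```latex
\log_{10} N(\geq m) = a - b\,m
\quad\Longrightarrow\quad
p_{\mathrm{GR}}(m) = \beta\, e^{-\beta (m - m_c)}, \qquad m \geq m_c,\quad \beta = b \ln 10

\mathrm{IG} = \frac{1}{N} \sum_{i=1}^{N}
  \log_2 \frac{p_{\mathrm{model}}(m_i \mid \mathcal{H}_i)}{p_{\mathrm{GR}}(m_i)}
  \quad \text{(bits per earthquake)}
```

Under this definition, a model that ignores history and reproduces $p_{\mathrm{GR}}$ exactly has $\mathrm{IG} = 0$.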

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The LSTM model reports a 0.07-bit gain over GR on three catalogs, but the result hinges on whether controls fully remove catalog artifacts.

The paper's core finding is that a multi-encoder LSTM trained on hypocenter times, locations, and past magnitudes extracts about 0.07 bits of extra information per event when forecasting magnitudes, compared with the standard Gutenberg-Richter distribution. This holds on average across the Southern California, Japan, and New Zealand catalogs after the authors apply their stated controls for detection artifacts. The gain is small but consistent enough that it challenges the usual assumption that magnitudes are independent of seismic history in point-process models.

Referee Report

2 major / 1 minor

Summary. The paper introduces MAGNET, a multi-encoder LSTM neural network that ingests hypocenter times, locations, and past magnitudes to produce probabilistic forecasts of future earthquake magnitudes. It reports an average information gain of ~0.07 bits per event over the time-independent Gutenberg-Richter benchmark across Southern California, Japan, and New Zealand catalogs, after stated controls for detection artifacts, and concludes that standard catalogs contain extractable information about magnitude dependence on seismic history.

Significance. If the result is robust, the modest but consistent gain would challenge the conventional assumption that magnitudes are independent of seismic history, with direct implications for point-process forecasting and hazard assessment. Credit is due for testing three independent regional catalogs and for attempting artifact controls; however, the small effect size makes the finding sensitive to unstated methodological choices.

major comments (2)

[Methods (artifact controls)] The controls for detection artifacts (described in the abstract and the methods section on catalog processing) do not include an end-to-end synthetic null test that injects only realistic, spatially/temporally varying detection thresholds and incompleteness while keeping magnitudes strictly GR-distributed and independent of history. Without this test the 0.07-bit gain cannot be confidently attributed to physical history dependence rather than residual catalog artifacts.
[Methods (model training and evaluation)] No details are provided on training/validation splits, hyperparameter search procedure, or quantitative checks that the reported gain survives alternative controls or different random seeds. Given the small effect size and the number of free parameters in the LSTM architecture, these omissions leave the central claim vulnerable to overfitting or data-processing artifacts.

minor comments (1)

[Abstract] The abstract states the gain 'persists' but does not specify the time window or catalog subset over which persistence is measured; a brief clarification would improve readability.
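The concern in the first major comment can be illustrated with a toy simulation. In the Python sketch below, true magnitudes are independent GR draws, but the detection threshold rises temporarily after large events, so small events that follow a large one go missing from the catalog. The detection rule and all parameters (`m_big`, `alpha`, `window`) are hypothetical and much cruder than real network behavior; they are not the paper's controls.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def censor_catalog(mags, m_c, m_big=3.5, alpha=0.5, window=20):
    """Drop events below a threshold that rises after recent large events."""
    # Largest true magnitude among the `window` events preceding each event.
    recent_max = sliding_window_view(mags, window)[:-1].max(axis=1)
    threshold = m_c + alpha * np.clip(recent_max - m_big, 0.0, None)
    later = mags[window:]
    return later[later >= threshold]

def mean_shift_after_large(mags, m_big=3.5):
    """Mean magnitude after a large event minus mean magnitude after a small one."""
    prev, nxt = mags[:-1], mags[1:]
    after_large = prev >= m_big
    return float(nxt[after_large].mean() - nxt[~after_large].mean())

# History-independent GR magnitudes (b = 1) above m_c = 2.
rng = np.random.default_rng(1)
m_c, b = 2.0, 1.0
true_mags = m_c + rng.exponential(1.0 / (b * np.log(10.0)), size=200_000)
detected = censor_catalog(true_mags, m_c)

shift_complete = mean_shift_after_large(true_mags)  # near zero by construction
shift_censored = mean_shift_after_large(detected)   # positive: a pure artifact
print(shift_complete, shift_censored)
```

The censored catalog shows an apparent dependence of magnitude on history even though none exists in the underlying process, which is why a null test that includes realistic detection effects matters for a 0.07-bit effect.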

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive review and the opportunity to improve the manuscript. We address each major comment below and will revise the paper to incorporate the requested methodological details and tests.

Referee: [Methods (artifact controls)] The controls for detection artifacts (described in the abstract and the methods section on catalog processing) do not include an end-to-end synthetic null test that injects only realistic, spatially/temporally varying detection thresholds and incompleteness while keeping magnitudes strictly GR-distributed and independent of history. Without this test the 0.07-bit gain cannot be confidently attributed to physical history dependence rather than residual catalog artifacts.

Authors: We agree that an explicit end-to-end synthetic null test would provide stronger evidence that the reported gain is not an artifact of catalog incompleteness. Our existing controls address magnitude-of-completeness variations and temporal detection changes, but we will add a dedicated synthetic experiment in the revised Methods section. Synthetic catalogs will be generated with strictly history-independent GR magnitudes, realistic spatially and temporally varying detection thresholds derived from the real data, and then processed identically to the observed catalogs to confirm that MAGNET yields no spurious information gain under the null. revision: yes

Referee: [Methods (model training and evaluation)] No details are provided on training/validation splits, hyperparameter search procedure, or quantitative checks that the reported gain survives alternative controls or different random seeds. Given the small effect size and the number of free parameters in the LSTM architecture, these omissions leave the central claim vulnerable to overfitting or data-processing artifacts.

Authors: We will add a new subsection in Methods that fully specifies the temporal training/validation/test splits (chosen to avoid forward leakage), the hyperparameter search procedure and selection criteria, and quantitative robustness results across multiple random seeds and alternative preprocessing pipelines. These additions will demonstrate that the average 0.07-bit gain remains stable under the reported controls. revision: yes
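A temporal split of the kind mentioned in this response can be sketched as follows. The 70/15/15 fractions and the function name are hypothetical, since the source gives no split details; the only property the sketch enforces is that every training event precedes every validation event, which in turn precedes every test event.

```python
import numpy as np

def temporal_split(times, train_frac=0.7, val_frac=0.15):
    """Return index arrays (train, val, test) ordered so that train < val < test in time."""
    order = np.argsort(times, kind="stable")
    n = len(order)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])

# Hypothetical, unsorted event times (arbitrary units).
rng = np.random.default_rng(2)
times = rng.uniform(0.0, 1000.0, size=1_000)
train_idx, val_idx, test_idx = temporal_split(times)

print(times[train_idx].max(), times[val_idx].min(), times[test_idx].min())
```

Splitting by time rather than at random keeps aftershocks of a training-set mainshock from leaking into the evaluation set, which would otherwise inflate any history-based gain.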

Circularity Check

0 steps flagged

No circularity; empirical out-of-sample comparison to fixed baseline

The paper trains a neural network (MAGNET) on hypocenter catalogs to produce probabilistic magnitude forecasts and reports an information gain versus the fixed Gutenberg-Richter benchmark on held-out events. This is a standard supervised-learning evaluation against an independent null model; the reported gain is not obtained by fitting a parameter to the test set and relabeling it a prediction, nor does any step reduce to a self-definition or self-citation chain. No equations or load-bearing claims in the provided text exhibit the enumerated circular patterns.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the assumption that the neural network extracts genuine history dependence rather than catalog artifacts, plus standard supervised learning assumptions about i.i.d. train/test splits and the GR distribution as the correct null model.

free parameters (1)

LSTM hidden sizes and learning rate
Architecture hyperparameters chosen during training; not reported in abstract.

axioms (1)

domain assumption: Earthquake catalogs, after detection-artifact correction, are sufficiently complete for magnitude forecasting.
Invoked when claiming the gain persists after controls.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "Most operational forecasting models assume that earthquake magnitudes follow a time-independent Gutenberg-Richter (GR) distribution ... separability assumption"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] [1]

background

The conditioned cumulative information gain (CIG) is then recalculated and presented as the yellow dashed curved in Fig. 4 for all three tested regions. Indeed, factoring out the temporal incompleteness reduces the information gain, though it is still significant: evidently, the conditioned CIG curves show a steadily increasing trend. The horizontal dashe...

work page 2016

[2] [2]

Bernard, Earthquake precursors and crustal ’tran- sients’, Nature , 1 (1999)

P. Bernard, Earthquake precursors and crustal ’tran- sients’, Nature , 1 (1999)

work page 1999

[3] [3]

R. J. Geller, D. D. Jackson, Y. Y. Kagan, and F. Mu- largia, Earthquakes Cannot Be Predicted, Science 275, 1616 (1997)

work page 1997

[4] [4]

Gutenberg and C

B. Gutenberg and C. F. Richter, Frequency of earth- quakes in California*, Bulletin of the Seismological Soci- ety of America 34, 185 (1944)

work page 1944

[5] [5]

Y. Y. Kagan, Seismic moment distribution revisited: I. Statistical results, Geophysical Journal International 148, 520 (2002)

work page 2002

[6] [6]

Bak and C

P. Bak and C. Tang, Earthquakes as a self-organized crit- ical phenomenon, Journal of Geophysical Research: Solid Earth 94, 15635 (1989)

work page 1989

[7] [7]

Dascher-Cousineau, E

K. Dascher-Cousineau, E. E. Brodsky, T. Lay, and T. H. W. Goebel, What Controls Variations in After- shock Productivity?, Journal of Geophysical Research: 10 Solid Earth 125, e2019JB018111 (2020)

work page 2020

[8] [8]

Y. Y. Kagan, Aftershock Zone Scaling, Bulletin of the Seismological Society of America 92, 641 (2002)

work page 2002

[9] [9]

T. Utsu, Y. Ogata, R. S, and Matsu’ura, The Centenary of the Omori Formula for a Decay Law of Aftershock Activity, Journal of Physics of the Earth 43, 1 (1995)

work page 1995

[10] [10]

Omori, On After-Shocks of Earthquakes, Journal of the College of Science, Imperial University of Tokyo , 111 (1894)

F. Omori, On After-Shocks of Earthquakes, Journal of the College of Science, Imperial University of Tokyo , 111 (1894)

work page

[11] [11]

Y. Y. Kagan, Short-Term Properties of Earthquake Cat- alogs and Models of Earthquake Source, Bulletin of the Seismological Society of America 94, 1207 (2004)

work page 2004

[12] [12]

Ben-Zion and I

Y. Ben-Zion and I. Zaliapin, Localization and coalescence of seismicity before large earthquakes, Geophysical Jour- nal International 223, 561 (2020)

work page 2020

[13] [13]

P. M. R. DeVries, F. Vi´ egas, M. Wattenberg, and B. J. Meade, Deep learning of aftershock patterns following large earthquakes, Nature 560, 632 (2018)

work page 2018

[14] [14]

G. C. P. King, R. S. Stein, and J. Lin, Static stress changes and the triggering of earthquakes, Bulletin of the Seismological Society of America 84, 935 (1994)

work page 1994

[15] [15]

Ogata, Statistical Models for Earthquake Occurrences and Residual Analysis for Point Processes, Source: Jour- nal of the American Statistical Association 83, 9 (1988)

Y. Ogata, Statistical Models for Earthquake Occurrences and Residual Analysis for Point Processes, Source: Jour- nal of the American Statistical Association 83, 9 (1988)

work page 1988

[16] [16]

J. L. Hardebeck, A. L. Llenos, A. J. Michael, M. T. Page, M. Schneider, and N. J. van der Elst, Aftershock Fore- casting, Annual Review of Earth and Planetary Sciences 52, null (2024)

work page 2024

[17] [17]

T. H. Jordan, Y. T. Chen, P. Gasparini, R. Madariaga, I. Main, W. Marzocchi, G. Papadopoulos, G. Sobolev, K. Yamaoka, and J. Zschau, OPERATIONAL EARTH- QUAKE FORECASTING. State of Knowledge and Guidelines for Utilization, Annals of Geophysics 54, 319 (2011)

work page 2011

[18] M. Stirling, G. McVerry, M. Gerstenberger, N. Litchfield, R. Van Dissen, K. Berryman, P. Barnes, L. Wallace, P. Villamor, R. Langridge, G. Lamarche, S. Nodder, M. Reyners, B. Bradley, D. Rhoades, W. Smith, A. Nicol, J. Pettinga, K. Clark, and K. Jacobs, National Seismic Hazard Model for New Zealand: 2010 Update, Bulletin of the Seismological Society o...

[19] Y. Ogata, K. Katsura, H. Tsuruoka, and N. Hirata, Exploring Magnitude Forecasting of the Next Earthquake, Seismological Research Letters 89, 1298 (2018)

[20] W. L. Ellsworth and G. C. Beroza, Seismic Evidence for an Earthquake Nucleation Phase, Science 268, 851 (1995)

[21] M.-A. Meier, T. Heaton, and J. Clinton, Evidence for universal earthquake rupture initiation behavior: Universal Earthquake Rupture Initiation, Geophysical Research Letters 43, 7991 (2016)

[22] L. Gulia and S. Wiemer, Real-time discrimination of earthquake foreshocks and aftershocks, Nature 574, 193 (2019)

[23] S. Nandan, G. Ouillon, and D. Sornette, Magnitude of Earthquakes Controls the Size Distribution of Their Triggered Events, Journal of Geophysical Research: Solid Earth 124, 2762 (2019)

[24] F. P. Schoenberg, Testing Separability in Spatial-Temporal Marked Point Processes, Biometrics 60, 471 (2004), 3695775

[25] S. Stockman, D. J. Lawson, and M. J. Werner, Forecasting the 2016–2017 Central Apennines Earthquake Sequence With a Neural Point Process, Earth’s Future 11, e2023EF003777 (2023)

[26] Z. Olami, H. J. S. Feder, and K. Christensen, Self-organized criticality in a continuous, nonconservative cellular automaton modeling earthquakes, Physical Review Letters 68, 1244 (1992)

[27] A. Sornette and D. Sornette, Self-Organized Criticality and Earthquakes, Europhysics Letters 9, 197 (1989)

[28] T. W. J. de Geus and M. Wyart, Scaling theory for the statistics of slip at frictional interfaces, Physical Review E 106, 065001 (2022)

[29] G. Petrillo and J. Zhuang, Verifying the Magnitude Dependence in Earthquake Occurrence, Physical Review Letters 131, 154101 (2023)

[30] M. Taroni, Are the magnitudes of earthquakes in Southern California, with incompleteness removed, correlated?, Geophysical Journal International 236, 1596 (2024)

[31] J. Davidsen and A. Green, Are Earthquake Magnitudes Clustered?, Physical Review Letters 106, 108502 (2011)

[32] Q. Xiong, M. R. Brudzinski, D. Gossett, Q. Lin, and J. C. Hampton, Seismic magnitude clustering is prevalent in field and laboratory catalogs, Nature Communications 14, 2056 (2023)

[33] Á. Corral, Comment on “Do Earthquakes Exhibit Self-Organized Criticality?”, Physical Review Letters 95, 159801 (2005)

[34] I. Spassiani and G. Sebastiani, Exploring the relationship between the magnitudes of seismic events, Journal of Geophysical Research: Solid Earth 121, 903 (2016)

[36] E. Lippiello, L. de Arcangelis, and C. Godano, A positive answer on the existence of correlations between positive earthquake magnitude differences (2024), arXiv:2404.15706 [physics]

[37] E. Lippiello, L. de Arcangelis, and C. Godano, Influence of Time and Space Correlations on Earthquake Magnitude, Physical Review Letters 100, 038501 (2008)

[38] R. Shcherbakov, J. Zhuang, G. Zöller, and Y. Ogata, Forecasting the magnitude of the largest expected earthquake, Nature Communications 10, 4051 (2019)

[39] A. Panakkat and H. Adeli, Neural Network Models for Earthquake Magnitude Prediction Using Multiple Seismicity Indicators, International Journal of Neural Systems 17, 13 (2007)

[40] S. Hochreiter and J. Schmidhuber, Long Short-Term Memory, Neural Computation 9, 1735 (1997)

[41] P. Kumaraswamy, A generalized probability density function for double-bounded random processes, Journal of Hydrology 46, 79 (1980)

[42] E. Hauksson, W. Yang, and P. M. Shearer, Waveform Relocated Earthquake Catalog for Southern California (1981 to June 2011), Bulletin of the Seismological Society of America 102, 2239 (2012)

[43] GNS, GeoNet Aotearoa New Zealand Earthquake Catalogue (1970)

[44] Japan Meteorological Agency website, https://www.data.jma.go.jp/svd/eqev/data/bulletin/hypo_e.html

[45] K. P. Murphy, Machine Learning: A Probabilistic Perspective (2012)

[46] S. Büttcher, C. L. A. Clarke, and G. V. Cormack, Information Retrieval: Implementing and Evaluating Search Engines (MIT Press, 2010)

[47] S. Wiemer and M. Wyss, Minimum Magnitude of Completeness in Earthquake Catalogs: Examples from Alaska, the Western United States, and Japan, Bulletin of the Seismological Society of America 90, 859 (2000)

[48] D. Amitrano, Brittle-ductile transition and associated seismicity: Experimental and numerical studies and relationship with the b value, Journal of Geophysical Research: Solid Earth 108, 10.1029/2001JB000680 (2003)

[49] C. H. Scholz, On the stress dependence of the earthquake b value, Geophysical Research Letters 42, 1399 (2015)

[50] M. Herrmann, E. Piegari, and W. Marzocchi, Revealing the spatiotemporal complexity of the magnitude distribution and b-value during an earthquake sequence, Nature Communications 13, 5087 (2022)

[51] M. Taroni and M. M. C. Carafa, Earthquake size distributions are slightly different in compression vs extension, Communications Earth & Environment 4, 1 (2023)

[52] M. Taroni, J. Zhuang, and W. Marzocchi, High-Definition Mapping of the Gutenberg–Richter b-Value and Its Relevance: A Case Study in Italy, Seismological Research Letters 92, 3778 (2021)

[53] P. Sturmfels, S. Lundberg, and S.-I. Lee, Visualizing the Impact of Feature Attribution Baselines, Distill 5, e22 (2020)

[54] Y. Zhang, P. Tiňo, A. Leonardis, and K. Tang, A Survey on Neural Network Interpretability, IEEE Transactions on Emerging Topics in Computational Intelligence 5, 726 (2021)

[55] Z. Liu and F. Xu, Interpretable neural networks: Principles and applications, Frontiers in Artificial Intelligence 6, 10.3389/frai.2023.974295 (2023)

[56] S. M. Mousavi and G. C. Beroza, Deep-learning seismology, Science 377, eabm4470 (2022)

[57] S. M. Mousavi and G. C. Beroza, Machine Learning in Earthquake Seismology, Annual Review of Earth and Planetary Sciences 51, 105 (2023)

[58] A. Mignan and M. Broccardo, Neural Network Applications in Earthquake Prediction (1994–2019): Meta-Analytic and Statistical Insights on Their Limitations, Seismological Research Letters 91, 2330 (2020)

[59] S. Karimpouli, D. Caus, H. Grover, P. Martínez-Garzón, M. Bohnhoff, G. C. Beroza, G. Dresen, T. Goebel, T. Weigel, and G. Kwiatek, Explainable machine learning for labquake prediction using catalog-driven features, Earth and Planetary Science Letters 622, 118383 (2023)

[60] K. J. Bergen, P. A. Johnson, M. V. de Hoop, and G. C. Beroza, Machine learning for data-driven discovery in solid Earth geoscience, Science 363, eaau0323 (2019)

[61] Y. Ogata, Statistics of Earthquake Activity: Models and Methods for Earthquake Predictability Studies, Annual Review of Earth and Planetary Sciences 45, 497 (2017)

[62] K. Dascher-Cousineau, O. Shchur, E. E. Brodsky, and S. Günnemann, Using Deep Learning for Flexible and Scalable Earthquake Forecasting, Geophysical Research Letters 50, e2023GL103909 (2023)

[63] O. Zlydenko, G. Elidan, A. Hassidim, D. Kukliansky, Y. Matias, B. Meade, A. Molchanov, S. Nevo, and Y. Bar-Sinai, A neural encoder for earthquake rate forecasting, Scientific Reports 13, 12350 (2023)

[64] D. W. Scott, Multivariate Density Estimation, 2nd ed. (Wiley, 2015)

[65] P. Virtanen, R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, E. Burovski, P. Peterson, W. Weckesser, J. Bright, S. J. van der Walt, M. Brett, J. Wilson, K. J. Millman, N. Mayorov, A. R. J. Nelson, E. Jones, R. Kern, E. Larson, C. J. Carey, İ. Polat, Y. Feng, E. W. Moore, J. VanderPlas, D. Laxalde, J. Perktold, R. Cimrman, I. Henr...