A paradigm for developing earthquake probability forecasts based on geoelectric data

Chien-Chih Chen; Didier Sornette; Guy Ouillon; Hong-Jia Chen

arxiv: 1907.05623 · v1 · pith:ZT4X4VCWnew · submitted 2019-07-12 · ⚛️ physics.geo-ph

Hong-Jia Chen, Chien-Chih Chen, Guy Ouillon, Didier Sornette

Pith reviewed 2026-05-24 22:22 UTC · model grok-4.3

classification ⚛️ physics.geo-ph

keywords geoelectric signals · earthquake prediction · seismoelectric relationship · alarm-based model · binary classification · machine learning · probabilistic forecasts · precursory signals

The pith

Geoelectric signals carry statistically significant information about impending large earthquakes that machine learning can convert into probability forecasts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an improved algorithm using alarm-based models and binary classification on geoelectric data from multiple stations to detect precursory signals before large earthquakes. It removes a time parameter for coarse-graining and identifies optimal frequency bands with the highest signal-to-noise ratio. Significance tests provide evidence for an underlying seismoelectric relationship. This relationship can be extracted using machine learning to generate probabilistic forecasts, moving toward operational earthquake prediction.

Core claim

The authors show that geoelectric signals exhibit a seismoelectric relationship with future earthquakes, demonstrated through significance tests on an alarm-based model and binary classification applied to joint station data, allowing machine learning to quantify probabilistic forecasts of large earthquakes.

What carries the argument

An improved alarm-based model combined with binary classification on multi-station geoelectric data, optimized over frequency bands.
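As a rough illustration of what scoring an alarm-based model as a binary classifier involves, the sketch below compares a per-time-bin alarm series against an earthquake-occurrence series and derives hit and false-alarm rates. The function names (`confusion_counts`, `rates`), the binning, and the toy data are assumptions made for illustration; the paper's actual alarm rule and scoring are not specified in the text above.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(alarms: Sequence[bool], events: Sequence[bool]) -> Dict[str, int]:
    """Count hits, misses, false alarms and correct negatives per time bin."""
    if len(alarms) != len(events):
        raise ValueError("alarms and events must have the same length")
    pairs = list(zip(alarms, events))
    return {
        "tp": sum(1 for a, e in pairs if a and e),          # alarm on, event occurred
        "fn": sum(1 for a, e in pairs if not a and e),      # alarm off, event occurred
        "fp": sum(1 for a, e in pairs if a and not e),      # alarm on, no event
        "tn": sum(1 for a, e in pairs if not a and not e),  # alarm off, no event
    }


def rates(c: Dict[str, int]) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate); 0.0 when undefined."""
    n_pos = c["tp"] + c["fn"]
    n_neg = c["fp"] + c["tn"]
    tpr = c["tp"] / n_pos if n_pos else 0.0
    fpr = c["fp"] / n_neg if n_neg else 0.0
    return tpr, fpr


# Hypothetical toy data: eight time bins, not taken from the paper.
alarms = [True, True, False, False, True, False, False, False]
events = [True, False, False, False, True, True, False, False]

counts = confusion_counts(alarms, events)
tpr, fpr = rates(counts)
print(counts, round(tpr, 3), round(fpr, 3))
```

A joint-stations variant would presumably combine alarms from several stations into one series before scoring, but how the paper combines them is not stated here.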

If this is right

Removing the time parameter for coarse-graining improves the model's applicability.
The joint-stations method extends the single-station approach to improve detection.
Optimal frequency bands maximize the signal-to-noise ratio for earthquake-related signals.
Significance tests confirm the predictive potential of the geoelectric signals.
Machine learning can be used to extract the relationship for probabilistic forecasts.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could be tested on datasets from different seismic regions to assess generalizability.
Integration with other geophysical precursors might enhance forecast accuracy.
If the relationship holds, it suggests physical mechanisms linking electric signals to stress changes before quakes.

Load-bearing premise

The geoelectric signals identified by the model genuinely precede and predict large earthquakes rather than arising from unrelated noise or biases in data selection.

What would settle it

A replication study on independent geoelectric and earthquake data that applies the same significance tests but finds no statistical link after correcting for multiple comparisons would falsify the claim.

Original abstract

We examine the precursory behavior of geoelectric signals before large earthquakes by means of an algorithm including an alarm-based model and binary classification. This algorithm, introduced originally by Chen and Chen [Nat. Hazards., 84, 2016], is improved by removing a time parameter for coarse-graining of earthquake occurrences, as well as by extending the single station method into a joint stations method. We also determine the optimal frequency bands of earthquake-related geoelectric signals with the highest signal-to-noise ratio. Using significance tests, we also provide evidence of an underlying seismoelectric relationship. It is appropriate for machine learning to extract this underlying relationship, which could be used to quantify probabilistic forecasts of impending earthquakes, and to get closer to operational earthquake prediction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This is a modest engineering update to the authors' own 2016 geoelectric alarm method, but the significance tests are likely compromised by uncorrected searches over frequency bands and station combinations.

The main contribution here is a set of practical tweaks to the Chen and Chen 2016 alarm-based model. The authors drop one time parameter for coarse-graining earthquake times, move from single-station to joint-station analysis, scan for the frequency bands with the highest signal-to-noise ratio, and run significance tests to claim evidence of a seismoelectric relationship that could feed into machine-learning probability forecasts.

Referee Report

1 major / 1 minor

Summary. The manuscript presents an improved version of an alarm-based model combined with binary classification to detect precursory geoelectric signals before large earthquakes. Building on Chen and Chen (2016), the authors remove a time parameter for coarse-graining earthquake occurrences, extend the approach from single-station to joint-station analysis, identify optimal frequency bands via highest signal-to-noise ratio, and invoke significance tests to claim evidence of an underlying seismoelectric relationship. They conclude that machine learning can extract this relationship to produce probabilistic earthquake forecasts.

Significance. If the statistical evidence for a genuine seismoelectric relationship survives scrutiny, the work could supply a concrete data-driven pathway toward probabilistic forecasts in a field where operational prediction remains elusive. The joint-station extension and SNR-based band selection are practical methodological steps that could be adopted by others working with electromagnetic precursors.

major comments (1)

[Abstract] The central claim that 'significance tests... provide evidence of an underlying seismoelectric relationship' rests on tests performed after data-driven selection of optimal frequency bands (by highest SNR) and after extension to joint-station combinations. The manuscript does not state whether any multiplicity correction (Bonferroni, FDR, or permutation-based) was applied to the family of tests over bands and station groupings. Without such correction the reported significance levels are inflated and cannot securely rule out noise or post-selection artifacts, directly undermining the evidential foundation for the downstream machine-learning recommendation.
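To make the multiplicity concern concrete, here is a minimal sketch of two of the corrections named above, Bonferroni and Benjamini-Hochberg FDR, applied to a family of p-values. The p-values are hypothetical placeholders standing in for one test per band or station grouping; none come from the paper, and the function names are illustrative.

```python
from typing import List, Sequence


def bonferroni(pvalues: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Reject H0 for p <= alpha / m, controlling the family-wise error rate."""
    m = len(pvalues)
    if m == 0:
        return []
    threshold = alpha / m
    return [p <= threshold for p in pvalues]


def benjamini_hochberg(pvalues: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Step-up procedure: reject the k smallest p-values, where k is the
    largest rank with p_(k) <= (k / m) * alpha. Controls the false discovery rate."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            reject[idx] = True
    return reject


# Hypothetical p-values, one per candidate band / station grouping.
pvals = [0.001, 0.012, 0.025, 0.041, 0.6]

print("uncorrected:", [p <= 0.05 for p in pvals])
print("bonferroni: ", bonferroni(pvals))
print("bh-fdr:     ", benjamini_hochberg(pvals))
```

With these placeholder values, four tests pass an uncorrected 0.05 threshold, three survive Benjamini-Hochberg, and one survives Bonferroni, which is the kind of shrinkage the comment is asking the authors to report.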

minor comments (1)

[Abstract] The abstract invokes 'significance tests' without naming the test statistic, the exact null hypothesis, sample sizes, or p-value threshold; these details should be supplied in the methods section for reproducibility.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comment on the statistical robustness of our significance tests. We address the concern directly below and will revise the manuscript to strengthen the evidential claims.

Referee: [Abstract] The central claim that 'significance tests... provide evidence of an underlying seismoelectric relationship' rests on tests performed after data-driven selection of optimal frequency bands (by highest SNR) and after extension to joint-station combinations. The manuscript does not state whether any multiplicity correction (Bonferroni, FDR, or permutation-based) was applied to the family of tests over bands and station groupings. Without such correction the reported significance levels are inflated and cannot securely rule out noise or post-selection artifacts, directly undermining the evidential foundation for the downstream machine-learning recommendation.

Authors: We agree that selecting optimal frequency bands via highest SNR and extending to joint-station combinations constitutes a family of tests, and that the absence of multiplicity correction (e.g., FDR or permutation-based) means the reported significance levels may be inflated. The submitted manuscript did not apply or discuss such a correction. In revision we will (i) explicitly state the number of tests performed, (ii) apply a permutation-based multiplicity correction to the significance tests, and (iii) update the abstract and discussion to reflect the corrected evidence for the seismoelectric relationship. This change directly addresses the referee's concern and bolsters the foundation for the machine-learning recommendation.

Revision: yes
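One common way to realize a permutation-based multiplicity correction is the max-statistic approach, sketched below under stated assumptions. Each candidate configuration (a band or station grouping) is represented by its own alarm series; the skill score (hit rate minus false-alarm rate), the circular-shift null, and all names are illustrative choices, not the procedure the authors commit to.

```python
import random
from typing import List, Sequence, Tuple


def skill(alarms: Sequence[bool], events: Sequence[bool]) -> float:
    """Hit rate minus false-alarm rate for one alarm series (0.0 if undefined)."""
    n_pos = sum(1 for e in events if e)
    n_neg = len(events) - n_pos
    if n_pos == 0 or n_neg == 0:
        return 0.0
    tp = sum(1 for a, e in zip(alarms, events) if a and e)
    fp = sum(1 for a, e in zip(alarms, events) if a and not e)
    return tp / n_pos - fp / n_neg


def max_stat_pvalue(
    alarm_sets: Sequence[Sequence[bool]],
    events: List[bool],
    n_perm: int = 2000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Family-wise corrected p-value for the best configuration.

    The null distribution is built from the maximum skill over ALL
    configurations under random circular shifts of the event series,
    so the search over configurations is repeated inside every permutation.
    """
    n = len(events)
    if n < 2:
        raise ValueError("need at least two time bins")
    rng = random.Random(seed)
    observed = max(skill(a, events) for a in alarm_sets)
    exceed = 0
    for _ in range(n_perm):
        shift = rng.randrange(1, n)  # nonzero circular shift
        shifted = events[shift:] + events[:shift]
        if max(skill(a, shifted) for a in alarm_sets) >= observed:
            exceed += 1
    # The +1 terms count the observed data as one draw from the null.
    return observed, (exceed + 1) / (n_perm + 1)


# Hypothetical toy data: 20 time bins, three candidate configurations.
events = [False] * 20
for i in (3, 11, 17):
    events[i] = True
alarm_sets = [
    [i in (2, 3, 10, 11) for i in range(20)],
    [i % 5 == 0 for i in range(20)],
    [i in (3, 17) for i in range(20)],
]
best, p_corrected = max_stat_pvalue(alarm_sets, events, n_perm=500, seed=1)
print(best, p_corrected)
```

Circular shifts are used instead of full shuffles to keep the temporal clustering of events intact, but a series of n bins has only n - 1 distinct nonzero shifts, so a real analysis would need a long record or a different surrogate scheme.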

Circularity Check

0 steps flagged

No significant circularity; empirical application with statistical tests on new data

The paper extends its authors' 2016 algorithm by removing a time parameter and adding a joint-stations method, selects frequency bands by highest SNR on the data, and reports significance tests as evidence for a seismoelectric relationship before recommending ML for forecasts. This workflow applies an existing method to fresh observations and performs statistical checks rather than deriving a result that equals its inputs by construction. No equations reduce to tautologies, no fitted parameters are relabeled as independent predictions, and the cited prior work supplies only the base procedure while the new evidence claim rests on the tests themselves. Self-citation occurs but is not load-bearing for the central statistical claim.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only; the method inherits its core alarm logic and binary-classification framing from the 2016 Chen & Chen paper. No new free parameters, axioms, or invented entities are explicitly introduced in the provided text.

pith-pipeline@v0.9.0 · 5657 in / 1122 out tokens · 16207 ms · 2026-05-24T22:22:56.511522+00:00 · methodology
