arxiv: 1907.01497 · v1 · pith:J4YVLDGWnew · submitted 2019-07-02 · 📡 eess.IV · cs.LG · physics.comp-ph · physics.geo-ph

Seismic data denoising and deblending using deep learning

Pith reviewed 2026-05-25 10:35 UTC · model grok-4.3

classification 📡 eess.IV · cs.LG · physics.comp-ph · physics.geo-ph
keywords seismic denoising · deblending · deep learning · U-net · ResNet · common offset gathers · synthetic training data

The pith

A U-net model trained only on synthetic seismic data removes noise from real gathers recorded worldwide when given adjacent offset gathers as extra input channels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that a U-net architecture built on a ResNet backbone, first pretrained on ImageNet images and then trained on synthetic seismic data, can denoise and deblend common-offset gathers. Adjacent gathers are supplied as extra input channels so the network sees neighboring offset information while cleaning the central gather. The resulting model produces moderate noise removal on real field data collected in multiple geographic regions and works without any user-chosen parameters. The best results occur when three gathers on each side of the target gather are included.

Core claim

We use deep learning, with a U-net model incorporating a ResNet architecture pretrained on ImageNet and further trained on synthetic seismic data, to perform this task. The method is applied to common offset gathers, with adjacent offset gathers of the gather being denoised provided as additional input channels. Here we show that this approach leads to a method that removes noise from several datasets recorded in different parts of the world with moderate success. We find that providing three adjacent offset gathers on either side of the gather being denoised is most effective. As this method does not require parameters to be chosen, it is more automated than traditional methods.

What carries the argument

U-net model with ResNet backbone that receives multiple adjacent common-offset gathers as additional input channels to denoise the central gather.
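
The channel layout described above can be sketched in a few lines of NumPy. This is an illustration only, not the authors' code: the function name `stack_adjacent_gathers`, the `(n_offsets, n_traces, n_samples)` array layout, and the edge handling (repeating the nearest gather when the target sits near the first or last offset) are all assumptions, since the source does not say how edges are treated.

```python
import numpy as np

def stack_adjacent_gathers(volume, offset_index, n_side=3):
    """Return the target gather plus n_side neighbours on each side as channels.

    volume: array of shape (n_offsets, n_traces, n_samples), one common-offset
            gather per leading index (a hypothetical layout).
    """
    n_offsets = volume.shape[0]
    # Neighbour indices centred on the target gather.
    idx = np.arange(offset_index - n_side, offset_index + n_side + 1)
    # Assumption: clamp at the survey edges by repeating the nearest gather.
    idx = np.clip(idx, 0, n_offsets - 1)
    # Integer-array indexing yields shape (2 * n_side + 1, n_traces, n_samples).
    return volume[idx]

# Random numbers stand in for a small survey with 10 offsets.
volume = np.random.default_rng(0).standard_normal((10, 64, 128))
x = stack_adjacent_gathers(volume, offset_index=4)
print(x.shape)  # (7, 64, 128)
```

With three neighbours on each side the network input has seven channels, and the gather being denoised sits in the middle channel (`x[3]` here).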

If this is right

  • The method applies directly to real seismic data collected in different parts of the world.
  • Three adjacent offset gathers on each side of the target gather give the strongest denoising performance.
  • No manual parameter selection is required once the model is trained.
  • The same network handles both random noise removal and source-interference deblending.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Seismic processing pipelines could run faster if the model is integrated into existing workflows.
  • The multi-channel adjacent-gather strategy may transfer to denoising problems in other wavefield imaging domains.
  • Performance on new regions could be checked by adding a small amount of local synthetic or field examples during fine-tuning.

Load-bearing premise

A model trained exclusively on synthetic seismic data will generalize to produce useful denoising on real recorded gathers from multiple geographic regions without further adaptation or parameter selection.

What would settle it

Running the trained model on a fresh collection of real seismic gathers from a geographic region absent from the original tests and checking whether the output shows clear noise reduction compared with the raw data.
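
Real gathers have no clean reference, so any such check has to rely on proxies. The sketch below shows one possible pair of proxies and is not taken from the paper: the function name `residual_report`, the choice of "fraction of energy removed", and the use of a normalised correlation between the output and the removed part as a signal-leakage indicator are all assumptions made for illustration.

```python
import numpy as np

def residual_report(raw, denoised):
    """Summarise what a denoiser removed from a gather (no clean reference)."""
    removed = raw - denoised
    # Fraction of the raw energy that the model removed.
    energy_removed = float(np.sum(removed ** 2) / np.sum(raw ** 2))
    # Normalised correlation between output and removed part; values near zero
    # suggest little coherent signal leaked into the removed noise.
    leakage = float(np.sum(denoised * removed)
                    / np.sqrt(np.sum(denoised ** 2) * np.sum(removed ** 2)))
    return energy_removed, leakage

# Toy stand-in for a field gather: a sinusoidal "signal" plus random noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 4096)
signal = np.sin(2 * np.pi * 25 * t)
raw = signal + 0.5 * rng.standard_normal(t.size)
energy_removed, leakage = residual_report(raw, denoised=signal)
print(f"energy removed: {energy_removed:.2f}, leakage correlation: {leakage:.3f}")
```

Neither number proves that the removed energy was noise; inspecting the removed part for coherent events remains the more direct test.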

Original abstract

An important step of seismic data processing is removing noise, including interference due to simultaneous and blended sources, from the recorded data. Traditional methods are time-consuming to apply as they often require manual choosing of parameters to obtain good results. We use deep learning, with a U-net model incorporating a ResNet architecture pretrained on ImageNet and further trained on synthetic seismic data, to perform this task. The method is applied to common offset gathers, with adjacent offset gathers of the gather being denoised provided as additional input channels. Here we show that this approach leads to a method that removes noise from several datasets recorded in different parts of the world with moderate success. We find that providing three adjacent offset gathers on either side of the gather being denoised is most effective. As this method does not require parameters to be chosen, it is more automated than traditional methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes a U-Net architecture with a ResNet backbone (pretrained on ImageNet and fine-tuned on synthetic seismic data) for denoising and deblending common-offset gathers. Adjacent offset gathers are supplied as additional input channels, with the claim that three gathers on either side is optimal. The method is asserted to remove noise from real datasets recorded in multiple geographic regions with moderate success and without manual parameter selection, offering an automated alternative to traditional techniques.

Significance. If the generalization from synthetic training to real multi-regional data can be demonstrated with quantitative evidence, the work would provide a useful demonstration of deep learning for automating a key step in seismic processing pipelines. The multi-region application and the specific finding on input channel count would be of practical interest to the field.

major comments (2)
  1. [Abstract] The central claim of 'moderate success' on real datasets from different parts of the world is unsupported by any reported quantitative metrics (SNR, MSE, or similar), error bars, baseline comparisons against traditional methods, or details on training loss and held-out generalization tests.
  2. [Abstract] The generalization claim rests on the untested assumption that synthetic-only training captures the relevant statistics of real noise, signal, and acquisition variations across regions; no description of synthetic data generation, noise modeling, or domain-matching procedure is supplied to evaluate domain-shift risk.
minor comments (1)
  1. [Abstract] The abstract introduces both denoising and deblending but provides no separate discussion of how deblending performance is evaluated or distinguished from denoising.
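
The metrics named in major comment 1 can be computed as follows whenever a clean reference exists, which in practice means held-out synthetic data. This is a minimal sketch under that assumption: the function names `mse` and `snr_db` are illustrative, and the "denoised" array is a stand-in built by hand rather than the output of any model.

```python
import numpy as np

def mse(clean, estimate):
    """Mean squared error against a known clean reference."""
    return float(np.mean((clean - estimate) ** 2))

def snr_db(clean, estimate):
    """Signal-to-noise ratio in dB, treating (clean - estimate) as noise."""
    signal_power = np.sum(clean ** 2)
    noise_power = np.sum((clean - estimate) ** 2)
    return float(10.0 * np.log10(signal_power / noise_power))

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 2048)
clean = np.sin(2 * np.pi * 30 * t)
noise = rng.standard_normal(t.size)
noisy = clean + 0.5 * noise
denoised = clean + 0.1 * noise   # stand-in for a model output
snr_gain = snr_db(clean, denoised) - snr_db(clean, noisy)
print(f"MSE before: {mse(clean, noisy):.4f}, after: {mse(clean, denoised):.4f}")
print(f"SNR gain: {snr_gain:.2f} dB")
```

Because the stand-in output keeps one fifth of the original noise amplitude, the SNR gain here is 20·log10(5), about 13.98 dB, by construction.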

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We agree that the abstract's claims require stronger quantitative support and a clearer description of the synthetic data to allow evaluation of domain shift. We will revise the manuscript to address both points directly.

Point-by-point responses
  1. Referee: [Abstract] The central claim of 'moderate success' on real datasets from different parts of the world is unsupported by any reported quantitative metrics (SNR, MSE, or similar), error bars, baseline comparisons against traditional methods, or details on training loss and held-out generalization tests.

    Authors: We accept the observation. The current manuscript presents results on real data primarily through visual comparison. In revision we will add quantitative metrics (e.g., estimated SNR improvements where feasible) on the real gathers, include comparisons against at least one traditional method on the same data, and report training-loss curves together with held-out synthetic test performance to substantiate the generalization statement.
    Revision: yes

  2. Referee: [Abstract] The generalization claim rests on the untested assumption that synthetic-only training captures the relevant statistics of real noise, signal, and acquisition variations across regions; no description of synthetic data generation, noise modeling, or domain-matching procedure is supplied to evaluate domain-shift risk.

    Authors: We agree that the manuscript would benefit from an expanded account of the synthetic data. We will enlarge the methods section to detail the synthetic gather generation procedure, the noise model employed, and any steps taken to align synthetic and real statistics, thereby allowing readers to assess the domain-shift risk themselves.
    Revision: yes

Circularity Check

0 steps flagged

No circularity; standard supervised learning pipeline with external evaluation

Full rationale

The paper applies a U-Net (ResNet backbone pretrained on ImageNet, fine-tuned on synthetic seismic data) to denoise real common-offset gathers using adjacent gathers as input channels. No derivation chain, fitted parameter, or uniqueness theorem is presented that reduces to the inputs by construction. Performance is assessed on real field data from multiple regions, none of which was used in training, so the evaluation is independent of the training inputs. No self-citations, ansatzes, or renamings of known results appear in the load-bearing steps.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the transferability of a network trained on synthetic data to real seismic recordings and on the empirical choice of three adjacent gathers as optimal context.

free parameters (1)
  • number of adjacent offset gathers = 3
    The paper states that three gathers on either side is most effective, implying this value was selected or tuned on the data.
axioms (1)
  • domain assumption: Synthetic seismic data distributions are close enough to real recorded data that a network trained on the former will produce useful outputs on the latter.
    Training occurs exclusively on synthetic examples; evaluation is on real field data.
