
arXiv: 2605.20997 · v1 · pith:PHAE4EY2new · submitted 2026-05-20 · 💻 cs.CV · cs.AI · cs.LG · physics.comp-ph

Hybrid Machine Learning Model for Forest Height Estimation from TanDEM-X and Landsat Data

Pith reviewed 2026-05-21 04:41 UTC · model grok-4.3

classification 💻 cs.CV · cs.AI · cs.LG · physics.comp-ph
keywords forest height estimation · TanDEM-X · Landsat · hybrid machine learning · physical model · interferometric coherence · remote sensing · LiDAR validation

The pith

Adding Landsat optical data to a hybrid machine learning model improves forest height estimates from TanDEM-X coherence by resolving remaining ambiguities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extends a prior hybrid approach that combines machine learning with a physical model to retrieve forest height from TanDEM-X interferometric coherence measurements. The original feature set could not fully resolve ambiguities involving forest height and structure or baseline and terrain slope. By expanding the input space with multispectral Landsat optical bands, the model gains complementary information on forest type and structure. When applied to multiple TanDEM-X acquisitions over the Gabonese Lopé national park and compared to airborne LiDAR reference data, the extended model reduces root-mean-square error by 13.5 percent and mean absolute error by 16.6 percent.

Core claim

The extended hybrid model incorporates Landsat optical data alongside TanDEM-X coherence measurements. The optical bands supply complementary forest-type and structure information that helps resolve the remaining height/structure and baseline/terrain-slope ambiguities, yielding a 13.5 percent reduction in RMSE and a 16.6 percent reduction in MAE relative to the original hybrid model when validated against airborne LiDAR over the Lopé site.
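The percentages in the core claim are relative reductions with respect to the original hybrid model. A minimal sketch of how such figures are typically computed is given here. The function names and the toy height values are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def rmse(pred, ref):
    """Root-mean-square error between predicted and reference heights."""
    diff = np.asarray(pred, dtype=float) - np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(pred, ref):
    """Mean absolute error between predicted and reference heights."""
    diff = np.asarray(pred, dtype=float) - np.asarray(ref, dtype=float)
    return float(np.mean(np.abs(diff)))

def relative_reduction(baseline_error, extended_error):
    """Percent reduction of an error metric relative to the baseline model."""
    return 100.0 * (baseline_error - extended_error) / baseline_error

# Toy heights in metres (hypothetical, for illustration only).
lidar = np.array([20.0, 35.0, 12.0, 40.0])
original = np.array([24.0, 30.0, 15.0, 46.0])   # stand-in for the original hybrid model
extended = np.array([23.0, 31.0, 14.0, 44.0])   # stand-in for the Landsat-extended model

print(relative_reduction(rmse(original, lidar), rmse(extended, lidar)))
print(relative_reduction(mae(original, lidar), mae(extended, lidar)))
```

Because both metrics are normalized by the baseline error, the two percentages need not agree: RMSE weights large residuals more heavily than MAE does.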

What carries the argument

The expanded feature space of the hybrid machine-learning and physical-model framework: Landsat multispectral bands are added to the TanDEM-X coherence inputs so that independent forest-structure information is available during training and inversion.

If this is right

  • Forest height maps derived from TanDEM-X become more accurate when optical data are included in the hybrid inversion.
  • The height/structure and baseline/terrain-slope ambiguities that persisted in the original model are reduced, as reflected in the lower RMSE and MAE.
  • The same extension approach can be tested on additional TanDEM-X acquisitions to confirm consistent gains.
  • Physical consistency of the retrieved heights is strengthened by the added multispectral constraints.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar optical-SAR fusion could improve other coherence-based geophysical retrievals that currently suffer from under-constrained ambiguities.
  • Operational large-area forest monitoring systems might adopt combined TanDEM-X and Landsat pipelines once the site-specific gains are shown to generalize.
  • The method invites direct comparison against pure physical-model inversions or purely data-driven networks on the same validation data to quantify the hybrid benefit.

Load-bearing premise

The Landsat optical data supplies independent complementary information on forest type or structure that resolves remaining ambiguities without introducing new biases or inconsistencies.

What would settle it

Repeating the experiment on an independent forest site where adding the Landsat bands produces no reduction or an increase in RMSE would show that the complementary-information assumption does not hold.

Figures

Figures reproduced from arXiv: 2605.20997 by Irena Hajnsek, Islam Mansour, Konstantinos Papathanassiou, Ronny Haensch.

Figure 1: Conceptual architecture and functionality of the hybrid model in the training (top) and inference phase (bottom).
Table I: Models and associated number of Legendre coefficients and features used for training.
Original abstract

Integrating machine learning (ML) with physical models (PM) has emerged as a promising way of retrieving geophysical parameters from remote sensing data. In this context, an ML model for estimating forest height from TanDEM-X interferometric coherence measurements has recently been proposed that constrains the learning process through a PM. While the features used for training and inversion were selected to ensure the physical consistency of the solutions, they could not resolve all height / structure and baseline / terrain slope ambiguities in the data. To improve this, the extension of the feature space with optical Landsat data, able to provide complementary information on forest type or structure, is proposed. The extended model is applied and validated on several TanDEM-X acquisitions over the Gabonese Lopé national park site and assessed against airborne LiDAR measurements. Results show a 13.5% reduction in RMSE and a 16.6% reduction in MAE compared to the original hybrid model, confirming the added value of multispectral inputs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper extends a prior hybrid machine learning-physical model (ML-PM) for forest height retrieval from TanDEM-X interferometric coherence by adding Landsat multispectral bands to resolve remaining height/structure and baseline/terrain slope ambiguities. The extended model is applied to multiple TanDEM-X acquisitions over the Gabonese Lopé national park and validated against airborne LiDAR, reporting a 13.5% RMSE reduction and 16.6% MAE reduction relative to the original hybrid model.

Significance. If the accuracy gains prove robust and attributable to complementary information rather than increased model capacity, the work would strengthen hybrid PM-ML frameworks for geophysical parameter retrieval by showing how optical data can address SAR-specific ambiguities without violating physical constraints.

major comments (3)
  1. [Abstract / Results] Abstract and Results section: the reported 13.5% RMSE and 16.6% MAE reductions are presented without an ablation study isolating the contribution of Landsat features from simple increases in feature dimensionality or training capacity; this leaves open whether the gains stem from ambiguity resolution as claimed.
  2. [Methods] Methods section: no explicit post-inversion verification is described that confirms TanDEM-X coherence predictions remain consistent with the original physical model (PM) constraints once Landsat bands are active; without this check the hybrid character of the extended model is not demonstrated.
  3. [Validation / Experiments] Validation description: details on feature selection, training procedures, hyperparameter tuning, and controls for site-specific effects or error propagation are absent, preventing assessment of whether the quantitative improvements are generalizable beyond the Lopé site.
minor comments (2)
  1. [Methods] Notation for the physical model constraints and the precise mapping of Landsat bands to forest-type or structure variables should be clarified for reproducibility.
  2. [Figures] Figure captions and axis labels in the results figures could more explicitly indicate which model (original hybrid vs. extended) is shown in each panel.
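The ablation requested in the first major comment can be sketched on synthetic data. The linear least-squares regressor below is a deliberately simple stand-in for the hybrid model, and every feature array is randomly generated; none of this reproduces the paper's data or architecture. The shuffled-feature run keeps the input dimensionality fixed while destroying the information content, which separates the two explanations the referee raises (complementary information versus added capacity).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-ins (assumptions): 4 SAR-side features and 4 optical bands.
sar = rng.normal(size=(n, 4))
optical = rng.normal(size=(n, 4))
height = (
    sar @ np.array([3.0, -1.0, 0.5, 0.2])
    + optical @ np.array([2.0, 1.0, 0.0, 0.0])
    + rng.normal(scale=0.5, size=n)
)

def holdout_rmse(features, target, n_train=1500):
    """Fit least squares on the first n_train rows, report RMSE on the rest."""
    X = np.column_stack([features, np.ones(len(features))])  # add intercept
    w, *_ = np.linalg.lstsq(X[:n_train], target[:n_train], rcond=None)
    resid = X[n_train:] @ w - target[n_train:]
    return float(np.sqrt(np.mean(resid ** 2)))

# 1) SAR-only baseline, 2) SAR + optical, 3) SAR + row-shuffled optical.
rmse_sar = holdout_rmse(sar, height)
rmse_full = holdout_rmse(np.hstack([sar, optical]), height)
shuffled = rng.permutation(optical, axis=0)  # same dimensionality, no information
rmse_shuffled = holdout_rmse(np.hstack([sar, shuffled]), height)

print(rmse_sar, rmse_full, rmse_shuffled)
```

If the gain came only from extra input dimensions, the shuffled run would match the full run. In this synthetic setup it instead stays close to the SAR-only baseline, which is the pattern an ablation on the real model would need to show.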

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which help clarify the contributions and limitations of our work. We address each major comment point by point below.

Point-by-point responses
  1. Referee: [Abstract / Results] Abstract and Results section: the reported 13.5% RMSE and 16.6% MAE reductions are presented without an ablation study isolating the contribution of Landsat features from simple increases in feature dimensionality or training capacity; this leaves open whether the gains stem from ambiguity resolution as claimed.

    Authors: We agree that an explicit ablation study would better isolate the role of Landsat bands in resolving ambiguities. The model architecture and capacity are unchanged from the original hybrid model; only the input feature set is extended. We will add an ablation experiment in the revised manuscript that trains and evaluates the identical model with and without the Landsat features to quantify their specific contribution. revision: yes

  2. Referee: [Methods] Methods section: no explicit post-inversion verification is described that confirms TanDEM-X coherence predictions remain consistent with the original physical model (PM) constraints once Landsat bands are active; without this check the hybrid character of the extended model is not demonstrated.

    Authors: The hybrid formulation preserves physical consistency by embedding the PM within the training loss, as in the original work. We nevertheless recognize the value of an explicit post-inversion check. We will add a verification step that recomputes TanDEM-X coherence from the retrieved heights and compares it against both the observed coherence and the PM forward model to confirm that the added Landsat inputs do not violate the original physical constraints. revision: yes

  3. Referee: [Validation / Experiments] Validation description: details on feature selection, training procedures, hyperparameter tuning, and controls for site-specific effects or error propagation are absent, preventing assessment of whether the quantitative improvements are generalizable beyond the Lopé site.

    Authors: We will expand the Methods and Experiments sections to document the feature-selection criteria (physical relevance plus correlation filtering), the training protocol (including train/validation splits and regularization), the hyperparameter search procedure, and any steps taken to mitigate site-specific bias or error propagation. Because the study relies on the unique availability of airborne LiDAR at Lopé, a full demonstration of generalizability to other sites is not possible with the current dataset; we will add an explicit discussion of this limitation and the conditions under which the approach may transfer. revision: partial
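The verification step described in the second response can be sketched as follows. The forward model here is a generic volume-coherence integral over a Legendre-expanded vertical reflectivity profile, in the spirit of polarization coherence tomography. The normalization (zeroth coefficient fixed to 1), the function names, and the sample pixel values are assumptions for illustration; the paper's actual forward model and any additional decorrelation terms are not reproduced.

```python
import numpy as np
from numpy.polynomial import legendre

def volume_coherence(kz, hv, coeffs=()):
    """|gamma_Vol| for a profile f(z') = 1 + sum_n a_n P_n(z'), z' in [-1, 1].

    kz: vertical wavenumber (rad/m), hv: forest height (m),
    coeffs: Legendre coefficients a_1, a_2, ... (hypothetical values).
    """
    nodes, weights = legendre.leggauss(64)  # Gauss-Legendre quadrature on [-1, 1]
    # legval expects coefficients from degree 0 upward; a_0 is fixed to 1 here.
    profile = legendre.legval(nodes, np.concatenate(([1.0], coeffs)))
    k = 0.5 * kz * hv  # half of the height-wavenumber product
    num = np.sum(weights * profile * np.exp(1j * k * nodes))
    den = np.sum(weights * profile)
    return abs(num / den)

def consistency_residual(gamma_obs, kz, hv, coeffs=()):
    """Absolute gap between observed and forward-modelled coherence magnitude."""
    return abs(gamma_obs - volume_coherence(kz, hv, coeffs))

# Hypothetical pixel: observed coherence magnitude, wavenumber, retrieved height.
kz, hv, gamma_obs = 0.1, 30.0, 0.66
print(consistency_residual(gamma_obs, kz, hv))                      # uniform profile
print(consistency_residual(gamma_obs, kz, hv, coeffs=(0.3, -0.1)))  # structured profile
```

For a uniform profile the integral reduces to sin(k)/k with k = kz·hv/2, which gives a quick sanity check of the quadrature. Mapping the residual over a scene, with and without the Landsat inputs, would show whether the extended model drifts away from the physical constraint.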

Circularity Check

1 step flagged

Minor self-citation to prior hybrid model; central results remain independent empirical validations against LiDAR.

specific steps
  1. self-citation (load-bearing) [Abstract]
    "an ML model for estimating forest height from TanDEM-X interferometric coherence measurements has recently been proposed that constrains the learning process through a PM. While the features used for training and inversion were selected to ensure the physical consistency of the solutions, they could not resolve all height / structure and baseline / terrain slope ambiguities in the data."

    The sentence invokes the authors' own prior hybrid model as the starting point whose limitations the new Landsat extension is claimed to overcome. This is a minor self-citation for context; the subsequent RMSE/MAE reductions are measured on external LiDAR and do not reduce to the cited model by definition.

full rationale

The paper extends a previously proposed hybrid ML-PM model (referenced in the abstract as 'recently been proposed') by adding Landsat features. The load-bearing claims are quantitative error reductions (13.5% RMSE, 16.6% MAE) measured on held-out TanDEM-X acquisitions and assessed against independent airborne LiDAR. These metrics are not forced by construction or by the self-citation; they are direct empirical comparisons. The prior-model reference supplies context for the baseline but does not substitute for the new validation data or the reported improvements.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No explicit free parameters, axioms, or invented entities are described in the abstract; the model relies on an existing physical model for constraints and standard ML training, with Landsat data treated as an additional input source.




Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages

  1. [1]

    1 IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. X, NO. X, XXX 2026 Hybrid Machine Learning Model for Forest Height Estimation from TanDEM-X and Landsat Data Islam Mansour, Member, IEEE, Ronny Hänsch, Senior Member, IEEE, Irena Hajnsek, Fellow, IEEE, and Konstantinos Papathanassiou, Fellow, IEEE Abstract—Integrating machine learning (ML) with physic...

  2. [2]

    !"#(κ$) can be expressed as [18], [21]: γ

    This work was supported by the DeepSAR Research Project, funded by Helmholtz AI under the Helmholtz Association of German Research Centers (HGF). (Corresponding author: Islam Mansour.) Islam Mansour and Irena Hajnsek are with the Microwaves and Radar Institute, German Aerospace Center (DLR), 82234 Weßling, Germany, and also with the Chair of Earth Observa...

  3. [3]

    used seven Legendre polynomials (N=7) to define the vertical reflectivity profiles and three TanDEM-X acquisitions (two ascending and one descending) for training. The input features included the interferometric volume coherence γ_Vol(κ_z), the terrain-corrected vertical wavenumber κ_z, the incidence angle θ, and the terrain slope in the range direction α...

  4. [4]

    !"#B|. This predicted value is then compared to the observed volume coherence |γ

    Four Landsat spectral bands (Red, NIR, SWIR1, and SWIR2) were derived from the circa 2019 cloud-free composite of the Hansen Global Forest Change v1.7 product [20], primarily based on Landsat-8 imagery, with fallback to the nearest cloud-free year (2010–2015). Reference forest height data were derived from full-waveform LiDAR collected by NASA’s LVIS sens...

  5. [5]

    =>?(κ$)|. The resulting solution spaces from the “learned

    |γ_Vol(κ_z)| vs. κ_z·h_v product. The plots are generated using the Lopé forest height estimates h_v obtained from the inversion of all five TanDEM-X acquisitions using two models (left model C and right model D). The colors indicate the relative number of samples and go from dark blue (low) to dark red (high). difference. The coefficients are estimated ...

  6. [6]

    2 results

    Scene No. 2 results. Top: coherence and slope maps. Middle: residual height errors for Models C (left) and D (right). Bottom: false-RGB composites of the first three Legendre coefficients, showing clearer spatial patterns for Model D, especially across forest–savannah transitions. To further assess the model’s improvement, we ...

  7. [7]

    Large-Scale Forest Height Mapping by Combining TanDEM-X and GEDI Data,

    C. Choi et al., “Large-Scale Forest Height Mapping by Combining TanDEM-X and GEDI Data,” IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., vol. 16, pp. 2374–2385, 2023, doi: 10.1109/JSTARS.2023.3244866

  8. [8]

    Forest Height Estimation by Means of TanDEM-X InSAR and Waveform Lidar Data,

    R. Guliaev, V. Cazcarra-Bes, M. Pardini, and K. Papathanassiou, “Forest Height Estimation by Means of TanDEM-X InSAR and Waveform Lidar Data,” IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., vol. 14, pp. 3084–3094, 2021, doi: 10.1109/JSTARS.2021.3058837

  9. [9]

    TanDEM-X Pol-InSAR Performance for Forest Height Estimation,

    F. Kugler, D. Schulze, I. Hajnsek, H. Pretzsch, and K. P. Papathanassiou, “TanDEM-X Pol-InSAR Performance for Forest Height Estimation,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 10, pp. 6404–6422, Oct. 2014, doi: 10.1109/TGRS.2013.2296533

  10. [10]

    A Deep Learning Framework for the Estimation of Forest Height From Bistatic TanDEM-X Data,

    D. Carcereri, P. Rizzoli, D. Ienco, and L. Bruzzone, “A Deep Learning Framework for the Estimation of Forest Height From Bistatic TanDEM-X Data,” IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., vol. 16, pp. 8334–8352, 2023, doi: 10.1109/JSTARS.2023.3310209

  11. [11]

    Forest Height Estimation Using Multibaseline PolInSAR and Sparse Lidar Data Fusion,

    M. Denbina, M. Simard, and B. Hawkins, “Forest Height Estimation Using Multibaseline PolInSAR and Sparse Lidar Data Fusion,” IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., vol. 11, no. 10, pp. 3415–3433, Oct. 2018, doi: 10.1109/JSTARS.2018.2841388

  12. [12]

    Connecting spaceborne lidar with NFI networks: A method for improved estimation of forest structure and biomass,

    P. B. May, R. O. Dubayah, J. M. Bruening, and G. C. Gaines, “Connecting spaceborne lidar with NFI networks: A method for improved estimation of forest structure and biomass,” Int. J. Appl. Earth Obs. Geoinformation, vol. 129, p. 103797, May 2024, doi: 10.1016/j.jag.2024.103797

  13. [13]

    Hybrid Machine Learning Forest Height Estimation From TanDEM-X InSAR,

    I. Mansour, K. Papathanassiou, R. Hänsch, and I. Hajnsek, “Hybrid Machine Learning Forest Height Estimation From TanDEM-X InSAR,” IEEE Trans. Geosci. Remote Sens., vol. 63, pp. 1–11, 2025, doi: 10.1109/TGRS.2024.3520387

  14. [14]

    C-band repeat-pass interferometric SAR observations of the forest,

    J. I. H. Askne, P. B. G. Dammert, L. M. H. Ulander, and G. Smith, “C-band repeat-pass interferometric SAR observations of the forest,” IEEE Trans. Geosci. Remote Sens., vol. 35, no. 1, pp. 25–35, Jan. 1997, doi: 10.1109/36.551931

  15. [15]

    Decorrelation in interferometric radar echoes,

    H. A. Zebker and J. Villasenor, “Decorrelation in interferometric radar echoes,” IEEE Trans. Geosci. Remote Sens., vol. 30, no. 5, pp. 950–959, Sep. 1992, doi: 10.1109/36.175330

  16. [16]

    Repeat-pass SAR interferometry over forested terrain,

    J. O. Hagberg, L. M. H. Ulander, and J. Askne, “Repeat-pass SAR interferometry over forested terrain,” IEEE Trans. Geosci. Remote Sens., vol. 33, no. 2, pp. 331–340, Mar. 1995, doi: 10.1109/TGRS.1995.8746014

  17. [17]

    Synthetic aperture radar interferometry,

    R. Bamler and P. Hartl, “Synthetic aperture radar interferometry,” Inverse Probl., vol. 14, no. 4, p. R1, Aug. 1998, doi: 10.1088/0266-5611/14/4/001

  18. [18]

    Coherence evaluation of TanDEM-X interferometric data,

    M. Martone, B. Bräutigam, P. Rizzoli, C. Gonzalez, M. Bachmann, and G. Krieger, “Coherence evaluation of TanDEM-X interferometric data,” ISPRS J. Photogramm. Remote Sens., vol. 73, pp. 21–29, Sep. 2012, doi: 10.1016/j.isprsjprs.2012.06.006

  19. [19]

    Cramer–Rao Lower Bound Analysis of Vegetation Height Estimation With Random Volume Over Ground Model and Polarimetric SAR Interferometry,

    A. Roueff, A. Arnaubec, P. C. Dubois-Fernandez, and P. Refregier, “Cramer–Rao Lower Bound Analysis of Vegetation Height Estimation With Random Volume Over Ground Model and Polarimetric SAR Interferometry,” IEEE Geosci. Remote Sens. Lett., vol. 8, no. 6, pp. 1115–1119, Nov. 2011, doi: 10.1109/LGRS.2011.2157891

  20. [20]

    doi: 10.1093/acprof:oso/9780199569731.001.0001

  21. [21]

    Forest Height Estimation by Means of Pol-InSAR Data Inversion: The Role of the Vertical Wavenumber,

    F. Kugler, Seung-Kuk Lee, I. Hajnsek, and K. P. Papathanassiou, “Forest Height Estimation by Means of Pol-InSAR Data Inversion: The Role of the Vertical Wavenumber,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 10, pp. 5294–5311, Oct. 2015, doi: 10.1109/TGRS.2015.2420996

  22. [22]

    Polarization coherence tomography,

    S. R. Cloude, “Polarization coherence tomography,” Radio Sci., vol. 41, no. 4, 2006, doi: 10.1029/2005RS003436

  23. [23]

    A Lidar-Radar Framework to Assess the Impact of Vertical Forest Structure on Interferometric Coherence,

    M. Brolly, M. Simard, H. Tang, R. O. Dubayah, and J. P. Fisk, “A Lidar-Radar Framework to Assess the Impact of Vertical Forest Structure on Interferometric Coherence,” IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., vol. 9, no. 12, pp. 5830–5841, Dec. 2016, doi: 10.1109/JSTARS.2016.2527360

  24. [24]

    doi: 10.5270/ESA-c5d3d65

  25. [25]

    High-Resolution Global Maps of 21st-Century Forest Cover Change,

    M. C. Hansen et al., “High-Resolution Global Maps of 21st-Century Forest Cover Change,” Science, vol. 342, no. 6160, pp. 850–853, Nov. 2013, doi: 10.1126/science.1244693

  26. [26]

    AfriSAR: Gridded Forest Biomass and Canopy Metrics Derived from LVIS, Gabon, 2016

    J. Armston et al., “AfriSAR: Gridded Forest Biomass and Canopy Metrics Derived from LVIS, Gabon, 2016.” ORNL Distributed Active Archive Center,