Deep Learning for Soil Moisture Estimation: Fusing Satellite Data with Optimally-Lagged Meteorological Features
Pith reviewed 2026-06-26 14:32 UTC · model grok-4.3
The pith
Incorporating time-lagged meteorological data and soil depth information improves deep learning models for estimating soil moisture from satellite observations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The study shows that determining optimal temporal lags (0-30 days) for meteorological variables and inter-depth lags (0-15 days) using cross-correlation, and incorporating them into CNN, LSTM, and CNN-LSTM models, leads to better soil moisture prediction, with the hybrid model achieving R^2 of 0.930 on held-out data.
What carries the argument
The Cross-Correlation Function (CCF) methodology to determine optimal temporal lags between meteorological variables and soil moisture, as well as inter-depth lags describing vertical moisture propagation from the surface to deeper layers.
If this is right
- Meteorological variables with optimal lags improve performance compared to using satellite data alone.
- Including subsurface depth information is decisive for accurate predictions across all tested model architectures.
- A per-pixel CNN achieves the strongest single-patch result with R^2 of 0.877.
- A pooled CNN-LSTM hybrid reaches the highest overall performance with R^2 of 0.930.
Where Pith is reading between the lines
- The lag selection process might transfer to estimating other delayed environmental processes such as groundwater response.
- Retraining the lag finder on new regions could be necessary if physical conditions differ substantially from the original plots.
- The multi-patch training strategy indicates that pooling data from multiple sites aids generalization within similar agricultural settings.
Load-bearing premise
The lags selected via cross-correlation on the training plots capture generalizable physical delays rather than dataset-specific correlations or noise.
What would settle it
Retraining and testing the models on data from a different semi-arid region without re-computing the lags from cross-correlation, and checking if the performance improvement over the satellite-only baseline holds or vanishes.
Figures
read the original abstract
Accurate soil moisture estimation in semi-arid agricultural regions requires integrating remote sensing and meteorological information while accounting for the delayed response of soil moisture to atmospheric forcing. This study introduces a Cross-Correlation Function (CCF) methodology to determine optimal temporal lags (0-30 days) between meteorological variables and soil moisture, as well as inter-depth lags (0-15 days) describing vertical moisture propagation from the surface (10 cm) to deeper layers (20-50 cm). The approach was validated across seven agricultural plots in southeastern Spain. Three deep learning architectures, each targeting a distinct prediction granularity, were evaluated under five feature configurations ranging from satellite-only to full satellite-meteorology-depth fusion: a CNN for per-pixel estimation within each plot, an LSTM for frame-level (daily plot-mean) prediction, and a CNN-LSTM hybrid operating on sliding windows with pooled multi-patch training. Models were assessed on held-out data to measure genuine generalisation. Meteorological variables improved performance over the satellite-only baseline, while subsurface depth information proved decisive across all architectures. The per-pixel CNN achieved the strongest single-patch result (R^2 = 0.877, RMSE = 2.28), with a seven-patch average R^2 of 0.535, representing an improvement of +1.00 over the satellite-only baseline. The pooled CNN-LSTM hybrid obtained the highest overall performance (R^2 = 0.930, CVRMSE = 8.0%). These results demonstrate that explicitly modelling atmospheric and vertical subsurface delays substantially improves soil moisture estimation for precision agriculture.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that a Cross-Correlation Function (CCF) approach can identify optimal temporal lags (0-30 days for meteorological variables, 0-15 days for inter-depth propagation) between satellite observations, meteorological drivers, and soil moisture at multiple depths. These lagged features are then fused into three deep-learning architectures (per-pixel CNN, LSTM, and CNN-LSTM hybrid) and evaluated on held-out data from seven agricultural plots in southeastern Spain. The central empirical result is that adding the optimally lagged meteorological and subsurface-depth features produces substantial gains over a satellite-only baseline, with the pooled CNN-LSTM hybrid reaching R² = 0.930 and the per-pixel CNN achieving a seven-plot average R² = 0.535 (+1.00 over baseline).
Significance. If the reported gains are shown to arise from physically transferable delay modeling rather than plot-specific lag selection, the work would provide concrete evidence that explicit incorporation of atmospheric and vertical propagation delays improves deep-learning soil-moisture retrievals for precision agriculture. The concrete held-out metrics and the comparison across five feature configurations constitute a clear, falsifiable demonstration of the value of the lagged-feature strategy.
major comments (2)
- [Methods (CCF methodology)] Methods (CCF lag selection): The manuscript does not state whether the cross-correlation functions used to select the 0-30-day meteorological lags and 0-15-day inter-depth lags were computed exclusively on the training subset of the seven plots or on the full dataset. Because the reported performance lift (e.g., +1.00 R² for the per-pixel CNN) is attributed to these lags, any leakage from held-out plots into lag choice would render the generalization claim circular.
- [Experimental setup and validation] Experimental design (seven-plot validation): With only seven plots and no description of nested cross-validation or lag-sensitivity analysis, it remains possible that the chosen lags capture plot-specific irrigation schedules or soil heterogeneity rather than generalizable physical response times. A leave-one-plot-out protocol with fixed lags would be required to substantiate the claim that delay modeling drives the observed improvements.
minor comments (1)
- [Abstract] Abstract and results: The seven-patch average R² = 0.535 is presented as an improvement of +1.00 over the satellite-only baseline; the baseline value itself should be reported explicitly for direct comparison.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We address each major point below and indicate the revisions that will be made to the manuscript.
read point-by-point responses
-
Referee: [Methods (CCF methodology)] Methods (CCF lag selection): The manuscript does not state whether the cross-correlation functions used to select the 0-30-day meteorological lags and 0-15-day inter-depth lags were computed exclusively on the training subset of the seven plots or on the full dataset. Because the reported performance lift (e.g., +1.00 R² for the per-pixel CNN) is attributed to these lags, any leakage from held-out plots into lag choice would render the generalization claim circular.
Authors: We agree that this procedural detail is not explicitly stated and should be clarified. The CCF lag selections were performed exclusively on the training subsets of each plot to avoid any leakage from held-out data. We will add a clear statement to this effect in the Methods section of the revised manuscript. revision: yes
-
Referee: [Experimental setup and validation] Experimental design (seven-plot validation): With only seven plots and no description of nested cross-validation or lag-sensitivity analysis, it remains possible that the chosen lags capture plot-specific irrigation schedules or soil heterogeneity rather than generalizable physical response times. A leave-one-plot-out protocol with fixed lags would be required to substantiate the claim that delay modeling drives the observed improvements.
Authors: Our current validation uses held-out data portions within the seven plots to evaluate generalization. We acknowledge that the small number of plots and lack of explicit leave-one-plot-out or lag-sensitivity analysis leaves room for plot-specific effects. To address this, we will add a leave-one-plot-out evaluation (with lags fixed from the original training procedure) and a brief lag-sensitivity analysis in the revised manuscript. revision: yes
Circularity Check
No circularity: lag selection and model evaluation follow standard non-circular supervised ML pipeline on held-out data.
full rationale
The paper selects lags via CCF on training plots then trains DL models to predict soil moisture from the lagged features, evaluating on held-out plots. This is ordinary feature engineering followed by supervised training and generalization testing; the reported R² values are not equivalent to the CCF correlations by construction, nor do any equations reduce the target prediction to a fitted parameter or self-citation. No self-definitional steps, no uniqueness theorems, and no load-bearing self-citations appear in the provided text. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Cross-correlation function applied to the plot data identifies lags that reflect genuine atmospheric and vertical propagation delays rather than spurious correlations.
Reference graph
Works this paper leans on
-
[1]
Manrique-Alba, S
À. Manrique-Alba, S. Ruiz-Yanetti, H. Moutahir, K. Novak, M. De Luis, J. Bellot, Soil moisture and its role in growth-climate relationships across an aridity gradient in semiarid pinus halepensis forests, Science of the Total Environment 574 (2017) 982–990
2017
-
[2]
A. V. Ines, N. N. Das, J. W. Hansen, E. G. Njoku, Assimilation of remotely sensed soil moisture and vegetation with a crop simulation model for maize yield prediction, Remote Sensing of Environment 138 (2013) 149–164
2013
-
[3]
R. S. Ayers, D. W. Westcot, et al., Water quality for agriculture, volume 29, Food and agriculture organization of the United Nations Rome, 1985
1985
-
[4]
D. A. Robinson, S. B. Jones, J. M. Wraith, D. Or, S. P. Friedman, A review of advances in dielectric and electrical conductivity measure- ment in soils using time domain reflectometry, Vadose zone journal 2 (2003) 444–475. A. Canovas-Rodriguez et al.:Preprint submitted to ElsevierPage 14 of 15 Deep Learning for Soil Moisture Estimation
2003
-
[5]
M. S. Farooq, S. Riaz, A. Abid, K. Abid, M. A. Naeem, A survey on theroleofiotinagriculturefortheimplementationofsmartfarming, IEEE access 7 (2019) 156237–156271
2019
-
[6]
A. Garg, V. Sreshta, N. Mehta, Application of soil moisture sensors in agriculture: A review, International Journal of Research in Engi- neering and Applied Sciences 6 (2016) 55–64
2016
-
[7]
Y. H. Kerr, P. Waldteufel, J.-P. Wigneron, S. Delwart, F. Cabot, J. Boutin, M.-J. Escorihuela, J. Font, N. Reul, C. Gruhier, et al., The smos mission: New tool for monitoring key elements ofthe global water cycle, Proceedings of the IEEE 98 (2010) 666–687
2010
-
[8]
Entekhabi, E
D. Entekhabi, E. G. Njoku, P. E. O’neill, K. H. Kellogg, W. T. Crow, W.N.Edelstein,J.K.Entin,S.D.Goodman,T.J.Jackson,J.Johnson, et al., The soil moisture active passive (smap) mission, Proceedings of the IEEE 98 (2010) 704–716
2010
-
[9]
Drusch, U
M. Drusch, U. Del Bello, S. Carlier, O. Colin, V. Fernandez, F. Gas- con,B.Hoersch,C.Isola,P.Laberinti,P.Martimort,etal., Sentinel-2: Esa’s optical high-resolution mission for gmes operational services, Remote sensing of Environment 120 (2012) 25–36
2012
-
[10]
E.G.Njoku,T.J.Jackson,V.Lakshmi,T.K.Chan,S.V.Nghiem,Soil moistureretrievalfromamsr-e, IEEEtransactionsonGeoscienceand remote sensing 41 (2003) 215–229
2003
-
[11]
Y. Wang, W. Wang, Z. Ma, M. Zhao, W. Li, X. Hou, J. Li, F. Ye, W. Ma, A deep learning approach based on physical constraints for predicting soil moisture in unsaturated zones, Water Resources Research 59 (2023) e2023WR035194
2023
-
[12]
E. H. Hegazi, A. A. Samak, L. Yang, R. Huang, J. Huang, Prediction of soil moisture content from sentinel-2 images using convolutional neural network (cnn), Agronomy 13 (2023) 656
2023
-
[13]
Q. Geng, S. Yan, Q. Li, C. Zhang, Enhancing data-driven soil moisture modeling with physically-guided lstm networks, Frontiers in Forests and Global Change 7 (2024) 1353011
2024
-
[14]
J. Yu, X. Zhang, L. Xu, J. Dong, L. Zhangzhong, A hybrid cnn-gru model for predicting soil moisture in maize root zone, Agricultural Water Management 245 (2021) 106649
2021
-
[15]
A. Rani, N. Kumar, J. Kumar, N. K. Sinha, Machine learning for soil moisture assessment, in: Deep learning for sustainable agriculture, Elsevier, 2022, pp. 143–168
2022
-
[16]
Ahmad, A
S. Ahmad, A. Kalra, H. Stephen, Estimating soil moisture using remotesensingdata:Amachinelearningapproach,Advancesinwater resources 33 (2010) 69–80
2010
-
[17]
C.S.Lee,E.Sohn,J.D.Park,J.-D.Jang, Estimationofsoilmoisture using deep learning based on satellite data: A case study of south korea, GIScience & Remote Sensing 56 (2019) 43–67
2019
-
[18]
J. Wei, R. Song, Spatiotemporal characteristics of soil moisture memory: an integrated analysis using multiple metrics and datasets, Climate Dynamics 63 (2025) 228
2025
-
[19]
Z. Bai, S. Jia, G. Wang, M. Huang, W. Zhang, Near real-time reconstruction of 0–200 cm soil moisture profiles in croplands using shallow-layer monitoring and multi-day meteorological accumula- tions, Agronomy 15 (2025) 2864
2025
-
[20]
M.Rahmati,W.Amelung,C.Brogi,J.Dari,A.Flammini,H.Bogena, L.Brocca,H.Chen,J.Groh,R.D.Koster,etal., Soilmoisturemem- ory:State-of-the-art andtheway forward, ReviewsofGeophysics 62 (2024) e2023RG000828
2024
-
[21]
T. Wu, L. Xu, Y. Lv, R. Cai, Z. Pan, X. Zhang, X. Zhang, N. Chen, Integratingcausalinferencewithconvlstmnetworksforspatiotempo- ral forecasting of root zone soil moisture, Journal of Hydrology 659 (2025) 133246
2025
-
[22]
Kapoor, A
S. Kapoor, A. Narayanan, Leakage and the reproducibility crisis in machine-learning-based science, Patterns 4 (2023)
2023
-
[23]
D. R. Roberts, V. Bahn, S. Ciuti, M. S. Boyce, J. Elith, G. Guillera- Arroita,S.Hauenstein,J.J.Lahoz-Monfort,B.Schröder,W.Thuiller, et al., Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure, Ecography 40 (2017) 913– 929
2017
-
[24]
Le Rest, D
K. Le Rest, D. Pinaud, P. Monestiez, J. Chadoeuf, V. Bretagnolle, Spatial leave-one-out cross-validation for variable selection in the presenceofspatialautocorrelation, Globalecologyandbiogeography 23 (2014) 811–820
2014
-
[25]
M. Shah, M. S. Raval, S. Divakaran, A systematic review on deep learning for atmospheric correction of satellite images, Archives of Computational Methods in Engineering (2025) 1–31
2025
-
[26]
Hochreiter, J
S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (1997) 1735–1780
1997
-
[27]
D.F.Kandamali,E.Porter,W.M.Porter,A.McLemore,D.O.Kiobia, A.P.Tavandashti,G.C.Rains, Hybridlstmmethodformultistepsoil moisture prediction using historical soil moisture and weather data, AgriEngineering 7 (2025) 260
2025
-
[28]
J.Li,D.Hong,L.Gao,J.Yao,K.Zheng,B.Zhang,J.Chanussot,Deep learninginmultimodalremotesensingdatafusion:Acomprehensive review, International Journal of Applied Earth Observation and Geoinformation 112 (2022) 102926
2022
-
[29]
J. Liu, Z. Hao, J. Ding, Y. Zhang, Z. Miao, Y. Zheng, A. Alimu, H. Cheng, X. Li, Ensemble machine-learning-based framework for estimatingsurfacesoilmoistureusingSentinel-1/2data:Acasestudy of an arid oasis in China, Land 13 (2024) 1635
2024
-
[30]
Chatenoux, J.-P
B. Chatenoux, J.-P. Richard, D. Small, C. Roeoesli, V. Wingate, C. Poussin, D. Rodila, P. Peduzzi, C. Steinmeier, C. Ginzler, A. Pso- mas, M. E. Schaepman, G. Giuliani, The Swiss data cube, analysis readydataarchiveusingearthobservationsofSwitzerland, Scientific Data 8 (2021) 295
2021
-
[31]
Australia, Digital earth australia (2024)
G. Australia, Digital earth australia (2024). A. Canovas-Rodriguez et al.:Preprint submitted to ElsevierPage 15 of 15
2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.