INLA-RF: A Hybrid Modeling Strategy for Spatio-Temporal Environmental Data
Pith reviewed 2026-05-19 02:37 UTC · model grok-4.3
The pith
A hybrid INLA-RF strategy combines Bayesian spatio-temporal models with random forests to improve predictions while propagating uncertainty between stages.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish that INLA-RF1, which treats random-forest predictions as an offset inside the INLA-SPDE linear predictor, and INLA-RF2, which lets the forest directly adjust selected nodes of the latent field, both allow uncertainty to flow from the first stage to the second. A Kullback-Leibler divergence criterion decides when the iterative loop has converged. Two simulation studies demonstrate that these hybrids deliver higher predictive accuracy for spatio-temporal processes than standalone INLA-SPDE or random forest while preserving calibration of posterior intervals.
What carries the argument
The INLA-RF iterative two-stage framework, in which random-forest output is either inserted as an offset or used to correct latent-field nodes inside an INLA-SPDE model, with a Kullback-Leibler divergence stopping rule that controls iteration.
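The shape of the iterative two-stage loop can be sketched with deliberately simple stand-ins. This is an illustration only, not the authors' algorithm: ordinary least squares stands in for the INLA-SPDE stage, a binned-mean fit stands in for the random forest, the loop stops on the largest change in the offset rather than on the Kullback-Leibler rule, and no uncertainty is propagated. All function names (fit_line, fit_bins, two_stage_fit) and the toy data are hypothetical.

```python
import random

def fit_line(x, y):
    """Least-squares fit of y = a + b*x (stand-in for the INLA-SPDE stage)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b

def fit_bins(z, r, n_bins=10):
    """Piecewise-constant fit of r on z in [0, 1) (stand-in for the random forest)."""
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for zi, ri in zip(z, r):
        k = min(int(zi * n_bins), n_bins - 1)
        sums[k] += ri
        counts[k] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return lambda zi: means[min(int(zi * n_bins), n_bins - 1)]

def two_stage_fit(x, z, y, max_iter=50, tol=1e-6):
    """Alternate the two stages, feeding stage-B predictions back as an offset."""
    offset = [0.0] * len(y)
    for it in range(1, max_iter + 1):
        # Stage A: fit the structured part to the data with the offset removed.
        a, b = fit_line(x, [yi - oi for yi, oi in zip(y, offset)])
        # Stage B: fit the flexible learner to what stage A leaves unexplained.
        g = fit_bins(z, [yi - (a + b * xi) for xi, yi in zip(x, y)])
        new_offset = [g(zi) for zi in z]
        change = max(abs(n - o) for n, o in zip(new_offset, offset))
        offset = new_offset
        if change < tol:  # simple tolerance; the paper uses a KL-based rule
            break
    return a, b, g, it

# Toy data: linear trend in x plus an abrupt jump in z at 0.5.
rng = random.Random(0)
x = [rng.random() for _ in range(500)]
z = [rng.random() for _ in range(500)]
y = [1 + 2 * xi + (3 if zi >= 0.5 else 0) + rng.gauss(0, 0.1)
     for xi, zi in zip(x, z)]
a, b, g, n_iter = two_stage_fit(x, z, y)
```

In this toy setting the linear stage should recover the slope while the piecewise-constant stage picks up the jump; the split of the constant term between the two stages is not identified, so only differences in g are meaningful.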
If this is right
- Spatio-temporal environmental forecasts become more accurate without sacrificing the ability to report credible intervals.
- Modelers can retain the interpretability of a Bayesian latent-field description while borrowing the flexibility of tree-based learners for non-linear effects.
- The same two-stage structure can be applied to other Bayesian geostatistical models that admit offsets or node-wise corrections.
- A Kullback-Leibler stopping rule provides an objective, data-driven way to terminate the hybrid iteration.
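How a Kullback-Leibler stopping rule might look in code is sketched here under explicit assumptions. This page does not state which distributions the paper compares, in which direction, or with what threshold, so the choice of univariate Gaussian marginal summaries, the averaging over nodes, and the tolerance value are all hypothetical; only the closed-form KL divergence between two Gaussians is standard.

```python
import math

def kl_gaussian(mean_p, sd_p, mean_q, sd_q):
    """KL(p || q) for p = N(mean_p, sd_p^2) and q = N(mean_q, sd_q^2)."""
    return (math.log(sd_q / sd_p)
            + (sd_p ** 2 + (mean_p - mean_q) ** 2) / (2 * sd_q ** 2)
            - 0.5)

def has_converged(prev, curr, tol=1e-3):
    """prev, curr: lists of (mean, sd) summaries from successive iterations.

    Assumption: stop when the average KL(current || previous) over all
    summaries falls below tol.
    """
    total = sum(kl_gaussian(mc, sc, mp, sp)
                for (mc, sc), (mp, sp) in zip(curr, prev))
    return total / len(curr) < tol
```

For example, kl_gaussian(0, 1, 1, 1) evaluates to 0.5, and two identical summaries give exactly 0, so has_converged returns True only once successive iterations stop moving the marginals.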
Where Pith is reading between the lines
- The approach could be extended to real-world air-quality or climate datasets where ground-truth values are only partially observed.
- Similar hybrids might be tested in domains such as epidemiology or ecology where both spatial structure and abrupt local changes matter.
- If the uncertainty propagation holds, the method could serve as a drop-in replacement for pure machine-learning models that currently lack calibrated intervals.
Load-bearing premise
That feeding random-forest predictions into the INLA-SPDE model as an offset or as node corrections produces statistically valid uncertainty propagation between the two stages.
What would settle it
A simulation in which the hybrid model's 95 percent credible intervals cover the true held-out values at a rate materially below 95 percent would falsify the claim of coherent uncertainty quantification.
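The check described above reduces to counting how often held-out truths fall inside their intervals. A minimal sketch follows; the function names are hypothetical, and the z-score uses a plain binomial approximation that assumes independent held-out points, which spatio-temporal data may violate.

```python
def empirical_coverage(lower, upper, truth):
    """Fraction of true values lying inside their [lower, upper] interval."""
    hits = sum(lo <= t <= hi for lo, hi, t in zip(lower, upper, truth))
    return hits / len(truth)

def coverage_z_score(coverage, n, nominal=0.95):
    """Standardised gap between observed and nominal coverage (binomial approx.)."""
    return (coverage - nominal) / (nominal * (1 - nominal) / n) ** 0.5

# Four held-out points, two of which fall inside their intervals.
cov = empirical_coverage([0, 0, 0, 0], [1, 1, 1, 1], [0.5, 0.5, 2.0, -1.0])
```

A strongly negative z-score on a large held-out set would be the kind of evidence that counts as coverage "materially below 95 percent".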
Original abstract
Environmental processes often exhibit complex, non-linear patterns and discontinuities across space and time, posing significant challenges for traditional geostatistical modeling approaches. In this paper, we propose a hybrid spatio-temporal modeling framework that combines the interpretability and uncertainty quantification of Bayesian models (estimated using the INLA-SPDE approach) with the predictive power and flexibility of Random Forest (RF). Specifically, we introduce two novel algorithms, collectively named INLA-RF, which integrate a statistical spatio-temporal model with RF in an iterative two-stage framework. The first algorithm (INLA-RF1) incorporates RF predictions as an offset in the INLA-SPDE model, while the second (INLA-RF2) uses RF to directly correct selected latent field nodes. Both hybrid strategies enable uncertainty propagation between modeling stages, an aspect often overlooked in existing hybrid approaches. In addition, we propose a Kullback-Leibler divergence-based stopping criterion. We evaluate the predictive performance and uncertainty quantification capabilities of the proposed algorithms through two simulation studies. Results suggest that our hybrid approach enhances spatio-temporal prediction while maintaining interpretability and coherence in uncertainty estimates.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes INLA-RF, a hybrid spatio-temporal modeling framework that integrates the INLA-SPDE approach for Bayesian inference with Random Forest to capture complex non-linear patterns in environmental data. It defines two iterative algorithms: INLA-RF1, which inserts RF predictions as an offset into the INLA-SPDE linear predictor, and INLA-RF2, which overwrites selected nodes of the latent Gaussian field with RF corrections. A Kullback-Leibler divergence criterion is introduced to terminate the iteration. Performance is assessed via two simulation studies, with the central claim that the hybrids improve predictive accuracy while preserving interpretability and coherent uncertainty quantification.
Significance. The work targets a practical gap in spatio-temporal modeling where pure INLA-SPDE struggles with strong non-linearities and discontinuities. If the uncertainty-propagation mechanism can be shown to produce calibrated credible intervals, the approach would supply a usable compromise between the flexibility of tree-based methods and the structured uncertainty of Gaussian Markov random fields. The simulation-based evaluation and the explicit KL stopping rule are constructive elements that could be strengthened by coverage diagnostics.
major comments (2)
- [§3.1] §3.1 (INLA-RF1 algorithm): Treating RF point predictions as a fixed offset conditions the INLA posterior on those values without propagating RF variability into the linear predictor or hyperparameter posteriors. Standard INLA offset semantics treat the term as known; therefore the reported credible intervals are narrower than they would be under a joint model. The simulation studies must report empirical coverage of the nominal 95 % intervals under this construction to substantiate the coherence claim.
- [§3.2] §3.2 (INLA-RF2 algorithm): Directly replacing selected nodes of the latent field with RF values risks breaking the Markov property and the Gaussianity required by the SPDE approximation. The subsequent INLA step then computes marginals for an altered GMRF whose precision matrix is no longer consistent with the original mesh. A diagnostic (e.g., comparison of marginal variances before and after correction, or a small-scale joint-model benchmark) is needed to confirm that the final posteriors remain valid.
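The first major comment can be made concrete with a small Monte Carlo sketch. It assumes, purely for illustration, that the stage-one and stage-two prediction errors are independent Gaussians with known standard deviations; neither the function name nor these settings come from the manuscript. An interval built from the stage-two uncertainty alone (the fixed-offset case) is compared with one that adds both variances.

```python
import random

def simulate_coverage(n=20000, sd_stage1=1.0, sd_stage2=1.0, seed=1):
    """Coverage of nominal 95% intervals with and without stage-one variance."""
    rng = random.Random(seed)
    z = 1.96
    half_fixed = z * sd_stage2                                  # offset treated as known
    half_joint = z * (sd_stage1 ** 2 + sd_stage2 ** 2) ** 0.5   # both variances added
    hits_fixed = hits_joint = 0
    for _ in range(n):
        # Total prediction error is the sum of the two stages' errors.
        err = rng.gauss(0, sd_stage1) + rng.gauss(0, sd_stage2)
        hits_fixed += abs(err) <= half_fixed
        hits_joint += abs(err) <= half_joint
    return hits_fixed / n, hits_joint / n

cov_fixed, cov_joint = simulate_coverage()
```

With equal standard deviations the fixed-offset interval has a theoretical coverage of roughly 83% instead of 95%, which is the kind of shortfall the requested coverage tables would reveal if the iterative scheme does not compensate for it.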
minor comments (3)
- [Simulation studies] The two simulation studies are referenced but their data-generating processes, mesh resolutions, and exact baseline comparators (pure INLA, pure RF, other hybrids) are not tabulated; adding a summary table of design parameters would improve reproducibility.
- [Figures] Figure captions should explicitly state whether plotted intervals are pointwise credible intervals or predictive intervals and should include the nominal coverage level used for calibration checks.
- [Stopping criterion] The KL-divergence stopping rule is introduced without a reference or derivation; a short appendix deriving its form from the iterative scheme would clarify its statistical justification.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which help clarify important aspects of uncertainty quantification in our hybrid framework. We respond to each major comment below and indicate the revisions we will make.
- Referee: [§3.1] §3.1 (INLA-RF1 algorithm): Treating RF point predictions as a fixed offset conditions the INLA posterior on those values without propagating RF variability into the linear predictor or hyperparameter posteriors. Standard INLA offset semantics treat the term as known; therefore the reported credible intervals are narrower than they would be under a joint model. The simulation studies must report empirical coverage of the nominal 95 % intervals under this construction to substantiate the coherence claim.
  Authors: We agree that the use of RF predictions as a fixed offset in INLA-RF1 does not propagate RF variability into the INLA posterior or hyperparameters, and that the resulting credible intervals are therefore narrower than those from a fully joint model. The iterative structure provides limited feedback, but does not fully resolve this issue. To substantiate the coherence claim, we will add empirical coverage diagnostics for the nominal 95% intervals to the simulation studies in the revised manuscript.
  Revision: yes
- Referee: [§3.2] §3.2 (INLA-RF2 algorithm): Directly replacing selected nodes of the latent field with RF values risks breaking the Markov property and the Gaussianity required by the SPDE approximation. The subsequent INLA step then computes marginals for an altered GMRF whose precision matrix is no longer consistent with the original mesh. A diagnostic (e.g., comparison of marginal variances before and after correction, or a small-scale joint-model benchmark) is needed to confirm that the final posteriors remain valid.
  Authors: We recognize that node replacement in INLA-RF2 can affect the Markov property and Gaussianity assumptions of the SPDE approximation, potentially altering the precision matrix. While the corrections are applied selectively, this remains an approximation. We will add the recommended diagnostics, including comparisons of marginal variances before and after correction, to the revised manuscript to confirm posterior validity.
  Revision: yes
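As a reference point for the marginal-variance diagnostic, the standard conditioning result for a Gaussian Markov random field is worth keeping in view. The notation below is introduced here for illustration (A for the free nodes, B for the nodes whose values are fixed); whether the INLA-RF2 correction actually amounts to conditioning in this sense is not stated on this page and is treated as an open assumption.

```latex
% x ~ N(mu, Q^{-1}), partitioned into free nodes A and fixed nodes B
x = \begin{pmatrix} x_A \\ x_B \end{pmatrix}, \qquad
Q = \begin{pmatrix} Q_{AA} & Q_{AB} \\ Q_{BA} & Q_{BB} \end{pmatrix}

% Conditioning on x_B keeps a GMRF on A whose precision is a submatrix of Q
x_A \mid x_B \sim \mathcal{N}\!\left( \mu_A - Q_{AA}^{-1} Q_{AB}\,(x_B - \mu_B),\; Q_{AA}^{-1} \right)

% Conditional marginal variances never exceed the unconditional ones
\operatorname{diag}\!\left( Q_{AA}^{-1} \right) \le \operatorname{diag}\!\left( (Q^{-1})_{AA} \right)
```

If node replacement were a genuine conditioning step, the sparsity pattern among the remaining nodes would be preserved and their marginal variances could only shrink. Variances that grow, or a change in the dependence structure among free nodes, would therefore signal that the correction is doing something other than conditioning.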
Circularity Check
No significant circularity: new hybrid integration method is self-contained
full rationale
The paper proposes two explicit algorithmic constructions (INLA-RF1 using RF predictions as offset; INLA-RF2 correcting selected latent nodes) inside an iterative loop with a KL-divergence stopping rule. These are presented as novel methodological choices built on the established INLA-SPDE and Random Forest frameworks rather than derived from equations that reduce to their own fitted parameters or prior self-citations. No step equates a claimed prediction or uncertainty-propagation property to a quantity defined in terms of itself; the uncertainty-coherence claim is an asserted property of the proposed integration, not a tautological renaming or fitted-input prediction. The derivation chain therefore remains independent of the target results.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: INLA-SPDE provides interpretable Bayesian inference and uncertainty quantification for spatio-temporal processes.
- domain assumption: Random Forest can effectively capture non-linear patterns and discontinuities in environmental data.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, theorem reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear. Passage:
We introduce two novel algorithms, collectively named INLA-RF, which integrate a statistical spatio-temporal model with RF in an iterative two-stage framework. The first algorithm (INLA-RF1) incorporates RF predictions as an offset in the INLA-SPDE model, while the second (INLA-RF2) uses RF to directly correct selected latent field nodes.
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear. Passage:
Both hybrid strategies enable uncertainty propagation between modeling stages... We propose a Kullback-Leibler divergence-based stopping criterion.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.