Predicting disease severity and large-scale spread from coupled severity measurements and imperfect indicators: Application to beet yellows
Pith reviewed 2026-06-26 02:41 UTC · model grok-4.3
The pith
A two-step statistical framework uses indirect indicators like satellite data to predict local disease severity and reconstruct large-scale spread while handling zero inflation and spatio-temporal structure.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that their two-step approach successfully leverages imperfect indirect indicators to predict disease severity and dynamics while accounting for zero inflation and spatio-temporal structure in the data. The first step uses stacked hurdle models based on random forests to make local predictions from indirect indicators; the second step uses a semi-parametric spatio-temporal model for large-scale reconstruction.
What carries the argument
The two-step framework of stacked hurdle random forest models for local severity prediction from indirect indicators, followed by a semi-parametric spatio-temporal model for large-scale reconstruction.
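As a rough illustration of the hurdle idea, the sketch below combines a presence/absence component with a severity-given-presence component by multiplying them. This is one common hurdle formulation and is assumed here; the paper's actual stacking procedure is not described in this summary. The function name `hurdle_predict` and the two toy predictors are hypothetical stand-ins for fitted random forests.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def hurdle_predict(
    p_present: Callable[[Features], float],
    severity_if_present: Callable[[Features], float],
    x: Features,
) -> float:
    """Expected severity = P(severity > 0 | x) * E[severity | severity > 0, x]."""
    p = min(max(p_present(x), 0.0), 1.0)  # clamp to a valid probability
    return p * max(severity_if_present(x), 0.0)  # severity cannot be negative


# Hypothetical stand-ins for the two fitted models (not the authors' forests).
def toy_classifier(x: Features) -> float:
    # Assumption: a low vegetation-index value suggests disease is present.
    return 0.9 if x[0] < 0.5 else 0.1


def toy_regressor(x: Features) -> float:
    # Assumption: severity given presence decreases with the index value.
    return 1.0 - x[0]


pred_low = hurdle_predict(toy_classifier, toy_regressor, [0.3])
pred_high = hurdle_predict(toy_classifier, toy_regressor, [0.8])
```

Separating the two components lets the many zero-severity observations be handled by the classifier, so the regressor only has to learn from plots where disease was actually observed.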
If this is right
- Local severity can be predicted even with sparse direct data by using the indirect indicators.
- Large-scale disease dynamics can be reconstructed over space and time from the local predictions.
- The method is modular and generic, applicable to different diseases and indicator types.
- It specifically addresses zero inflation and spatio-temporal dependencies in the observations.
Where Pith is reading between the lines
- This approach might enable more frequent monitoring in resource-limited settings by supplementing field data with remote sensing.
- The framework could be tested on human or animal diseases where similar indirect indicators are available.
- Extensions might include uncertainty quantification in the predictions to improve reliability of the reconstructions.
Load-bearing premise
The indirect indicators contain sufficient signal to allow the stacked hurdle random forest models to produce usable local severity predictions that can then be fed into the spatio-temporal reconstruction step.
What would settle it
A field validation campaign that measures actual disease severity in locations where the model predicts high or low values and finds no agreement between predictions and measurements would falsify the central claim.
Original abstract
Whether in human, animal, or plant health, effective disease management requires the ability to characterize disease dynamics across space and time. In this context, integrating indirect indicators with broad spatio-temporal coverage, even when they are noisy, can provide valuable complementary information to direct measurements, which are often sparse because they are more costly or intrusive to collect. In this article, we propose a statistical framework to leverage such indirect indicators to predict disease severity at the individual or local-scale level and reconstruct large-scale disease dynamics. This two-step approach is able to account for the specific characteristics of disease severity observations, including zero inflation and spatio-temporal structure. The first step relies on a stacked hurdle model based on multiple random forests to locally predict disease severity from the available indirect indicators. In the second step a semi-parametric spatio-temporal model is used to reconstruct large-scale epidemiological dynamics over space and time from the indicators-based predictions. The proposed methodology is designed to be both generic and modular, and is illustrated by a case study in plant health. This case study focuses on the monitoring of sugar beet yellows disease in France between 2019 and 2023 by combining sparse field measurements and satellite-based remote sensing data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a two-step statistical framework to predict local disease severity from imperfect indirect indicators (e.g., satellite remote sensing) while accounting for zero inflation, then reconstruct large-scale spatio-temporal disease dynamics. Step 1 uses stacked hurdle random forest models for local predictions; Step 2 feeds those predictions into a semi-parametric spatio-temporal model. The approach is presented as generic and modular and is illustrated via a case study on sugar beet yellows monitoring in France (2019–2023) that combines sparse field measurements with satellite data.
Significance. If the two-step procedure can be shown to produce usable local predictions whose errors do not materially distort the reconstructed dynamics, the framework would offer a practical way to fuse noisy broad-coverage indicators with sparse direct observations in plant and animal epidemiology. The explicit handling of zero inflation and the modular design are clear strengths; however, the absence of reported quantitative validation metrics (prediction accuracy, cross-validation scores, or comparison against direct measurements) in the abstract leaves the practical performance unassessed.
major comments (2)
- [Method (two-step procedure)] The manuscript provides no description of how prediction uncertainty or bias from the first-step stacked hurdle random forests is propagated into the second-step semi-parametric spatio-temporal model. Treating the RF outputs as if they were observed data risks feeding forward spatially structured residuals arising from imperfect satellite indicators or unmodeled factors, which could bias or overstate the reconstructed large-scale dynamics.
- [Abstract and case-study description] The abstract and method outline claim that the stacked hurdle random forests produce usable local severity predictions, yet no quantitative performance metrics, cross-validation results, or comparison against held-out field measurements are referenced. Without such evidence the central claim that the two-step approach successfully accounts for the characteristics of disease severity observations cannot be evaluated.
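To make the request for quantitative evidence concrete, the sketch below computes two metrics that would typically be reported for a hurdle model on held-out data: AUC for the presence/absence component and RMSE for the severity predictions. All numbers are invented for illustration and do not come from the paper; the function names are hypothetical.

```python
import math


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Invented held-out field plots: observed severity and model outputs.
observed = [0.0, 0.0, 0.2, 0.5, 0.0, 0.8]
p_present = [0.1, 0.65, 0.7, 0.9, 0.2, 0.6]  # predicted P(severity > 0)
predicted = [0.0, 0.1, 0.3, 0.4, 0.0, 0.6]   # predicted severity

presence = [1 if y > 0 else 0 for y in observed]
hurdle_auc = auc(presence, p_present)
severity_rmse = rmse(observed, predicted)
```

In a spatio-temporal setting, the held-out plots would ideally be chosen by region or season rather than at random, since neighbouring plots are correlated and random splits tend to flatter the model.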
minor comments (2)
- [Step 1 description] Clarify the exact stacking procedure and how the hurdle components are combined across the multiple random forests.
- [Step 2 description] Specify the form of the semi-parametric spatio-temporal model (basis functions, covariance structure, or software implementation).
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. The comments highlight important aspects of uncertainty handling and the need for explicit validation evidence. We address each point below and commit to revisions that strengthen the manuscript without altering its core claims.
- Referee: [Method (two-step procedure)] The manuscript provides no description of how prediction uncertainty or bias from the first-step stacked hurdle random forests is propagated into the second-step semi-parametric spatio-temporal model. Treating the RF outputs as if they were observed data risks feeding forward spatially structured residuals arising from imperfect satellite indicators or unmodeled factors, which could bias or overstate the reconstructed large-scale dynamics.
  Authors: We agree that the manuscript does not describe propagation of prediction uncertainty or bias from the stacked hurdle random forests into the semi-parametric spatio-temporal model; the first-step outputs are used as fixed inputs. This is a genuine methodological limitation that could allow spatially structured residuals to influence the large-scale reconstruction. In the revised manuscript we will add an explicit discussion of this issue in the Methods and Discussion sections, together with a sensitivity analysis that perturbs the first-step predictions within their observed error ranges and re-runs the second-step model to quantify downstream effects on the reconstructed dynamics.
  Revision: yes
- Referee: [Abstract and case-study description] The abstract and method outline claim that the stacked hurdle random forests produce usable local severity predictions, yet no quantitative performance metrics, cross-validation results, or comparison against held-out field measurements are referenced. Without such evidence the central claim that the two-step approach successfully accounts for the characteristics of disease severity observations cannot be evaluated.
  Authors: The case-study section of the full manuscript reports cross-validation results, prediction accuracy metrics, and direct comparisons against held-out field measurements for the stacked hurdle random forest models. These quantitative results are not summarized in the abstract. We will revise the abstract to include the key performance figures (e.g., cross-validated AUC for the hurdle component and RMSE for severity predictions) so that the claim of usable local predictions is immediately supported by evidence.
  Revision: yes
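The sensitivity analysis promised in the first response could look roughly like the sketch below: perturb the first-step predictions, re-run the second step on each perturbed copy, and summarize how much the reconstruction varies per site. Several pieces are assumptions made for illustration: the Gaussian noise with a fixed standard deviation, the 0-to-1 severity scale, the one-dimensional transect, and above all the centered moving average, which is only a stand-in for the paper's semi-parametric spatio-temporal model.

```python
import random
import statistics


def moving_average(values, window=3):
    """Stand-in for the second-step model: a centered moving average."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def sensitivity(first_step, error_sd, n_draws=200, seed=0):
    """Perturb first-step predictions, re-run step two, return per-site SD."""
    rng = random.Random(seed)  # fixed seed so the analysis is reproducible
    runs = []
    for _ in range(n_draws):
        # Add noise, then clip back to the assumed 0..1 severity scale.
        perturbed = [min(1.0, max(0.0, v + rng.gauss(0.0, error_sd)))
                     for v in first_step]
        runs.append(moving_average(perturbed))
    # Transpose runs so each tuple holds all draws for one site.
    return [statistics.stdev(site) for site in zip(*runs)]


# Invented first-step severity predictions along a transect.
first_step = [0.0, 0.05, 0.2, 0.6, 0.7, 0.4, 0.1]
baseline = moving_average(first_step)
spread = sensitivity(first_step, error_sd=0.1)
```

Sites where `spread` is large relative to `baseline` are those where first-step error would most affect the reconstructed dynamics. Independent noise per site is the simplest choice; spatially correlated perturbations would be closer to the referee's concern about structured residuals.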
Circularity Check
No significant circularity; standard two-step predictive pipeline on external data
The described methodology trains stacked hurdle random forests on indirect indicators (e.g., satellite data) to generate local severity predictions, then feeds those outputs into a separate semi-parametric spatio-temporal model for large-scale reconstruction. This is a conventional data-driven workflow with no self-definitional steps, no fitted parameters renamed as predictions, and no load-bearing self-citations or uniqueness theorems that reduce the central claim to its own inputs. The approach remains self-contained against external benchmarks and does not exhibit any of the enumerated circularity patterns.