A Retrospective Benchmark of Spatiotemporal Covariates for Daily Active-Fire Detection in Cerrado Conservation Units
Pith reviewed 2026-06-28 07:36 UTC · model grok-4.3
The pith
A benchmark of nested covariates creates a reproducible reference for ranking atmospheric, surface, spatial and memory features in daily active-fire detection at Cerrado conservation units.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that the complete temporal-memory covariate stage produces the highest mean AUC-PR in temporal cross-validation across all three model families. The full pipeline of satellite labels, constrained pseudo-absences and staged covariates in turn supplies a reproducible reference for comparing atmospheric, surface, static spatial and short-term memory contributions to daily active-fire ranking at conservation-unit scale.
What carries the argument
Four nested stages of spatiotemporal covariates (atmospheric, surface, static spatial, short-term memory) extracted via Google Earth Engine, paired with INPE BDQueimadas labels and MapBiomas-filtered pseudo-absences, evaluated by logistic regression, random forest and XGBoost under five-fold time-series cross-validation and held-out AOI tests.
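The nesting and the temporal split described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the covariate names are hypothetical placeholders (only NDVI and LST are named anywhere in the review), and the splitter is a plain expanding-window scheme that assumes rows are already sorted by date.

```python
# Hypothetical covariate names; the paper's actual variable list is not given here.
STAGE_GROUPS = [
    ("atmospheric", ["air_temp", "rel_humidity", "wind_speed"]),
    ("surface", ["ndvi", "lst"]),
    ("static_spatial", ["elevation", "dist_to_road"]),
    ("short_term_memory", ["fires_prev_7d"]),
]


def nested_stages(groups):
    """Return {stage_number: cumulative column list}; each stage extends the previous one."""
    stages, columns = {}, []
    for number, (_name, new_columns) in enumerate(groups, start=1):
        columns = columns + new_columns  # builds a new list, so stages do not alias
        stages[number] = columns
    return stages


def expanding_window_splits(n_samples, n_splits=5):
    """Yield (train_idx, test_idx) pairs in which training rows always precede test rows.

    Assumes rows are sorted chronologically before indexing.
    """
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    first_test_start = n_samples - n_splits * test_size
    for start in range(first_test_start, n_samples, test_size):
        yield list(range(start)), list(range(start, start + test_size))
```

Fitting each model family on `nested_stages(STAGE_GROUPS)[k]` for k = 1 to 4 over the same folds keeps the stage comparison like for like, since only the covariate set changes between runs.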
If this is right
- Adding short-term memory covariates raises mean AUC-PR in temporal cross-validation for logistic regression, random forest and XGBoost.
- Random forest reaches its highest held-out AUC-PR at the third covariate stage in both tested conservation units.
- XGBoost score maps show high-recall, high-warning-volume behavior in the 1:100 prevalence held-out tests.
- The staged design allows direct isolation of each covariate class's contribution to daily ranking performance.
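The two quantities these points rest on, AUC-PR and the warning volume at a chosen threshold, can be computed as sketched below. This is an illustrative pure-Python version under stated assumptions (binary 0/1 labels, no tied scores); the function names are hypothetical and the paper's own implementation is not described here.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    Assumes labels are 0/1 and scores contain no ties.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_score, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / n_pos


def warning_report(labels, scores, threshold):
    """Count warnings raised at a threshold, with the resulting precision and recall."""
    flagged = [label for label, score in zip(labels, scores) if score >= threshold]
    warnings = len(flagged)
    true_pos = sum(flagged)
    n_pos = sum(labels)
    precision = true_pos / warnings if warnings else 0.0
    recall = true_pos / n_pos if n_pos else 0.0
    return {"warnings": warnings, "precision": precision, "recall": recall}
```

Lowering the threshold in `warning_report` raises recall but also the number of warnings, which is the trade-off behind the high-recall, high-warning-volume pattern noted for XGBoost.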
Where Pith is reading between the lines
- The benchmark could serve as a template for testing whether the same stage-wise gains appear when the same models are applied to other fire-prone regions with different vegetation and climate.
- Score maps from the best stages might be combined with cost or travel-time layers to quantify potential reductions in prevention response times inside conservation units.
- Replacing the retrospective same-day covariates with lagged versions would turn the benchmark into a test of prospective forecasting skill.
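The last point, swapping same-day covariates for lagged ones, amounts to a per-unit date shift. A minimal sketch follows, assuming one row per unit and day; the keys `unit` and `date` and the helper name `add_lagged` are hypothetical and chosen only for illustration.

```python
from datetime import date, timedelta


def add_lagged(rows, covariate, lag_days=1):
    """Attach the covariate value observed `lag_days` earlier at the same unit.

    `rows` is a list of dicts with hypothetical keys "unit", "date" and the covariate.
    Rows with no earlier observation get None, so they can be dropped or imputed.
    The input rows are not modified.
    """
    lookup = {(row["unit"], row["date"]): row[covariate] for row in rows}
    lagged_name = f"{covariate}_lag{lag_days}"
    out = []
    for row in rows:
        earlier = row["date"] - timedelta(days=lag_days)
        out.append({**row, lagged_name: lookup.get((row["unit"], earlier))})
    return out
```

Training on the lagged column alone, and dropping the same-day one, would mean the model only sees information available before the day being scored.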
Load-bearing premise
Constrained pseudo-absences created by same-year land-cover filtering represent true non-fire locations without systematic bias that would change model ranking performance on actual fire events.
What would settle it
If independently sampled real non-fire locations produce substantially different AUC-PR rankings or stage-wise performance orderings than the land-cover filtered pseudo-absences, the benchmark's covariate comparisons would be invalidated.
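One way to make "substantially different orderings" concrete is a rank correlation between the per-stage AUC-PR values obtained under the two negative-sampling schemes. The sketch below uses Kendall's tau-a as one reasonable choice; the paper does not prescribe this statistic, and the helper names are hypothetical.

```python
from itertools import combinations


def kendall_tau(a, b):
    """Kendall tau-a between two equal-length score lists (assumes no tied values)."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two lists of equal length with at least two entries")
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs


def stage_order(scores):
    """Stage numbers (1-based) sorted from best to worst score."""
    return sorted(range(1, len(scores) + 1), key=lambda s: scores[s - 1], reverse=True)
```

With made-up placeholder values (not results from the paper) such as `[0.10, 0.20, 0.30, 0.40]` for the filtered scheme and `[0.10, 0.30, 0.20, 0.40]` for an independent one, `kendall_tau` gives 4/6 and `stage_order` shows Stages 2 and 3 swapping places. A tau near 1 would mean the stage ordering survives the change of negatives.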
Original abstract
Wildfires threaten biodiversity, carbon stocks, and management capacity in the Brazilian Cerrado, where Conservation Units and their official buffer zones must allocate prevention resources under a strong dry-season fire regime. This work develops a retrospective daily active-fire detection benchmark for the Cerrado portion of Minas Gerais, Brazil, using INPE BDQueimadas reference satellite labels (AQUA_M-T), constrained pseudo absences with same-year MapBiomas Collection 9 land-cover filtering, and four nested covariate stages extracted through Google Earth Engine. Logistic Regression, Random Forest, and XGBoost are evaluated under five-fold time-series cross-validation on a global training base and on independent imbalanced test sets spatially held out to Parque Estadual do Pau Furado and Parque Estadual da Serra do Cabral with their official buffer zones. AUC-PR is the primary metric, with AUC-ROC, threshold precision and recall, SHAP explanations, and retrospective score maps used as complementary diagnostics. Temporal cross-validation showed the highest mean AUC-PR at the complete temporal-memory stage for all three model families. Held-out AOI tests were weaker under the stricter 1:100 prevalence design: Random Forest peaked at Stage 3 in both AOIs, while XGBoost maps exposed high-recall, high-warning-volume behavior. The resulting baseline provides a reproducible reference for comparing atmospheric, surface, static spatial, and short-term memory covariates in daily CU-scale active-fire detection ranking. Because several stages use same-day covariates, the study is a retrospective classification benchmark rather than a prospective forecast.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper develops a retrospective daily active-fire detection benchmark for Cerrado Conservation Units in Minas Gerais, Brazil. It combines INPE BDQueimadas (AQUA_M-T) reference labels with constrained pseudo-absences filtered by same-year MapBiomas Collection 9 land-cover classes, extracts four nested covariate stages (atmospheric, surface, static spatial, short-term memory) via Google Earth Engine, and evaluates Logistic Regression, Random Forest, and XGBoost under five-fold time-series cross-validation plus spatially held-out tests in two parks (Parque Estadual do Pau Furado and Parque Estadual da Serra do Cabral) at 1:100 prevalence. AUC-PR is the primary metric, supplemented by AUC-ROC, threshold metrics, SHAP values, and score maps. Temporal CV favors the full temporal-memory stage for all models; held-out tests are weaker, with Random Forest peaking at Stage 3. The work positions itself as a reproducible reference for ranking covariate utility in daily CU-scale detection rather than a prospective forecast.
Significance. If the pseudo-absence construction proves unbiased, the benchmark supplies a useful, publicly reproducible reference for comparing the incremental value of atmospheric, surface, static, and memory covariates in fire-detection models for conservation units. Credit is due for the reliance on external public data sources, time-series cross-validation, multiple complementary metrics including AUC-PR on imbalanced held-out sets, and the explicit framing as retrospective classification. The practical focus on resource allocation under dry-season regimes in Brazilian CUs is well motivated.
major comments (1)
- [Methods (pseudo-absence generation and data preparation)] Methods section on pseudo-absence construction: The same-year MapBiomas Collection 9 land-cover filtering used to generate constrained pseudo-absences risks covariate-dependent bias in the negative class. Land-cover classes (e.g., cerrado vs. agriculture) systematically co-vary with the surface (NDVI, LST) and atmospheric predictors successively added in Stages 2–4. Because the central claim is that the benchmark enables reliable ranking of the marginal utility of these stages, any training-set distortion from the filtering can alter apparent performance gains even if the underlying fire process is unchanged. The 1:100 held-out prevalence tests do not correct for this training-set effect. A sensitivity analysis comparing alternative absence sampling schemes (random, different-year, or fire-history-based) is required to establish that the negative samples are exchangeable with true non-fi
minor comments (2)
- [Abstract] Abstract: Lacks any mention of data quality controls, precise definitions of the four covariate stages, or the potential for bias in the pseudo-absence procedure, which leaves the central claim only partially verifiable from the summary.
- [Results (held-out tests)] Results (held-out AOI tests): The observation that Random Forest peaks at Stage 3 rather than the full stage under 1:100 prevalence should be discussed in relation to the overall recommendation of the complete temporal-memory stage as the reference benchmark.
Simulated Author's Rebuttal
We thank the referee for their thorough review and constructive feedback on our manuscript. We address the major comment regarding pseudo-absence construction below.
Referee: Methods section on pseudo-absence construction: The same-year MapBiomas Collection 9 land-cover filtering used to generate constrained pseudo-absences risks covariate-dependent bias in the negative class. Land-cover classes (e.g., cerrado vs. agriculture) systematically co-vary with the surface (NDVI, LST) and atmospheric predictors successively added in Stages 2–4. Because the central claim is that the benchmark enables reliable ranking of the marginal utility of these stages, any training-set distortion from the filtering can alter apparent performance gains even if the underlying fire process is unchanged. The 1:100 held-out prevalence tests do not correct for this training-set effect. A sensitivity analysis comparing alternative absence sampling schemes (random, different-year, or fire-history-based) is required to establish that the negative samples are exchangeable with true non-fi
Authors: We agree that the land-cover filtering introduces a potential source of bias that could affect the apparent incremental value of the covariate stages. The filtering was chosen to generate more realistic pseudo-absences by restricting to land-cover classes where fires are ecologically plausible, using same-year data to align with the reference labels. However, to address the concern about training-set distortion, we will conduct the suggested sensitivity analysis by comparing the current scheme with random sampling within the study area and different-year MapBiomas filtering. This will be added to the Methods and Results sections in the revised manuscript, including updated stage rankings under alternative schemes. revision: yes
Circularity Check
Empirical benchmark relies on external data and held-out tests; no derivation reduces to inputs
The manuscript describes a standard retrospective ML classification benchmark. Covariate stages are extracted from public GEE sources; models (LR, RF, XGBoost) are trained under time-series CV and evaluated on spatially held-out AOIs using AUC-PR and related metrics computed directly on reference labels (INPE BDQueimadas) and MapBiomas-filtered pseudo-absences. No equations, first-principles derivations, or 'predictions' are claimed that reduce by construction to fitted parameters or self-referential definitions. Performance numbers are independent of any internal model definition. Self-citations (if present) are not invoked to justify uniqueness or forbid alternatives. The central output is a reproducible reference ranking, not a closed mathematical result.
Axiom & Free-Parameter Ledger
free parameters (1)
- prevalence ratio in held-out tests = 1:100
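To show what this parameter implies, the sketch below draws negatives at a fixed ratio per positive and reports the chance-level AUC-PR, which for a random ranking is approximately the positive fraction. The function names and the uniform random draw are assumptions for illustration; the paper's held-out negatives are constrained pseudo-absences, not a plain random sample.

```python
import random


def sample_at_prevalence(positives, candidate_negatives, negatives_per_positive=100, seed=0):
    """Draw negatives without replacement so the set has 1:negatives_per_positive prevalence."""
    needed = len(positives) * negatives_per_positive
    if needed > len(candidate_negatives):
        raise ValueError("not enough candidate negatives for the requested ratio")
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    negatives = rng.sample(candidate_negatives, needed)
    return positives, negatives


def chance_level_auc_pr(n_pos, n_neg):
    """Approximate AUC-PR of a random ranking: the fraction of positives."""
    return n_pos / (n_pos + n_neg)
```

At 1:100 the chance level is 1/101, roughly 0.0099, so held-out AUC-PR values that look small in absolute terms are best read against that baseline and not against 0.5.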
axioms (2)
- domain assumption: Constrained pseudo-absences with same-year MapBiomas land-cover filtering represent unbiased non-fire locations
- domain assumption: Time-series cross-validation properly accounts for temporal autocorrelation in daily fire data