A Retrospective Benchmark of Spatiotemporal Covariates for Daily Active-Fire Detection in Cerrado Conservation Units

Alexandre Luis Magalhães Levada; Fredy João Valente; Juliano Eleno Silva Pádua

arxiv: 2606.04170 · v1 · pith:JSNLYXOAnew · submitted 2026-06-02 · 📊 stat.AP

Pith reviewed 2026-06-28 07:36 UTC · model grok-4.3

classification 📊 stat.AP

keywords active-fire detection · Cerrado conservation units · spatiotemporal covariates · retrospective benchmark · machine learning · AUC-PR · pseudo-absences · time-series cross-validation

The pith

A benchmark of nested covariates creates a reproducible reference for ranking atmospheric, surface, spatial and memory features in daily active-fire detection at Cerrado conservation units.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs a retrospective daily active-fire detection benchmark for conservation units and buffers in the Cerrado portion of Minas Gerais using INPE satellite labels and land-cover filtered pseudo-absences. It evaluates logistic regression, random forest and XGBoost across four nested covariate stages drawn from Google Earth Engine under time-series cross-validation and spatially held-out tests at two parks. AUC-PR rises with the addition of short-term memory covariates in cross-validation for all model families, while held-out results show random forest peaking at stage three and XGBoost favoring higher recall at the cost of more warnings. The work supplies a standard, reproducible setup for comparing how different covariate classes contribute to ranking potential fire locations on imbalanced daily data. Because several covariates are same-day, the benchmark remains a classification reference rather than a forward forecast.

Core claim

The paper establishes that the complete temporal-memory covariate stage produces the highest mean AUC-PR in temporal cross-validation across all three model families, while the full pipeline of satellite labels, constrained pseudo-absences and staged covariates supplies a reproducible reference for comparing atmospheric, surface, static spatial and short-term memory contributions in daily conservation-unit scale active-fire ranking.

What carries the argument

Four nested stages of spatiotemporal covariates (atmospheric, surface, static spatial, short-term memory) extracted via Google Earth Engine, paired with INPE BDQueimadas labels and MapBiomas-filtered pseudo-absences, evaluated by logistic regression, random forest and XGBoost under five-fold time-series cross-validation and held-out AOI tests.
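
As a rough illustration of the evaluation loop described above, the sketch below scores four nested covariate stages with forward-chaining temporal folds and average precision, a common estimator of AUC-PR. Everything in it is an assumption made for illustration: the data are synthetic, each covariate class is reduced to a single column, the stage names are hypothetical, and only logistic regression is shown. It is not the authors' pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_rows = 3000

# Synthetic stand-in covariates, one column per hypothetical covariate class.
# Rows are assumed to be sorted by date, which TimeSeriesSplit relies on.
X = rng.normal(size=(n_rows, 4))
logit = -3.0 + 0.8 * X[:, 0] + 0.6 * X[:, 1] + 0.4 * X[:, 2] + 1.0 * X[:, 3]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# Nested stages: each stage keeps the previous columns and adds one class.
STAGES = {
    "stage1_atmospheric": [0],
    "stage2_plus_surface": [0, 1],
    "stage3_plus_static_spatial": [0, 1, 2],
    "stage4_plus_memory": [0, 1, 2, 3],
}


def mean_auc_pr(columns, n_splits=5):
    """Mean average precision over forward-chaining temporal folds."""
    scores = []
    for train_idx, test_idx in TimeSeriesSplit(n_splits=n_splits).split(X):
        model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
        model.fit(X[train_idx][:, columns], y[train_idx])
        proba = model.predict_proba(X[test_idx][:, columns])[:, 1]
        scores.append(average_precision_score(y[test_idx], proba))
    return float(np.mean(scores))


results = {name: mean_auc_pr(cols) for name, cols in STAGES.items()}
for name, score in results.items():
    print(f"{name}: mean AUC-PR = {score:.3f}")
```

Because every fold trains only on rows that precede its test rows, a stage-to-stage change in the mean score reflects the added covariate class rather than leakage from future dates.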

If this is right

Adding short-term memory covariates raises mean AUC-PR in temporal cross-validation for logistic regression, random forest and XGBoost.
Random forest reaches its highest held-out AUC-PR at the third covariate stage in both tested conservation units.
XGBoost maps exhibit higher recall but generate larger warning volumes than the other models in the 1:100 prevalence held-out tests.
The staged design allows direct isolation of each covariate class's contribution to daily ranking performance.
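
The recall versus warning-volume tradeoff mentioned above can be made concrete with a small threshold sweep. The sketch below is a minimal, dependency-free illustration on toy scores; the function name `warning_tradeoff` and the numbers are hypothetical and are not taken from the paper.

```python
def warning_tradeoff(y_true, scores, thresholds):
    """Return (threshold, recall, precision, n_warnings) for each threshold."""
    total_pos = sum(y_true)
    rows = []
    for t in thresholds:
        flagged = [s >= t for s in scores]
        n_warnings = sum(flagged)  # cells a manager would have to inspect
        true_pos = sum(1 for f, y in zip(flagged, y_true) if f and y == 1)
        recall = true_pos / total_pos
        precision = true_pos / n_warnings if n_warnings else float("nan")
        rows.append((t, recall, precision, n_warnings))
    return rows


# Toy example: 2 fire cells among 10 grid cells.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.45, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]
table = warning_tradeoff(y_true, scores, thresholds=[0.5, 0.35])
for row in table:
    print(row)
```

In this toy case, lowering the threshold from 0.5 to 0.35 doubles recall and also doubles the number of warnings, which is the kind of behavior the held-out XGBoost maps are described as showing.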

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The benchmark could serve as a template for testing whether the same stage-wise gains appear when the same models are applied to other fire-prone regions with different vegetation and climate.
Score maps from the best stages might be combined with cost or travel-time layers to quantify potential reductions in prevention response times inside conservation units.
Replacing the retrospective same-day covariates with lagged versions would turn the benchmark into a test of prospective forecasting skill.
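
One way the lagged-covariate idea above might look in practice is sketched below with pandas. The column names (`cell_id`, `date`, `lst_c`) and the helper `add_lagged` are hypothetical, and the row-based shift assumes exactly one row per cell per day with no missing dates.

```python
import pandas as pd

# Hypothetical daily table: one row per grid cell and date.
df = pd.DataFrame({
    "cell_id": ["a", "a", "a", "b", "b", "b"],
    "date": pd.to_datetime(["2024-09-11", "2024-09-12", "2024-09-13"] * 2),
    "lst_c": [31.0, 33.0, 35.0, 28.0, 29.0, 30.0],
})


def add_lagged(frame, columns, lag_days=1):
    """Add <col>_lag<k> columns holding the value observed k rows (days) earlier."""
    out = frame.sort_values(["cell_id", "date"]).copy()
    for col in columns:
        # Shift within each cell so values never cross cell boundaries.
        out[f"{col}_lag{lag_days}"] = out.groupby("cell_id")[col].shift(lag_days)
    return out


lagged = add_lagged(df, ["lst_c"])
print(lagged)
```

For a prospective test, the same-day column would then be dropped before training, so that each prediction uses only information available on the previous day.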

Load-bearing premise

Constrained pseudo-absences created by same-year land-cover filtering represent true non-fire locations without systematic bias that would change model ranking performance on actual fire events.
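
To make the premise easier to probe, the sketch below shows one plausible shape for constrained pseudo-absence sampling: candidates whose same-year land-cover class is on an exclusion list are dropped, and the remainder is sampled at a fixed ratio to the positives. The function name, the class labels, and the exclusion list are all hypothetical; the paper's actual MapBiomas classes and sampling rules are not given on this page.

```python
import random


def sample_pseudo_absences(candidates, land_cover, excluded_classes,
                           n_positives, ratio, seed=0):
    """Sample (cell_id, year) pseudo-absences outside excluded land-cover classes.

    candidates: list of (cell_id, year) pairs
    land_cover: dict mapping (cell_id, year) to a class label for that same year
    """
    eligible = [
        (cell_id, year)
        for cell_id, year in candidates
        if land_cover[(cell_id, year)] not in excluded_classes
    ]
    needed = n_positives * ratio
    if needed > len(eligible):
        raise ValueError("not enough eligible candidates for the requested ratio")
    return random.Random(seed).sample(eligible, needed)


# Hypothetical labels: even-numbered cells are "natural", odd ones "urban".
candidates = [(f"c{i}", 2024) for i in range(40)]
land_cover = {(f"c{i}", 2024): ("natural" if i % 2 == 0 else "urban")
              for i in range(40)}
absences = sample_pseudo_absences(
    candidates, land_cover, excluded_classes={"urban", "water"},
    n_positives=2, ratio=5,
)
print(len(absences))
```

Swapping the `excluded_classes` set, or bypassing the filter entirely, gives alternative negative sets against which the stage orderings could be re-checked.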

What would settle it

If independently sampled real non-fire locations produce substantially different AUC-PR rankings or stage-wise performance orderings than the land-cover filtered pseudo-absences, the benchmark's covariate comparisons would be invalidated.

Figures

Figures reproduced from arXiv: 2606.04170 by Alexandre Luis Magalhães Levada, Fredy João Valente, Juliano Eleno Silva Pádua.

**Figure 1.** Cerrado-MG study area. Panel (a) shows Minas Gerais in Brazil; panel (b) shows the Cerrado footprint inside Minas Gerais; panel (c) shows Copernicus DEM clipped to the extraction mask. Two state CUs define the independent AOI test geography: Parque Estadual do Pau Furado, in Uberlândia, Minas Gerais, and Parque Estadual da Serra do Cabral, in Augusto de Lima, Minas Gerais. Each CU is paired with its offici…

**Figure 2.** Held-out AOIs. Each panel locates the AOI inside Cerrado-MG and shows Copernicus DEM over the CU and…

**Figure 3.** Dynamic MapBiomas Collection 9 exclusion mask by year. Excluded pixels are removed from same-year…

**Figure 4.** Positive-label screening after land-cover filtering. Red points are retained; blue points intersect excluded…

**Figure 5.** Fire-season diagnostics for BDQueimadas detections inside Cerrado-MG. Panels show monthly counts and…

**Figure 6.** Fire-density contrast between the May through October fire season and the remaining months inside Cerrado…

**Figure 7.** Technical workflow for the retrospective daily Cerrado active-fire detection evaluation.

**Figure 8.** Temporal-validation metrics by ablation stage and model.

**Figure 9.** Stage 4 precision-recall curves on pooled temporal-validation predictions.

**Figure 10.** Stage 4 ROC curves on pooled temporal-validation predictions.

**Figure 11.** Stage 4 threshold sensitivity for Precision and Recall. Each panel summarizes the five temporal-validation folds with the fold mean and a 95 percent confidence interval. The curves show the Precision-Recall tradeoff beyond the fixed 0.5 diagnostic cutoff. Panels: (a) Logistic Regression, (b) Random Forest, (c) XGBoost.

**Figure 12.** Stage 4 temporal-validation SHAP summaries for (a) Logistic Regression, (b) Random Forest, and (c) XGBoost.
5.3 Independent AOI test performance. The independent AOI tests evaluate spatial transfer under a stricter 1:100 positive to pseudo-absence sampling ratio than the global training folds. This design fixes the AOI no-skill AUC-PR baseline near 0.010. Serra do Cabral is the primary AOI transfer reading…

**Figure 13.** Serra do Cabral AOI metrics by stage and model (262 positives, 26,200 pseudo absences).

**Figure 14.** Pau Furado AOI metrics by stage and model (6 positives, 600 pseudo absences).

**Figure 15.** Fixed-threshold readings expanded across the full probability-score threshold range for both AOIs. The Serra do Cabral row is the more stable transfer diagnostic, while the Pau Furado row should be read as low-support stress-test behavior. Panels: (a) Serra do Cabral, Logistic Regression (b) Serra do Cabral, Random Forest (c) Serra do Cabral, XGBoost (d) Pau Furado, Logistic Regression (e) Pau Furado, Random…

**Figure 16.** Stage 4 Pau Furado AOI SHAP summaries for (a) Logistic Regression, (b) Random Forest, and (c) XGBoost. For Pau Furado, the small positive support makes the SHAP plots primarily diagnostic of model behavior on the sampled AOI point set. For Serra do Cabral, the larger test set supports a more stable comparison of AOI-specific explanatory profiles. Random Forest produced the highest Stage 4 AOI AUC-PR, whil…

**Figure 17.** Stage 4 Serra do Cabral AOI SHAP summaries for (a) Logistic Regression, (b) Random Forest, and (c) XGBoost.
5.5 Operational-style retrospective maps. The final output is a retrospective diagnostic simulation over dense 500 m AOI grids. The main text reports one high-fire daily heatmap for each AOI. The selected examples are Pau Furado on 18 September 2019 and Serra do Cabral on 13 September 2024. Both figu…

**Figure 18.** Pau Furado Stage 4 operational-style diagnostic heatmap for 18 September 2019. Panels share the same grid and date.

**Figure 19.** Serra do Cabral Stage 4 operational-style diagnostic heatmap for 13 September 2024. Panels share the same grid and date.
6 Discussion. The results support four main findings. First, feature fusion improves temporal validation in the Cerrado-MG training domain when atmospheric, surface, static spatial, and temporal-memory covariates are added in a row-aligned sequence. Second, transfer to held-out CU and bu…

Original abstract

Wildfires threaten biodiversity, carbon stocks, and management capacity in the Brazilian Cerrado, where Conservation Units and their official buffer zones must allocate prevention resources under a strong dry-season fire regime. This work develops a retrospective daily active-fire detection benchmark for the Cerrado portion of Minas Gerais, Brazil, using INPE BDQueimadas reference satellite labels (AQUA_M-T), constrained pseudo absences with same-year MapBiomas Collection 9 land-cover filtering, and four nested covariate stages extracted through Google Earth Engine. Logistic Regression, Random Forest, and XGBoost are evaluated under five-fold time-series cross-validation on a global training base and on independent imbalanced test sets spatially held out to Parque Estadual do Pau Furado and Parque Estadual da Serra do Cabral with their official buffer zones. AUC-PR is the primary metric, with AUC-ROC, threshold precision and recall, SHAP explanations, and retrospective score maps used as complementary diagnostics. Temporal cross-validation showed the highest mean AUC-PR at the complete temporal-memory stage for all three model families. Held-out AOI tests were weaker under the stricter 1:100 prevalence design: Random Forest peaked at Stage 3 in both AOIs, while XGBoost maps exposed high-recall, high-warning-volume behavior. The resulting baseline provides a reproducible reference for comparing atmospheric, surface, static spatial, and short-term memory covariates in daily CU-scale active-fire detection ranking. Because several stages use same-day covariates, the study is a retrospective classification benchmark rather than a prospective forecast.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper delivers a new benchmark for covariate ranking in Cerrado fire detection with held-out AOI tests, but the pseudo-absence sampling risks biasing the stage comparisons.

The main thing here is a new retrospective benchmark for daily active-fire detection in Cerrado conservation units, built on INPE labels, same-year MapBiomas filtering for pseudo-absences, four nested covariate stages from Google Earth Engine, and held-out tests on two specific parks at 1:100 prevalence.

It does several things solidly. The nested stages allow direct comparison of atmospheric, surface, static spatial, and short-term memory covariates. Pairing time-series cross-validation with an AUC-PR focus fits the imbalanced daily setting, and the held-out AOI results (Random Forest peaking at Stage 3), together with SHAP and score maps, give concrete diagnostics. The protocol is reproducible enough to serve as a reference for similar work.

The soft spot is the negative class construction. Filtering candidate absences with same-year MapBiomas land-cover classes can correlate with the predictors added in later stages, such as NDVI or precipitation differences between cerrado and agriculture. This could alter the apparent marginal value of stages 2-4 even if the underlying fire process stays the same. The 1:100 held-out tests do not correct for training-set distortion, and the abstract gives no details on data quality checks or bias mitigation. That leaves the reliability of the baseline ranking only partially verifiable.

This is for applied remote sensing and conservation researchers working on fire management in the Cerrado or similar biomes. A reader looking for a practical, reproducible reference on covariate staging would find it useful. The evaluation setup shows honest engagement with the literature and data constraints, so it deserves a serious referee to check the sampling details and see whether the stage rankings hold.

Referee Report

1 major / 2 minor

Summary. The paper develops a retrospective daily active-fire detection benchmark for Cerrado Conservation Units in Minas Gerais, Brazil. It combines INPE BDQueimadas (AQUA_M-T) reference labels with constrained pseudo-absences filtered by same-year MapBiomas Collection 9 land-cover classes, extracts four nested covariate stages (atmospheric, surface, static spatial, short-term memory) via Google Earth Engine, and evaluates Logistic Regression, Random Forest, and XGBoost under five-fold time-series cross-validation plus spatially held-out tests in two parks (Parque Estadual do Pau Furado and Parque Estadual da Serra do Cabral) at 1:100 prevalence. AUC-PR is the primary metric, supplemented by AUC-ROC, threshold metrics, SHAP values, and score maps. Temporal CV favors the full temporal-memory stage for all models; held-out tests are weaker, with Random Forest peaking at Stage 3. The work positions itself as a reproducible reference for ranking covariate utility in daily CU-scale detection rather than a prospective forecast.

Significance. If the pseudo-absence construction proves unbiased, the benchmark supplies a useful, publicly reproducible reference for comparing the incremental value of atmospheric, surface, static, and memory covariates in fire-detection models for conservation units. Credit is due for reliance on external public data sources, time-series cross-validation, multiple complementary metrics including AUC-PR on imbalanced held-out sets, and explicit framing as retrospective classification. The practical focus on resource allocation under dry-season regimes in Brazilian CUs is well-motivated.

major comments (1)

[Methods (pseudo-absence generation and data preparation)] Methods section on pseudo-absence construction: The same-year MapBiomas Collection 9 land-cover filtering used to generate constrained pseudo-absences risks covariate-dependent bias in the negative class. Land-cover classes (e.g., cerrado vs. agriculture) systematically co-vary with the surface (NDVI, LST) and atmospheric predictors successively added in Stages 2–4. Because the central claim is that the benchmark enables reliable ranking of the marginal utility of these stages, any training-set distortion from the filtering can alter apparent performance gains even if the underlying fire process is unchanged. The 1:100 held-out prevalence tests do not correct for this training-set effect. A sensitivity analysis comparing alternative absence sampling schemes (random, different-year, or fire-history-based) is required to establish that the negative samples are exchangeable with true non-fi

minor comments (2)

[Abstract] Abstract: Lacks any mention of data quality controls, precise definitions of the four covariate stages, or the potential for bias in the pseudo-absence procedure, which leaves the central claim only partially verifiable from the summary.
[Results (held-out tests)] Results (held-out AOI tests): The observation that Random Forest peaks at Stage 3 rather than the full stage under 1:100 prevalence should be discussed in relation to the overall recommendation of the complete temporal-memory stage as the reference benchmark.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their thorough review and constructive feedback on our manuscript. We address the major comment regarding pseudo-absence construction below.

Referee: Methods section on pseudo-absence construction: The same-year MapBiomas Collection 9 land-cover filtering used to generate constrained pseudo-absences risks covariate-dependent bias in the negative class. Land-cover classes (e.g., cerrado vs. agriculture) systematically co-vary with the surface (NDVI, LST) and atmospheric predictors successively added in Stages 2–4. Because the central claim is that the benchmark enables reliable ranking of the marginal utility of these stages, any training-set distortion from the filtering can alter apparent performance gains even if the underlying fire process is unchanged. The 1:100 held-out prevalence tests do not correct for this training-set effect. A sensitivity analysis comparing alternative absence sampling schemes (random, different-year, or fire-history-based) is required to establish that the negative samples are exchangeable with true non-fi

Authors: We agree that the land-cover filtering introduces a potential source of bias that could affect the apparent incremental value of the covariate stages. The filtering was chosen to generate more realistic pseudo-absences by restricting to land-cover classes where fires are ecologically plausible, using same-year data to align with the reference labels. However, to address the concern about training-set distortion, we will conduct the suggested sensitivity analysis by comparing the current scheme with random sampling within the study area and different-year MapBiomas filtering. This will be added to the Methods and Results sections in the revised manuscript, including updated stage rankings under alternative schemes.

Revision: yes

Circularity Check

0 steps flagged

Empirical benchmark relies on external data and held-out tests; no derivation reduces to inputs

The manuscript describes a standard retrospective ML classification benchmark. Covariate stages are extracted from public GEE sources; models (LR, RF, XGBoost) are trained under time-series CV and evaluated on spatially held-out AOIs using AUC-PR and related metrics computed directly on reference labels (INPE BDQueimadas) and MapBiomas-filtered pseudo-absences. No equations, first-principles derivations, or 'predictions' are claimed that reduce by construction to fitted parameters or self-referential definitions. Performance numbers are independent of any internal model definition. Self-citations (if present) are not invoked to justify uniqueness or forbid alternatives. The central output is a reproducible reference ranking, not a closed mathematical result.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on the validity of pseudo-absence generation via land-cover filtering and the appropriateness of the four nested covariate stages for capturing fire-relevant signals; these are domain assumptions rather than derived quantities.

free parameters (1)

prevalence ratio in held-out tests = 1:100
Stricter 1:100 prevalence design used to simulate real-world imbalance in AOI tests
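
The practical effect of the 1:100 ratio is easiest to see through the no-skill AUC-PR baseline, which for a random ranker equals the positive prevalence. The short check below uses the positive and pseudo-absence counts quoted in the Figure 13 and Figure 14 captions; the helper name `no_skill_auc_pr` is hypothetical.

```python
def no_skill_auc_pr(n_pos, n_neg):
    """Expected AUC-PR of a random ranker: the positive prevalence."""
    return n_pos / (n_pos + n_neg)


serra_do_cabral = no_skill_auc_pr(262, 26_200)
pau_furado = no_skill_auc_pr(6, 600)
print(f"Serra do Cabral: {serra_do_cabral:.4f}")
print(f"Pau Furado:      {pau_furado:.4f}")
```

Both AOIs share the same baseline of 1/101, about 0.0099, which matches the near-0.010 figure quoted for the AOI tests. Held-out AUC-PR values are therefore best read relative to roughly 0.01 rather than to the much higher prevalence of the training folds.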

axioms (2)

domain assumption: Constrained pseudo-absences with same-year MapBiomas land-cover filtering represent unbiased non-fire locations
Core step for creating training data for all models
domain assumption: Time-series cross-validation properly accounts for temporal autocorrelation in daily fire data
Used as primary evaluation protocol



doi: 10.1016/j. jsames.2025.105366. David Gunning, Mark Stefik, Jaesik Choi, Timothy Miller, Simone Stumpf, and Guang-Zhong Yang. XAI—explainable artificial intelligence.Science Robotics, 4(37):eaay7120,

work page doi:10.1016/j 2025

[10] [10]

Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi

doi: 10.1126/scirobotics.aay7120. Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. A survey of methods for explaining black box models.ACM Computing Surveys, 51(5):93:1–93:42,

work page doi:10.1126/scirobotics.aay7120

[11] [11]

doi: 10.1145/3236009. Scott M. Lundberg and Su-In Lee. A unified approach to interpreting model predictions. InAdvances in Neural Information Processing Systems 30, volume 30, pages 4768–4777,

work page doi:10.1145/3236009

[12] [12]

Morgane Barbet-Massin, Walter Jetz, Wilfried Thuiller, and Niklaus E

doi: 10.1071/WF20134. Morgane Barbet-Massin, Walter Jetz, Wilfried Thuiller, and Niklaus E. Zimmermann. Selecting pseudo-absences for species distribution models: how, where and how many?Methods in Ecology and Evolution, 3(2):327–338,

work page doi:10.1071/wf20134

[13] [13]

doi: 10.1111/j.2041-210X.2011.00172.x. David W. Hosmer, Stanley Lemeshow, and Rodney X. Sturdivant.Applied Logistic Regression. Wiley, 3 edition,

work page doi:10.1111/j.2041-210x.2011.00172.x 2041

[14] [14]

2013, Applied Logistic Regression: Third Edition (wiley), doi: 10.1002/9781118548387

doi: 10.1002/9781118548387. Leo Breiman. Random forests.Machine Learning, 45(1):5–32,

work page doi:10.1002/9781118548387

[15] [15]

Random forests,

doi: 10.1023/A:1010933404324. Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. InProceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794. ACM,

work page doi:10.1023/a:1010933404324

[16] [16]

Xgboost: A scalable tree boosting system

doi: 10.1145/2939672.2939785. Charles E. Van Wagner. Development and structure of the Canadian forest fire weather index system. Forestry Technical Report 35, Canadian Forestry Service, Petawawa National Forestry Institute, Chalk River, ON,

work page doi:10.1145/2939672.2939785

[17] [17]

The relationship between precision-recall and roc curves

ACM. doi: 10.1145/1143844.1143874. 25 PREPRINT- JUNE4, 2026 Takaya Saito and Marc Rehmsmeier. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets.PLOS ONE, 10(3):e0118432,

work page doi:10.1145/1143844.1143874 2026

[18] [18]

Rafael H

doi: 10.1371/journal.pone.0118432. Rafael H. M. Pereira and Rogério Jerônimo Barbosa.geobr: Download Official Spatial Data Sets of Brazil,

work page doi:10.1371/journal.pone.0118432

[19] [19]

Louis Giglio, Wilfrid Schroeder, and Christopher O

doi: 10.1016/S0034-4257(03) 00184-6. Louis Giglio, Wilfrid Schroeder, and Christopher O. Justice. The collection 6 MODIS active fire detection algorithm and fire products.Remote Sensing of Environment, 178:31–41,

work page doi:10.1016/s0034-4257(03

[20] [20]

Renata Libonati, Carlos C

doi: 10.1016/j.rse.2016.02.054. Renata Libonati, Carlos C. DaCamara, Alberto W. Setzer, Fabiano Morelli, and Arturo E. Melchiori. An algorithm for burned area detection in the Brazilian Cerrado using 8-micron MODIS imagery.Remote Sensing, 7(11):15782–15803,

work page doi:10.1016/j.rse.2016.02.054 2016

[21] [21]

doi: 10.3390/rs71115782. Julia A. Rodrigues, Renata Libonati, Allan A. Pereira, Joana M. P. Nogueira, Filippe L. M. Santos, Leonardo F. Peres, and Alberto W. Setzer. How well do global burned area products represent fire patterns in the Brazilian Savanna biome? an accuracy assessment of the MCD64 collections.International Journal of Applied Earth Observat...

work page doi:10.3390/rs71115782

[22] [22]

Noel Gorelick, Matt Hancher, Mike Dixon, Simon Ilyushchenko, David Thau, and Rebecca Moore

doi: 10.1016/j.jag.2019.02.010. Noel Gorelick, Matt Hancher, Mike Dixon, Simon Ilyushchenko, David Thau, and Rebecca Moore. Google Earth Engine: Planetary-scale geospatial analysis for everyone.Remote Sensing of Environment, 202:18–27,

work page doi:10.1016/j.jag.2019.02.010 2019

[23] [23]

Gorelick, M

doi: 10.1016/j.rse.2017.06.031. Stijn Hantson, Marc Padilla, Davide Corti, and Emilio Chuvieco. Strengths and weaknesses of MODIS hotspots to characterize global fire occurrence.Remote Sensing of Environment, 131:152–159,

work page doi:10.1016/j.rse.2017.06.031 2017

[24] [24]

doi: 10.1016/j.rse.2012.12

work page doi:10.1016/j.rse.2012.12 2012

[25] [25]

doi: 10.5194/essd-11-529-2019. Mark C. de Jong, Martin J. Wooster, Karl Kitchen, Cathy Manley, Rob Gazzard, and Frank F. McCall. Calibration and evaluation of the Canadian forest fire weather index (FWI) system for improved wildland fire danger rating in the United Kingdom.Natural Hazards and Earth System Sciences, 16(5):1217–1237,

work page doi:10.5194/essd-11-529-2019 2019

[26] [26]

Rosane B

doi: 10.5194/ nhess-16-1217-2016. Rosane B. L. Cavalcante, Bruno M. Souza, Silvio J. Ramos, Markus Gastauer, Wilson R. Nascimento Junior, Cecílio F. Caldeira, and Pedro W. M. Souza-Filho. Assessment of fire hazard weather indices in the eastern Amazon: a case study for different land uses.Acta Amazonica, 51(4):352–362,

2016

[27] [27]

Aleida Yadira Vilchis-Francés, Carlos Díaz-Delgado, Rocío Becerril Piña, Carlos Alberto Mastachi Loza, Miguel Ángel Gómez-Albores, and Khalidou M

doi: 10.1590/1809-4392202101172. Aleida Yadira Vilchis-Francés, Carlos Díaz-Delgado, Rocío Becerril Piña, Carlos Alberto Mastachi Loza, Miguel Ángel Gómez-Albores, and Khalidou M. Bâ. Daily prediction modeling of forest fire ignition using meteorological drought indices in the Mexican highlands.iForest - Biogeosciences and Forestry, 14:437–446,

work page doi:10.1590/1809-4392202101172

[28] [28]

Xiang Hou, Zhiwei Wu, Shihao Zhu, Zhengjie Li, and Shun Li

doi: 10.1016/j.foreco.2021.119897. Xiang Hou, Zhiwei Wu, Shihao Zhu, Zhengjie Li, and Shun Li. Comparative analysis of machine learning-based predictive models for fine dead fuel moisture of subtropical forest in China.Forests, 15(5):736,

work page doi:10.1016/j.foreco.2021.119897 2021

[29] [29]

doi: 10.3390/f15050736. 26

work page doi:10.3390/f15050736