Recognition: 3 theorem links
Low-Rank Adaptation of Geospatial Foundation Models for Wildfire Mapping Using Sentinel-2 Data
Pith reviewed 2026-05-08 18:21 UTC · model grok-4.3
The pith
Low-Rank Adaptation lets geospatial foundation models map wildfires across regions more accurately than full fine-tuning while changing less than 1% of parameters.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Across experiments on burned-area mapping, Low-Rank Adaptation of the Prithvi-v2 model achieves the highest accuracy and the largest gains over full fine-tuning. It does so while updating less than 1% of the model's parameters and shows superior cross-domain generalization in spatial and temporal tests across US and Canada biomes. The study compares Terramind, DINOv3, and Prithvi-v2 using 3820 wildfire events from 2017-2023.
What carries the argument
Low-Rank Adaptation (LoRA), which adapts the models by training only low-rank update matrices for selected layers instead of all parameters.
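The mechanism can be sketched in a few lines. Below is a minimal NumPy illustration of the LoRA idea, not the paper's implementation; the rank, scaling, and weight shapes are illustrative assumptions:

```python
import numpy as np

class LoRALinear:
    """Frozen dense layer plus a trainable low-rank update (LoRA sketch).

    Forward pass: y = x @ W + (alpha / r) * (x @ A) @ B, where the
    pretrained weight W stays frozen and only the rank-r factors
    A (d_in x r) and B (r x d_out) would be trained.
    """

    def __init__(self, W, r=8, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        d_in, d_out = W.shape
        self.W = W                                  # frozen pretrained weight
        self.A = rng.normal(0.0, 0.01, (d_in, r))   # trainable down-projection
        self.B = np.zeros((r, d_out))               # trainable up-projection, zero init
        self.scale = alpha / r                      # so the update is a no-op at start

    def forward(self, x):
        return x @ self.W + self.scale * (x @ self.A) @ self.B

    def trainable_fraction(self):
        """Share of this layer's parameters that LoRA actually trains."""
        lora = self.A.size + self.B.size
        return lora / (self.W.size + lora)
```

Because `B` is initialized to zero, the adapted layer starts out identical to the frozen pretrained layer, which is one reason LoRA fine-tuning is stable.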
If this is right
- LoRA consistently outperforms full fine-tuning in cross-domain settings for all three models tested.
- Prithvi-v2 with LoRA gives the best accuracy-efficiency trade-off for burned-area mapping.
- Decoder-only fine-tuning proves less effective than LoRA for generalization across regions and times.
- Geospatial foundation models become practical for operational wildfire mapping when paired with parameter-efficient methods like LoRA.
Where Pith is reading between the lines
- Similar lightweight adaptation might extend to other satellite tasks such as flood detection or land-cover change.
- Testing on imagery from additional continents would reveal whether the cross-domain gains hold beyond North America.
- The reduced compute cost could support more frequent map updates as new Sentinel-2 acquisitions arrive.
Load-bearing premise
That the selected 3820 wildfire events from 2017-2023 and the spatial-temporal splits across US and Canada biomes sufficiently capture real-world domain shifts without data leakage or selection bias.
What would settle it
Observing that on a held-out test set from a new biome or later year, the LoRA-adapted Prithvi-v2 no longer shows higher accuracy than the fully fine-tuned version.
Original abstract
Wildfire burned-area mapping is essential for damage assessment, emissions modeling, and understanding fire-climate interactions across diverse ecological regions. Recent geospatial foundation models provide strong general-purpose representations for satellite imagery, yet there is still no clear understanding of how to efficiently adapt these models for downstream Earth observation tasks, particularly under geographic and temporal domain shift. This study evaluates three state-of-the-art Geospatial Foundation Models (GFMs) - Terramind, DINOv3, and Prithvi-v2 - for burned-area mapping across the United States and Canada using Sentinel-2 data. Leveraging 3,820 wildfire events from 2017-2023, we conduct spatial and temporal generalization tests across diverse biomes. We systematically compare full fine-tuning, decoder-only fine-tuning, and Low-Rank Adaptation (LoRA) for adapting each model. Across all experiments, LoRA provides the strongest cross-domain generalization while updating less than 1% of parameters, demonstrating a favorable trade-off between accuracy and efficiency. Prithvi-v2 with LoRA achieves the highest overall accuracy and the largest improvement compared to full fine-tuning. These findings indicate that geospatial foundation models, when adapted using lightweight parameter-efficient methods such as LoRA, offer a robust and scalable solution for large-scale burned-area mapping. Code is available at https://github.com/alishibli97/wildfire-lora-gfm.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper evaluates three geospatial foundation models (Terramind, DINOv3, Prithvi-v2) for burned-area mapping on Sentinel-2 data from 3,820 wildfire events (2017-2023) across US and Canada biomes. It systematically compares full fine-tuning, decoder-only fine-tuning, and LoRA adaptation, claiming that LoRA delivers the strongest cross-domain generalization while updating <1% of parameters and that Prithvi-v2+LoRA attains the highest accuracy with the largest gains relative to full fine-tuning.
Significance. If the empirical results hold under verified splits, the work would establish a clear efficiency-accuracy trade-off for adapting large GFMs to domain-shifted Earth-observation tasks, with direct relevance to scalable wildfire monitoring and emissions modeling. The provision of code further supports reproducibility.
major comments (2)
- [Experimental setup / data partitioning] The headline generalization claim (LoRA strongest cross-domain performance) is load-bearing on the spatial/temporal splits of the 3,820 events. The manuscript must explicitly describe the partitioning procedure (e.g., how events are assigned by geographic coordinates, year, or biome to ensure zero event-level, pixel-level, or seasonal overlap between train and test sets) and report any checks for leakage; without this, the reported accuracy deltas cannot be confidently attributed to adaptation robustness rather than memorization or selection effects.
- [Results and discussion] Results lack specification of the exact evaluation metrics (e.g., IoU, F1-score, overall accuracy), statistical tests for significance of differences, and complete baseline tables including all three adaptation methods for each model. These omissions prevent verification of the claim that Prithvi-v2+LoRA shows the largest improvement over full fine-tuning.
minor comments (2)
- [Abstract / code availability] The code repository link is a positive contribution for reproducibility; ensure the released scripts include the exact train/test split generation code and hyperparameter settings used for each GFM.
- [Methods] Clarify the precise fraction of parameters updated by LoRA for each model (Terramind, DINOv3, Prithvi-v2) and whether rank and alpha were held constant across experiments.
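The fraction the last comment asks for follows from a short calculation. A rough back-of-the-envelope helper, assuming adapters of rank r are attached to square d_model x d_model projections in each transformer block (the dimensions and target count below are illustrative, not the paper's exact accounting):

```python
def lora_param_fraction(d_model, n_layers, r, targets_per_layer=2):
    """Estimate the share of targeted weights that LoRA makes trainable.

    Each adapted d_model x d_model matrix gains two rank-r factors of
    d_model x r parameters each, so the ratio per matrix is 2*r/d_model.
    """
    base = n_layers * targets_per_layer * d_model * d_model      # frozen weights
    lora = n_layers * targets_per_layer * 2 * d_model * r        # adapter weights
    return lora / base
```

For example, rank 8 on 1024-dimensional projections gives 2*8/1024, about 1.6% of the targeted matrices, and a smaller share of the full model once embeddings and MLPs are counted, consistent with the sub-1% figure the paper reports.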
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point-by-point below. Where the manuscript requires clarification or expansion, we will revise accordingly to strengthen the presentation of our experimental setup and results.
Point-by-point responses
-
Referee: [Experimental setup / data partitioning] The headline generalization claim (LoRA strongest cross-domain performance) is load-bearing on the spatial/temporal splits of the 3,820 events. The manuscript must explicitly describe the partitioning procedure (e.g., how events are assigned by geographic coordinates, year, or biome to ensure zero event-level, pixel-level, or seasonal overlap between train and test sets) and report any checks for leakage; without this, the reported accuracy deltas cannot be confidently attributed to adaptation robustness rather than memorization or selection effects.
Authors: We agree that explicit details on the data partitioning are necessary to support the cross-domain generalization claims. The current manuscript mentions spatial and temporal generalization tests across biomes but does not provide a full procedural description. In the revised version, we will add a dedicated subsection (likely in Methods) that specifies: (1) assignment of the 3,820 events by geographic coordinates (e.g., disjoint US/Canada regions or biomes), (2) temporal splits by year ranges to avoid seasonal overlap, and (3) verification steps confirming zero event-level, pixel-level, or seasonal leakage between train and test sets. We will also report the resulting split sizes, biome coverage, and any leakage checks performed. This revision will allow readers to attribute performance differences more confidently to the adaptation methods rather than data artifacts. revision: yes
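The leakage check the referee asks for is mechanical once events carry identifiers, years, and biomes. A minimal sketch of such a partition; the field names (`id`, `year`, `biome`) are assumed here, not the paper's schema:

```python
def split_events(events, held_out_years, held_out_biomes):
    """Split wildfire events into train/test with an event-level leakage check.

    An event goes to the test set if its year or biome is held out;
    everything else goes to train. The final assertion verifies that no
    event id appears on both sides.
    """
    test = [e for e in events
            if e["year"] in held_out_years or e["biome"] in held_out_biomes]
    test_ids = {e["id"] for e in test}
    train = [e for e in events if e["id"] not in test_ids]
    assert test_ids.isdisjoint({e["id"] for e in train}), "event-level leakage"
    return train, test
```

Pixel-level and seasonal leakage need additional checks (e.g. buffered geographic disjointness between tiles), but the event-level guarantee above is the minimum the revised manuscript would need to document.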
-
Referee: [Results and discussion] Results lack specification of the exact evaluation metrics (e.g., IoU, F1-score, overall accuracy), statistical tests for significance of differences, and complete baseline tables including all three adaptation methods for each model. These omissions prevent verification of the claim that Prithvi-v2+LoRA shows the largest improvement over full fine-tuning.
Authors: We acknowledge these omissions limit verifiability. The experiments use Intersection-over-Union (IoU) and F1-score as primary metrics for burned-area segmentation, supplemented by overall accuracy; these will be explicitly stated in a new Evaluation Metrics paragraph in the revised manuscript. We will also add statistical significance testing (e.g., paired t-tests across the 3,820 events or bootstrap confidence intervals) for all reported differences. Finally, we will expand the results tables to present all three adaptation strategies (full fine-tuning, decoder-only fine-tuning, and LoRA) side-by-side for each of the three GFMs, enabling direct comparison and verification of the largest gains for Prithvi-v2+LoRA. These changes will be incorporated without altering the underlying experimental outcomes. revision: yes
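The two primary metrics named in the response have standard binary-mask formulations; a compact sketch (the paper's exact averaging across events may differ):

```python
import numpy as np

def iou_f1(pred, target):
    """IoU and F1 (Dice) for a binary burned-area mask.

    IoU = TP / (TP + FP + FN); F1 = 2*TP / (2*TP + FP + FN).
    Empty-vs-empty masks score 1.0 by convention.
    """
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.logical_and(pred, target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    return float(iou), float(f1)
```

Reporting both is useful because F1 weights true positives more heavily than IoU, so the two can rank models differently on small burn scars.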
Circularity Check
No circularity: claims rest on direct empirical measurements from held-out splits
full rationale
The paper reports experimental results from applying full fine-tuning, decoder-only fine-tuning, and LoRA to three pre-trained geospatial foundation models on a dataset of 3,820 wildfire events (2017-2023) with explicit spatial and temporal splits across US/Canada biomes. Key claims (LoRA strongest cross-domain generalization, Prithvi-v2+LoRA highest accuracy and largest gain vs full fine-tuning) are computed directly from accuracy metrics on those test sets. No mathematical derivation chain exists; no parameters are fitted to a subset and then presented as predictions of related quantities; no self-citations are invoked to justify uniqueness or ansatzes that bear the central result; and no known empirical patterns are renamed as novel derivations. The evaluation is self-contained against the reported data splits and standard adaptation techniques.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The 3,820 wildfire events and Sentinel-2 imagery splits adequately represent geographic and temporal domain shifts for burned-area mapping.
Lean theorems connected to this paper
-
Foundation/AlexanderDuality.lean (8-tick period from 2^D = 8, D = 3) · alexander_duality_circle_linking · tagged unclear
Relation between the paper passage and the cited Recognition theorem is unclear.
Paper passage: "LoRA adapters use rank r = 8 and scaling α = 1.0"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] R. Al-Hasn and R. Almuhammad, "Burned area determination using Sentinel-2 satellite images and the impact of fire on the availability of soil nutrients in Syria," 2022.
- [2] C. Suwanprasit and Shahnawaz, "Mapping burned areas in Thailand using Sentinel-2 imagery and OBIA techniques," Scientific Reports, vol. 14, no. 1, p. 9609, 2024.
- [3] L. Knopp, M. Wieland, M. Rättich, and S. Martinis, "A deep learning approach for burned area segmentation with Sentinel-2 data," Remote Sensing, vol. 12, no. 15, p. 2422, 2020.
- [4] A. Brand and A. Manandhar, "Semantic segmentation of burned areas in satellite images using a U-Net-based convolutional neural network," The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 43, pp. 47–53, 2021.
- [5] T. Sui, Q. Huang, M. Wu, M. Wu, and Z. Zhang, "BiAU-Net: Wildfire burnt area mapping using bi-temporal Sentinel-2 imagery and U-Net with attention mechanism," International Journal of Applied Earth Observation and Geoinformation, vol. 132, p. 104034, 2024.
- [6] A. Anand, R. Imasu, S. K. Dhaka, and P. K. Patra, "Domain adaptation and fine-tuning of a deep learning segmentation model of small agricultural burn area detection using high-resolution Sentinel-2 observations: A case study of Punjab, India," Remote Sensing, vol. 17, no. 6, p. 974, 2025.
- [7] M. Reichstein et al., "Prithvi: Large-scale multimodal FMs for Earth observation," NeurIPS, 2023.
- [8] A. A. Lab, "Terramind: Modality-agnostic geospatial foundation model," CVPR, 2024.
- [9] O. Siméoni, H. V. Vo, M. Seitzer, F. Baldassarre, M. Oquab, C. Jose, V. Khalidov, M. Szafraniec, S. Yi, M. Ramamonjisoa et al., "DINOv3," arXiv preprint arXiv:2508.10104, 2025.
- [10] V. Marsocci, Y. Jia, G. L. Bellier, D. Kerekes, L. Zeng, S. Hafner, S. Gerard, E. Brune, R. Yadav, A. Shibli et al., "PANGAEA: A global and inclusive benchmark for geospatial foundation models," arXiv preprint arXiv:2412.04204, 2024.
- [11] N. Simumba, N. Lehmann, P. Fraccaro, H. Alemohammad, G. De Mel, S. Khan, M. Maskey, N. Longepe, X. X. Zhu, H. Kerner et al., "GEO-Bench-2: From performance to capability, rethinking evaluation in geospatial AI," arXiv preprint arXiv:2511.15658, 2025.
- [12] Z. Han, C. Gao, J. Liu, J. Zhang, and S. Q. Zhang, "Parameter-efficient fine-tuning for large models: A comprehensive survey," arXiv preprint arXiv:2403.14608, 2024.
- [13] Y. Xin, S. Luo, H. Zhou, J. Du, X. Liu, Y. Fan, Q. Li, and Y. Du, "Parameter-efficient fine-tuning for pre-trained vision models: A survey," arXiv e-prints, 2024.
- [14] F. Marti Escofet, B. Blumenstiel, L. Scheibenreif, P. Fraccaro, and K. Schindler, "Fine-tune smarter, not harder: Parameter-efficient fine-tuning for geospatial foundation models," in Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 2025, pp. 516–532.
- [15] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen et al., "LoRA: Low-rank adaptation of large language models," ICLR, 2022.
- [16] T. Xiao, Y. Liu, B. Zhou, Y. Jiang, and J. Sun, "Unified perceptual parsing for scene understanding," in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 418–434.
- [17] J. Eidenshink, B. Schwind, K. Brewer, Z.-L. Zhu, B. Quayle, and S. Howard, "A project for monitoring trends in burn severity," Fire Ecology, vol. 3, no. 1, pp. 3–21, 2007.
- [18] Canadian Forest Service, "National Burned Area Composite (NBAC) — annual burned area polygons," 2023, Government of Canada metadata catalogue.