On-Orbit Real-Time Wildfire Detection Under On-Board Constraints
Pith reviewed 2026-05-08 13:32 UTC · model grok-4.3
The pith
Dense masked autoencoding pretraining produces sub-megabyte models that detect sub-pixel wildfires from single-band MWIR satellite imagery at real-time speeds on orbit.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DenseMAE pretraining enables compact downstream models on the latency-accuracy Pareto frontier: the fastest SSL-pretrained model reaches 0.640 test AP and 0.69 event-level Fire-F1 at 65.34 ms per batch and 0.52 MB engine size without pruning. The best configuration achieves 0.699 AP and 0.744 Fire-F1 below 1 MB while outperforming a supervised baseline of 0.650 AP under the same constraints.
What carries the argument
Dense masked autoencoding (DenseMAE) pretraining on proprietary nine-satellite MWIR single-band imagery, which produces dense representations that support lightweight fine-tuned pixel-level fire detectors.
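To make the mechanism concrete, here is a minimal NumPy sketch of the masking-and-reconstruction objective that masked autoencoding builds on. It is not the paper's DenseMAE implementation: the patch size, mask ratio, function names, and the trivial stand-in "reconstruction" (filling hidden pixels with the mean of the visible ones) are all assumptions chosen for illustration, and the image is random data rather than MWIR imagery.

```python
import numpy as np

def random_patch_mask(height, width, patch, mask_ratio, rng):
    """Return a boolean (height, width) mask; True marks hidden pixels.

    Assumes height and width are multiples of `patch`. The image is split
    into non-overlapping patch x patch tiles and a fixed fraction is hidden.
    """
    rows, cols = height // patch, width // patch
    n_tiles = rows * cols
    n_hidden = int(round(mask_ratio * n_tiles))
    hidden = np.zeros(n_tiles, dtype=bool)
    hidden[rng.choice(n_tiles, size=n_hidden, replace=False)] = True
    tile_mask = hidden.reshape(rows, cols)
    # Expand each tile decision to its patch x patch block of pixels.
    return np.repeat(np.repeat(tile_mask, patch, axis=0), patch, axis=1)

def masked_mse(image, reconstruction, mask):
    """Mean squared error computed over hidden pixels only."""
    diff = (reconstruction - image)[mask]
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
image = rng.normal(size=(32, 32))  # stand-in for one single-band tile
mask = random_patch_mask(32, 32, patch=4, mask_ratio=0.75, rng=rng)

# Stand-in "model": fill hidden pixels with the mean of the visible ones.
reconstruction = np.where(mask, image[~mask].mean(), image)
loss = masked_mse(image, reconstruction, mask)
```

In a real pretraining loop a network would replace the stand-in, seeing only the visible pixels and being trained to drive this loss down.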
If this is right
- Models fit the sub-megabyte footprint and sub-150 ms TensorRT FP16 inference budget on an NVIDIA Jetson Xavier NX.
- The full pipeline targets end-to-end alerts under 10 minutes from satellite overpass to ground communication.
- Detection works on uncalibrated single-band MWIR data at 200 m resolution despite frequent sub-pixel fires and extreme class imbalance.
- SSL-pretrained models surpass supervised baselines when both are constrained to the same size and speed limits.
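The constraints listed above can be read as a filter followed by a Pareto selection. The sketch below is one way to express that, under stated assumptions: the `Candidate` type and both helper functions are illustrative names, only the first candidate uses figures reported in the text, and the other two rows are hypothetical values invented to exercise the code.

```python
from dataclasses import dataclass

MAX_ENGINE_MB = 1.0     # "sub-megabyte" footprint from the text
MAX_LATENCY_MS = 150.0  # "sub-150 ms" per-batch budget from the text

@dataclass(frozen=True)
class Candidate:
    name: str
    engine_mb: float
    latency_ms: float
    test_ap: float

def within_budget(c):
    """True if the candidate meets both the size and latency limits."""
    return c.engine_mb < MAX_ENGINE_MB and c.latency_ms < MAX_LATENCY_MS

def pareto_front(candidates):
    """Keep candidates that no other candidate matches or beats on both
    latency and AP while being strictly better on at least one."""
    front = []
    for c in candidates:
        dominated = any(
            o.latency_ms <= c.latency_ms and o.test_ap >= c.test_ap
            and (o.latency_ms < c.latency_ms or o.test_ap > c.test_ap)
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front

candidates = [
    Candidate("fastest_ssl", 0.52, 65.34, 0.640),  # figures reported in the text
    Candidate("hypo_slow", 0.90, 120.0, 0.630),    # HYPOTHETICAL values
    Candidate("hypo_large", 1.40, 100.0, 0.700),   # HYPOTHETICAL, over size limit
]
eligible = [c for c in candidates if within_budget(c)]
front = pareto_front(eligible)
```

Filtering before the Pareto step matters: a model that exceeds the size limit cannot displace an eligible one, however accurate it is.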
Where Pith is reading between the lines
- The same pretraining approach could be tested on other single-band thermal tasks such as volcanic hot-spot monitoring.
- Reduced dependence on ground processing may shorten response times for other time-critical satellite observations.
- Adding limited multi-band inputs could raise accuracy further if the model size budget still permits.
Load-bearing premise
That the proprietary nine-satellite MWIR dataset used for pretraining and testing represents the full range of real operational conditions, and that pixel-level AP and event-level Fire-F1 reliably predict on-orbit performance.
What would settle it
A clear drop in AP or Fire-F1 when the same models are evaluated on independent public MWIR datasets collected from different satellites or geographic regions.
Original abstract
We present a deployed system for on-orbit wildfire detection aboard a nine-satellite commercial thermal infrared constellation, operating under demanding joint constraints: sub-megabyte model footprint, sub-150 ms per-batch TensorRT FP16 inference on an NVIDIA Jetson Xavier NX, and an end-to-end alert pipeline targeting under 10 minutes from satellite overpass to fire event communication. The system operates on uncalibrated mid-wave infrared (MWIR) single-band imagery at 200 m ground sampling distance, where fires frequently appear as sub-pixel or single-pixel thermal anomalies under extreme class imbalance -- challenges not addressed by the contextual thermal-thresholding pipelines (MODIS, VIIRS) that currently dominate operational fire monitoring. We present an empirical study of lightweight dense representation learning for this regime using a proprietary nine-satellite MWIR dataset. We compare dense masked autoencoding (DenseMAE) and a hybrid DenseMAE+EMA (exponential moving average) distillation variant, and evaluate representations via linear probing and full-distribution pixel-level average precision (AP) under extreme class imbalance. DenseMAE pretraining enables compact downstream models on the latency-accuracy Pareto frontier: our fastest SSL-pretrained model achieves 0.640 test AP and 0.69 event-level Fire-F1 with 65.34 ms latency per batch and a 0.52 MB engine, without pruning or compression. The best configuration reaches 0.699 AP and 0.744 Fire-F1 below 1 MB, outperforming a supervised baseline (0.650 AP) under comparable constraints.
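The abstract names an EMA (exponential moving average) distillation variant without giving details. As a generic illustration only, the sketch below shows the usual EMA weight update in which a "teacher" copy of the parameters slowly tracks a "student". The dictionary layout, the function name, and the momentum value are assumptions, not the paper's configuration.

```python
import numpy as np

def ema_update(teacher, student, momentum):
    """Move each teacher array a small step toward the matching student array."""
    return {
        name: momentum * teacher[name] + (1.0 - momentum) * student[name]
        for name in teacher
    }

teacher = {"w": np.zeros(3)}
student = {"w": np.ones(3)}
for _ in range(2):
    teacher = ema_update(teacher, student, momentum=0.9)
# After two steps each teacher weight equals 0.19.
```

A momentum close to 1 makes the teacher change slowly, which is what gives it a smoothing effect relative to the student.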
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a deployed on-orbit wildfire detection system for a nine-satellite MWIR constellation that uses DenseMAE (and DenseMAE+EMA) pretraining to produce compact models meeting sub-1 MB footprint and sub-150 ms TensorRT latency constraints. It reports empirical results on a proprietary dataset showing the best SSL-pretrained model reaching 0.699 pixel-level AP and 0.744 event-level Fire-F1, outperforming a supervised baseline at 0.650 AP, with the fastest variant at 0.640 AP / 0.69 Fire-F1 using a 0.52 MB engine.
Significance. If the empirical claims hold under operational conditions, the work would demonstrate a practical path for self-supervised pretraining to deliver latency-accuracy Pareto-optimal models for real-time on-board thermal anomaly detection, addressing a gap between standard contextual thresholding methods and edge-deployable deep models. The explicit focus on uncalibrated single-band MWIR, extreme imbalance, and end-to-end alert latency is a strength.
major comments (3)
- [Abstract] The central performance claims (0.699 AP / 0.744 Fire-F1 beating supervised baseline at 0.650 AP) rest on a proprietary nine-satellite MWIR dataset whose size, positive-pixel ratio, event counts, train/test partitioning, and safeguards against event overlap between pretraining and downstream evaluation are not reported. Without these statistics it is impossible to determine whether the 0.049 AP delta reflects DenseMAE or dataset-specific artifacts, leakage, or evaluation choices.
- [Abstract] The supervised baseline comparison provides no information on whether the baseline uses the identical backbone family, the same data augmentations, or the same training schedule as the SSL-pretrained models. This omission prevents attribution of the reported gains specifically to DenseMAE pretraining rather than other experimental factors.
- [Abstract] No description is given of how pixel-level AP is computed under the stated extreme class imbalance (e.g., whether it uses standard COCO-style AP, any spatial weighting, or per-event aggregation), nor whether spatial autocorrelation in 200 m GSD MWIR imagery was mitigated; this directly affects interpretability of the 0.640–0.699 AP numbers.
minor comments (2)
- [Abstract] The abstract refers to 'linear probing and full-distribution pixel-level average precision' without defining the probing protocol or the exact AP implementation used.
- [Abstract] The latency figure (65.34 ms per batch) should specify batch size and whether it includes preprocessing or post-processing steps in the end-to-end pipeline.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We address each major comment point-by-point below, indicating where we will revise the manuscript for improved clarity and completeness.
- Referee: [Abstract] The central performance claims (0.699 AP / 0.744 Fire-F1 beating supervised baseline at 0.650 AP) rest on a proprietary nine-satellite MWIR dataset whose size, positive-pixel ratio, event counts, train/test partitioning, and safeguards against event overlap between pretraining and downstream evaluation are not reported. Without these statistics it is impossible to determine whether the 0.049 AP delta reflects DenseMAE or dataset-specific artifacts, leakage, or evaluation choices.
  Authors: We agree that greater transparency on dataset characteristics would aid interpretability. Due to the proprietary nature of the nine-satellite MWIR dataset, we cannot release exact numerical values for size, positive-pixel ratio, or event counts. However, we will revise the manuscript to add a dedicated paragraph in the Dataset subsection describing key properties (extreme imbalance with fire pixels <<0.1%, multi-year coverage across nine satellites), the partitioning approach (temporal splits by overpass date combined with spatial separation to eliminate event overlap), and explicit safeguards (no shared fire events or imagery between the pretraining corpus and the downstream evaluation set). These additions provide the necessary context to assess potential leakage or artifacts while respecting confidentiality.
  revision: partial
- Referee: [Abstract] The supervised baseline comparison provides no information on whether the baseline uses the identical backbone family, the same data augmentations, or the same training schedule as the SSL-pretrained models. This omission prevents attribution of the reported gains specifically to DenseMAE pretraining rather than other experimental factors.
  Authors: We apologize for the insufficient detail in the current draft. The supervised baseline uses the identical lightweight backbone architecture, the same MWIR-specific data augmentations (random horizontal/vertical flips, rotations, and intensity scaling), and the exact same training schedule, optimizer, and hyperparameters as the fine-tuning stage of the SSL models. These equivalences are stated in Section 4 but will be made explicit with a cross-reference in the revised Results and a brief clarification in the abstract to ensure the performance delta can be attributed to DenseMAE pretraining.
  revision: yes
- Referee: [Abstract] No description is given of how pixel-level AP is computed under the stated extreme class imbalance (e.g., whether it uses standard COCO-style AP, any spatial weighting, or per-event aggregation), nor whether spatial autocorrelation in 200 m GSD MWIR imagery was mitigated; this directly affects interpretability of the 0.640–0.699 AP numbers.
  Authors: We will expand the Evaluation Metrics section with a clear description: pixel-level AP follows the standard COCO-style computation of average precision from the precision-recall curve, treating every pixel as an independent instance with no spatial weighting or per-event aggregation (the latter is handled separately via event-level Fire-F1 using connected components). No explicit correction for spatial autocorrelation at 200 m GSD was applied, consistent with standard practice in dense prediction benchmarks; we will note this limitation and its implications for interpretation in the revised text.
  revision: yes
- Not resolved by the rebuttal: exact numerical statistics for the proprietary dataset (size, positive-pixel ratio, event counts), which the authors withhold due to confidentiality constraints.
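The two metrics discussed in the rebuttal can be sketched as follows. This is an illustration under assumptions, not the paper's evaluation code: the AP here is the plain step-wise (non-interpolated) average of precision at each positive pixel rather than a COCO 101-point interpolation, components are 8-connected, and the event-matching rule (any pixel overlap counts as a match) is a guess at what "event-level Fire-F1 using connected components" could mean. All function names are hypothetical.

```python
from collections import deque
import numpy as np

def pixel_average_precision(scores, labels):
    """Non-interpolated AP with every pixel treated as one instance."""
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels, dtype=bool).ravel()
    order = np.argsort(-scores, kind="stable")  # highest score first
    hits = labels[order]
    precision = np.cumsum(hits) / np.arange(1, hits.size + 1)
    n_pos = int(hits.sum())
    return float(precision[hits].sum() / n_pos) if n_pos else float("nan")

def connected_components(mask):
    """Label 8-connected True regions; returns (label image, count)."""
    mask = np.asarray(mask, dtype=bool)
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue
        count += 1
        labels[start] = count
        queue = deque([start])
        while queue:  # breadth-first flood fill from the seed pixel
            r, c = queue.popleft()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < mask.shape[0] and 0 <= cc < mask.shape[1]
                            and mask[rr, cc] and not labels[rr, cc]):
                        labels[rr, cc] = count
                        queue.append((rr, cc))
    return labels, count

def event_f1(pred_mask, true_mask):
    """F1 over events, using overlap-based matching (an assumed rule)."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    true_mask = np.asarray(true_mask, dtype=bool)
    true_lab, n_true = connected_components(true_mask)
    pred_lab, n_pred = connected_components(pred_mask)
    found = len(set(true_lab[pred_mask].tolist()) - {0})    # true events hit
    correct = len(set(pred_lab[true_mask].tolist()) - {0})  # predictions on a fire
    recall = found / n_true if n_true else 0.0
    precision = correct / n_pred if n_pred else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Toy 6x6 scene: two single-pixel fires and one false alarm.
truth = np.zeros((6, 6), dtype=bool)
truth[1, 1] = truth[4, 4] = True
scores = np.zeros((6, 6))
scores[1, 1], scores[0, 5], scores[4, 4] = 0.9, 0.8, 0.7

ap = pixel_average_precision(scores, truth)  # (1/1 + 2/3) / 2 = 0.8333...
f1 = event_f1(scores >= 0.75, truth)         # one of two fires found, one false alarm: 0.5
```

The toy scene shows why the two metrics are reported separately: AP ranks all pixels without a threshold, while the event-level score depends on the chosen threshold and on how pixels group into events.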
Circularity Check
No circularity: empirical performance metrics derived from held-out evaluation
The paper presents an empirical study of DenseMAE pretraining on a proprietary MWIR dataset, followed by downstream evaluation via linear probing and pixel-level AP/Fire-F1 on held-out test data. No derivation chain, equations, or first-principles predictions are claimed; reported numbers (e.g., 0.699 AP, 0.744 Fire-F1) are direct measurements from model training and inference under stated constraints. The central claim reduces to experimental results rather than any self-definitional mapping, fitted-input renaming, or self-citation load-bearing step. The proprietary dataset raises external verification concerns but does not create internal circularity in the reported chain.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: MWIR single-band imagery at 200 m GSD contains detectable sub-pixel thermal anomalies corresponding to wildfires.
- Domain assumption: linear probing and pixel-level AP under extreme class imbalance are appropriate proxies for operational fire detection performance.