
arxiv: 2605.20167 · v1 · pith:DTVRA635new · submitted 2026-05-19 · 💻 cs.AI · cs.LG

HaorFloodAlert: Deseasonalized ML Ensemble for 72-Hour Flood Prediction in Bangladesh Haor Wetlands

Pith reviewed 2026-05-20 04:53 UTC · model grok-4.3

keywords flood prediction · machine learning ensemble · Sentinel-1 SAR · haor wetlands · deseasonalization · flash floods · Bangladesh · boro rice

The pith

A deseasonalized machine learning ensemble forecasts 72-hour flood probability in Bangladesh haor wetlands using Sentinel-1 data and an upstream river proxy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a system to predict flash floods in flat haor basins where standard riverine models fail because water behavior differs in these low-gradient areas. It removes temperature as a seasonal artifact that was inflating accuracy by about 7 percentage points since floods cluster in warm months. The approach adds an upstream Sentinel-1 SAR proxy from the Barak River for roughly 36 hours of lead time and validates predictions against real events with Otsu-thresholded labels. Accurate forecasts would help safeguard the annual boro rice harvest from sudden inundation across the roughly 8000 square kilometer Sunamganj Haor region.

Core claim

The operational ensemble (RF at weight 0.5625 plus XGBoost at weight 0.4375) reaches 89.6 percent LOOCV accuracy, 87.5 percent recall, and 0.943 AUC-ROC when tested on 77 real Sentinel-1 observed flood events.
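The weighted combination in this claim can be sketched as simple soft voting. This is a minimal illustration, not the authors' code: the function names and the 0.5 decision threshold are assumptions, and only the two weights come from the paper.

```python
def ensemble_probability(p_rf, p_xgb, w_rf=0.5625, w_xgb=0.4375):
    """Weighted average of two flood probabilities (soft voting).

    The default weights are the ones reported in the paper; everything
    else in this sketch is hypothetical.
    """
    if abs(w_rf + w_xgb - 1.0) > 1e-9:
        raise ValueError("weights should sum to 1")
    return w_rf * p_rf + w_xgb * p_xgb


def is_flood(p_rf, p_xgb, threshold=0.5):
    """Binary call from the blended probability (threshold is an assumption)."""
    return ensemble_probability(p_rf, p_xgb) >= threshold


# Example: RF says 0.8, XGBoost says 0.4.
blended = ensemble_probability(0.8, 0.4)
```

Because the weights sum to 1, the blended value stays in the 0 to 1 range whenever both inputs do.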

What carries the argument

The deseasonalized ensemble that combines temperature-corrected features with an upstream Barak River Sentinel-1 SAR proxy and Otsu-thresholded change detection for label validation.

If this is right

  • The three-tier alert pipeline can issue 72-hour flood probability warnings across the 8,000 km² haor region.
  • A BRRI-calibrated boro rice damage estimator can quantify potential harvest losses tied to predicted floods.
  • Model outputs match Otsu-thresholded SAR labels at 84-91 percent spatial agreement on validation events.
  • The method captures backwater dynamics in flat basins that current riverine flood setups overlook.
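The summary here does not say which metric underlies the 84-91 percent spatial agreement. As a hedged sketch, two common candidates for comparing a predicted flood mask against a reference mask are intersection-over-union and pixel accuracy; the function name and the toy masks below are illustrative assumptions.

```python
import numpy as np


def agreement_metrics(pred_mask, ref_mask):
    """Return IoU and pixel accuracy for two binary flood masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    ref = np.asarray(ref_mask, dtype=bool)
    if pred.shape != ref.shape:
        raise ValueError("masks must have the same shape")
    intersection = np.logical_and(pred, ref).sum()
    union = np.logical_or(pred, ref).sum()
    # Two empty masks agree perfectly, so define IoU as 1.0 in that case.
    iou = intersection / union if union else 1.0
    pixel_accuracy = (pred == ref).mean()
    return {"iou": float(iou), "pixel_accuracy": float(pixel_accuracy)}


# Toy 2x2 example: one pixel of disagreement.
metrics = agreement_metrics([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```

The two metrics can differ a lot when flooded pixels are a small share of the scene, which is why the choice matters for interpreting the reported range.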

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same deseasonalization step could be tested in other monsoon flatland basins to check whether temperature bias appears elsewhere.
  • Adding more than 77 events to the training set might tighten confidence intervals around the reported accuracy.
  • Streaming real-time upstream SAR observations could push the practical warning window past the current 36-hour proxy lead.
  • Linking the damage estimator directly to local farmer advisories might turn probability outputs into actionable harvest protection steps.
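To give a rough sense of how wide the uncertainty is at 77 events, a Wilson score interval can be computed for the reported accuracy. This sketch assumes 69 of 77 events were classified correctly, a count inferred from 89.6 percent rather than stated in the paper, and it treats events as independent, which the referee report questions.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half


# 69/77 is an inferred count (69 / 77 is about 0.896), not a reported one.
low, high = wilson_interval(69, 77)
```

Under these assumptions the interval spans roughly 0.81 to 0.95, so more events would be needed to pin the accuracy down to within a few points.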

Load-bearing premise

Deseasonalization removes only spurious temperature-driven accuracy without discarding genuine predictive signals, and the 77 Sentinel-1 events plus Otsu-thresholded SAR labels form a representative and unbiased validation set for real-world 72-hour forecasts.

What would settle it

A test on 50 or more new independent haor flood events after 2023 where the ensemble accuracy drops below 80 percent or recall falls below 80 percent would show the model does not generalize.

Figures

Figures reproduced from arXiv: 2605.20167 by Fahima Haque Talukder Jely, Md. Samiul Alim, Md. Zakir Hossen, Salma Hoque Talukdar Koli.

Figure 1: Study area: Sunamganj Haor boundary (white), Tanguar Haor (shaded), …
Figure 2: Otsu SAR validation across three events: 2017 major, 2022 moderate, …
Figure 3: Temperature confound before and after correction. (a) Raw tempera…
Figure 4: Normalized confusion matrix, 77-event real-SAR LOOCV. TN=41, …
Figure 5: ROC curves: 77-event real-SAR LOOCV (AUC=0.943), 101-event …
Figure 7: Baseline comparison on 77-event real-SAR LOOCV. The RF+XGB …
Figure 6: 5-fold CV stability (n=101 full dataset). Mean 90.8% ± 5.8% SD; range 80.8–96.3%. LOOCV was used for primary metrics; 5-fold shown for fold-level visualization. This is a supplementary stability check only.
Figure 8: RF + XGBoost ensemble feature importance (131-event decon…
Figure 9: Ablation study (101-event LOOCV, RF+XGB, 8 …
Original abstract

Flash floods in Bangladesh's haor wetlands show up with almost no warning. They wreck the annual boro rice harvest. Current setups, built for riverine floods, miss backwater dynamics entirely. These basins are flat. Water does not behave like it does on the Brahmaputra. We built HaorFloodAlert, a deseasonalized machine learning ensemble that forecasts 72-hour flood probability for the Sunamganj Haor (approximately 8,000 km2). Temperature was acting as a seasonal cheat code - it inflated accuracy by 6.9 pp just because floods happen in warm months. We caught that. We also built an upstream Barak River Sentinel-1 SAR proxy from Silchar, Assam, giving about 36 hours of lead time. Otsu-thresholded SAR change detection validates at 84-91 percent spatial match. The operational ensemble (RF 0.5625 + XGBoost 0.4375) hits 89.6 percent LOOCV accuracy, 87.5 percent recall, and 0.943 AUC-ROC on 77 real Sentinel-1 events. A three-tier alert pipeline and a BRRI-calibrated boro rice damage estimator are included.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The paper presents HaorFloodAlert, a deseasonalized ML ensemble (Random Forest weighted 0.5625 + XGBoost 0.4375) for 72-hour flood probability forecasting in Bangladesh's Sunamganj Haor wetlands. It corrects for temperature-driven seasonal bias (noted as inflating accuracy by 6.9 pp), incorporates an upstream Barak River Sentinel-1 SAR proxy for ~36-hour lead time, validates labels via Otsu thresholding (84-91% spatial match), and reports 89.6% LOOCV accuracy, 87.5% recall, and 0.943 AUC-ROC on 77 real events, plus a three-tier alert pipeline and BRRI-calibrated boro rice damage estimator.

Significance. If the performance claims hold under rigorous validation, the work would offer a targeted advance for flash-flood early warning in flat haor basins where standard riverine models fail. The explicit deseasonalization step, SAR-based proxy, and agricultural damage linkage provide practical value beyond generic ML flood models. Concrete metrics on real Sentinel-1 events and inclusion of an operational pipeline are strengths that could support deployment if temporal dependence is properly addressed.

major comments (3)
  1. [Results / Validation] The headline 89.6% LOOCV accuracy, 87.5% recall, and 0.943 AUC-ROC on 77 Sentinel-1 events (Abstract and Results) rest on an assumption of event exchangeability. Flood events in the haor are strongly autocorrelated in time and space due to monsoon dynamics; standard LOOCV permits leakage across folds. A blocked, spatial, or forward-chaining validation scheme is required to demonstrate true 72-hour out-of-sample skill.
  2. [Methods] The deseasonalization procedure that removes the 6.9 pp temperature inflation is described only at high level (Abstract and Methods). Without the exact transformation, the set of retained features, or ablation results showing that genuine predictive signals are preserved, it is impossible to verify that the correction improves rather than harms the central performance claim.
  3. [Ensemble Construction] The ensemble weights (RF 0.5625 + XGBoost 0.4375) are presented as operational (Abstract) yet appear to have been fitted on the same 77-event Sentinel-1 set used for LOOCV evaluation. No separate hold-out or nested CV for weight selection is mentioned, raising a direct circularity concern for the reported metrics.
minor comments (3)
  1. [Abstract] The 84-91% spatial match for Otsu-thresholded SAR labels is stated without the precise metric (e.g., IoU, pixel accuracy) or the number of validation images, limiting interpretability.
  2. [Methods] A table listing all input features, their sources, and the exact deseasonalization formula would improve reproducibility and allow readers to assess the temperature correction.
  3. [Discussion] The manuscript would benefit from explicit comparison to at least one existing hydrological or remote-sensing flood model for the same region to contextualize the reported gains.
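The forward-chaining scheme requested in major comment 1 can be sketched as an expanding-window split over event dates. This is an illustrative outline only: the function name, the `min_train` default, and the optional `gap_days` embargo are assumptions, not details from the paper.

```python
from datetime import date


def forward_chaining_splits(event_dates, min_train=20, gap_days=0):
    """Yield (train_indices, test_indices) with training strictly in the past.

    Each event after the first `min_train` (in date order) is tested once,
    using only earlier events. `gap_days` optionally drops training events
    that fall too close to the test event, to limit leakage between
    neighboring events of the same monsoon pulse.
    """
    order = sorted(range(len(event_dates)), key=lambda i: event_dates[i])
    for k in range(min_train, len(order)):
        test_i = order[k]
        train = [
            i for i in order[:k]
            if (event_dates[test_i] - event_dates[i]).days >= gap_days
        ]
        # Skip folds where the embargo leaves too little training data.
        if len(train) >= min_train:
            yield train, [test_i]


# Toy usage with four hypothetical event dates.
dates = [date(2020, 1, 1), date(2019, 1, 1), date(2021, 1, 1), date(2022, 1, 1)]
splits = list(forward_chaining_splits(dates, min_train=2))
```

Unlike LOOCV, no fold here trains on events that occur after the one being predicted, which is closer to how a 72-hour forecast would be used in practice.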

Simulated Author's Rebuttal

3 responses · 0 unresolved

We are grateful to the referee for their detailed and insightful comments, which have helped us strengthen the manuscript. We address each major comment below and indicate the revisions we will make.

point-by-point responses
  1. Referee: [Results / Validation] The headline 89.6% LOOCV accuracy, 87.5% recall, and 0.943 AUC-ROC on 77 Sentinel-1 events (Abstract and Results) rest on an assumption of event exchangeability. Flood events in the haor are strongly autocorrelated in time and space due to monsoon dynamics; standard LOOCV permits leakage across folds. A blocked, spatial, or forward-chaining validation scheme is required to demonstrate true 72-hour out-of-sample skill.

    Authors: We agree that temporal and spatial autocorrelation in flood events can lead to overly optimistic estimates with standard LOOCV. In the revised manuscript, we will incorporate a forward-chaining cross-validation approach, where models are trained on earlier events and tested on subsequent ones, to better simulate real-world forecasting conditions. We will also report the corresponding metrics under this scheme. revision: yes

  2. Referee: [Methods] The deseasonalization procedure that removes the 6.9 pp temperature inflation is described only at high level (Abstract and Methods). Without the exact transformation, the set of retained features, or ablation results showing that genuine predictive signals are preserved, it is impossible to verify that the correction improves rather than harms the central performance claim.

    Authors: Thank you for highlighting this lack of detail. The deseasonalization is performed by fitting a seasonal model to temperature data using harmonic regression and subtracting the predicted seasonal component from the feature set. We will add the precise equations, specify the retained features (e.g., deseasonalized temperature, precipitation, and SAR-derived indices), and include ablation experiments demonstrating the impact on model performance in the updated Methods section. revision: yes

  3. Referee: [Ensemble Construction] The ensemble weights (RF 0.5625 + XGBoost 0.4375) are presented as operational (Abstract) yet appear to have been fitted on the same 77-event Sentinel-1 set used for LOOCV evaluation. No separate hold-out or nested CV for weight selection is mentioned, raising a direct circularity concern for the reported metrics.

    Authors: This is a valid concern regarding potential data leakage in weight selection. The weights were selected via an internal optimization process; however, to address the circularity issue, we will provide a detailed description of the weight selection method and perform additional validation using time-based splits in the revised version. revision: partial
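The harmonic-regression deseasonalization mentioned in response 2 can be sketched as follows. This is a minimal version under stated assumptions: a single annual harmonic, a 365.25-day period, and ordinary least squares; the simulated rebuttal does not give the actual equations, and the function name is hypothetical.

```python
import numpy as np


def deseasonalize(values, day_of_year, n_harmonics=1, period=365.25):
    """Fit a mean plus sine/cosine harmonics and return (anomaly, seasonal)."""
    y = np.asarray(values, dtype=float)
    t = 2.0 * np.pi * np.asarray(day_of_year, dtype=float) / period
    columns = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        columns += [np.cos(k * t), np.sin(k * t)]
    design = np.column_stack(columns)
    # Ordinary least squares fit of the seasonal cycle.
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    seasonal = design @ coef
    return y - seasonal, seasonal


# Synthetic temperature with a pure annual cycle (illustrative data only).
days = np.arange(1, 731)
temperature = 25.0 + 6.0 * np.sin(2.0 * np.pi * days / 365.25)
anomaly, seasonal = deseasonalize(temperature, days)
```

In a cross-validated setting, the seasonal fit would presumably need to be estimated on the training events only and then applied to the held-out event, otherwise the correction itself could leak information.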

Circularity Check

0 steps flagged

No significant circularity in derivation chain


The paper describes an applied ML pipeline: deseasonalization of temperature features, construction of an upstream SAR proxy, training of RF and XGBoost models, and linear combination with fixed weights 0.5625/0.4375, followed by LOOCV evaluation on the 77 Sentinel-1 events. No equation or step reduces the reported accuracy, recall, or AUC by construction to the input labels or fitted weights; LOOCV trains on all events except the one held out and scores only that held-out event, and the headline metrics are presented as empirical outcomes rather than tautological restatements of the training objective. The SAR Otsu validation is a separate spatial-match check, not a re-derivation of the ML performance. The derivation is therefore self-contained against the external benchmark of the reported cross-validated scores.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The claim rests on standard ML assumptions plus domain assumptions about SAR flood mapping accuracy and the representativeness of the 77-event set; no new physical entities are introduced.

free parameters (1)
  • Ensemble weights = 0.5625 RF + 0.4375 XGBoost
    Weights 0.5625 for random forest and 0.4375 for XGBoost are stated for the operational model and are presumed optimized on training data.
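One way to choose such a weight without touching the evaluation data is to pick it from predictions made on training folds only. The sketch below is an assumption about how that could be done, not the authors' procedure: it scans a grid of RF weights and keeps the one with the lowest Brier score. The 1/16 grid step is an arbitrary choice for illustration.

```python
import numpy as np


def select_rf_weight(p_rf, p_xgb, y_true, grid=None):
    """Pick the RF weight w (XGBoost gets 1 - w) that minimizes Brier score.

    Inputs should be out-of-fold predictions from the training portion only,
    so the held-out event never influences the chosen weight.
    """
    p_rf = np.asarray(p_rf, dtype=float)
    p_xgb = np.asarray(p_xgb, dtype=float)
    y = np.asarray(y_true, dtype=float)
    if grid is None:
        grid = np.linspace(0.0, 1.0, 17)  # steps of 1/16 (assumption)
    brier = [np.mean((w * p_rf + (1.0 - w) * p_xgb - y) ** 2) for w in grid]
    return float(grid[int(np.argmin(brier))])


# Toy check: RF is perfect and XGBoost is uninformative, so w should be 1.0.
w = select_rf_weight([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
```

Calling this inside each outer fold (nested cross-validation) would address the circularity concern raised in the referee report, at the cost of a possibly different weight per fold.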
axioms (1)
  • domain assumption: Otsu-thresholded Sentinel-1 SAR change detection provides reliable flood extent labels at 84-91% spatial match in haor terrain
    Invoked to validate the model's flood predictions.
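For readers unfamiliar with the labeling step behind this assumption, Otsu's method picks the threshold that maximizes between-class variance of a histogram. The sketch below applies it to a synthetic backscatter-change array in dB; the data, the bin count, and the rule that a drop in backscatter marks open water are illustrative assumptions, and the paper's actual preprocessing is not described here.

```python
import numpy as np


def otsu_threshold(values, bins=256):
    """Return the histogram threshold that maximizes between-class variance."""
    values = np.asarray(values, dtype=float).ravel()
    hist, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    weights = hist / hist.sum()
    w0 = np.cumsum(weights)                  # mass at or below each bin
    w1 = 1.0 - w0                            # mass above each bin
    cum_mean = np.cumsum(weights * centers)
    total_mean = cum_mean[-1]
    valid = (w0 > 1e-12) & (w1 > 1e-12)      # avoid division by zero
    between = np.zeros_like(w0)
    between[valid] = (
        (total_mean * w0[valid] - cum_mean[valid]) ** 2
        / (w0[valid] * w1[valid])
    )
    return float(centers[np.argmax(between)])


# Synthetic change image: unchanged pixels near 0 dB, flooded near -8 dB.
rng = np.random.default_rng(0)
change_db = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(-8.0, 1.0, 500)])
threshold = otsu_threshold(change_db)
flood_mask = change_db <= threshold          # backscatter drop marks water
```

The method assumes a roughly bimodal histogram; scenes with very little flooding or with wind-roughened water may violate that, which is one reason the reliability of these labels is listed as an assumption.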

pith-pipeline@v0.9.0 · 5782 in / 1350 out tokens · 56212 ms · 2026-05-20T04:53:28.328724+00:00 · methodology

