pith. machine review for the scientific record.

arxiv: 2605.02137 · v1 · submitted 2026-05-04 · 💻 cs.CV · cs.AI


FLoRA: Fusion-Latent for Optical Reconstruction and Flood Area Segmentation via Cross-Modal Multi-Task Distillation Network


Pith reviewed 2026-05-09 16:49 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords flood mapping · SAR to optical translation · multi-task distillation · cross-modal fusion · flood segmentation · feature distillation · Sentinel-1 · remote sensing

The pith

Optical teacher guidance lets SAR data produce high-fidelity flood maps and RGB reconstructions in one network.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that distilling pyramidal features from a lightweight optical teacher into SAR representations creates a shared fusion latent space for two tasks at once: translating Sentinel-1 SAR into realistic RGB images and delineating flood water regions. This setup uses multiscale cross-attention and conditioning to align the modalities while gated residuals and a distillation loss keep the SAR features faithful to optical priors. A reader would care because SAR provides all-weather coverage but lacks easy visual interpretation, so turning it into both usable images and precise hydrologic maps supports faster disaster response without waiting for clear skies.

Core claim

FLoRA jointly reconstructs high-fidelity optical imagery and segments flood water regions from Sentinel-1 SAR by guiding SAR representations into a fusion latent space with pyramidal features from an optical teacher. The guidance occurs through multiscale windowed cross-attention and FiLM conditioning, with gated residuals to avoid overcorrection. Training combines Charbonnier and SSIM losses for structural fidelity and edge and FFT-magnitude losses for spectral realism with a Dice-BCE loss for hydrology-aware segmentation, plus a feature-distillation constraint that aligns the fused SAR features to the teacher's manifold. On SEN1FLOODS11, DEEPFLOOD, and SEN12MS, the method exceeds prior fusion baselines in PSNR, SSIM, and LPIPS.

What carries the argument

The teacher-guided fusion latent space formed by multiscale windowed cross-attention and FiLM conditioning between SAR inputs and optical pyramidal features, enforced by a distillation constraint.

If this is right

  • SAR data alone can generate RGB images with measurably higher structural and perceptual fidelity than standard fusion methods.
  • Flood boundaries become more precisely aligned to hydrologic edges through the combined Dice-BCE loss (a minimal sketch follows this list).
  • The same latent representation supports both visual reconstruction and segmentation without separate models.
  • Performance gains appear consistently across three distinct flood datasets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The distillation approach could reduce reliance on paired optical-SAR training data for other remote-sensing translation tasks.
  • Real-time flood monitoring becomes feasible in persistently cloudy regions if the lightweight teacher is replaced by a fixed prior at inference.
  • The gated residual design may transfer to cross-modal fusion for drought or wildfire mapping using similar teacher guidance.

Load-bearing premise

The optical teacher's features from RGB and NDVI provide unbiased, transferable guidance to SAR representations without domain-shift artifacts.

What would settle it

A new test set of flood events where FLoRA's PSNR, SSIM, LPIPS, or segmentation accuracy falls below simple SAR-only or non-distilled fusion baselines would falsify the value of the guided latent space.

Figures

Figures reproduced from arXiv: 2605.02137 by Jagrati Talreja, Leila Hashemi-Beni, Tewodros Syum Gebre.

Figure 1
Figure 1: Overview of the proposed FLoRA (Fusion-Latent for Optical Reconstruction and Flood-Water Area Segmentation) framework. view at source ↗
Figure 2
Figure 2: Architecture of the lightweight optical teacher pyramid. view at source ↗
Figure 3
Figure 3: Visual comparison of SAR-to-optical translation and flood-water segmentation on the SEN1FLOODS11 [21] dataset. Predicted masks are thresholded to binary. view at source ↗
Figure 4
Figure 4: Visual comparison of SAR-to-optical translation and flood-water segmentation on the SEN12MS [22] dataset. Predicted masks are thresholded to binary. view at source ↗
Figure 5
Figure 5: Visual comparison of SAR-to-optical translation and flood-water segmentation on the DEEPFLOOD [20] dataset. Predicted masks are thresholded to binary. view at source ↗
Figure 6
Figure 6: Model parameters vs. optical reconstruction performance. view at source ↗
Figure 7
Figure 7: Confusion matrices for FLoRA. view at source ↗
read the original abstract

Accurate flood water mapping is critical for disaster management, yet current methods struggle to fully exploit the potential of spaceborne imagery. Optical data offers high interpretability but is limited by environmental conditions, whereas SAR provides reliable all-weather coverage with reduced visual interpretability. FLoRA (Fusion Latent for Optical Reconstruction and Area Segmentation) is a cross-modal multi-task framework that jointly reconstructs high-fidelity optical imagery and segments flood water regions from Sentinel 1 SAR by fusing the complementary strengths of optical and SAR data. During training, a lightweight optical teacher (driven by RGB and NDVI priors) provides pyramidal features that guide SAR representations into a fusion latent space via multiscale windowed cross attention and FiLM conditioning, with gated residuals preventing overcorrection. This design enables multi-task learning across two complementary objectives: (a) SAR-to-optical translation for fine-grained RGB reconstruction and (b) flood water region segmentation for hydrologic interpretation. The dual decoders are optimized using Charbonnier SSIM for structural fidelity, edge FFT magnitude losses for spectral realism, and Dice BCE hydrology-aware edge alignment for precise flood water delineation. A feature distillation constraint further aligns fused SAR features with the optical teacher's manifold. Evaluations on SEN1FLOODS11, DEEPFLOOD, and SEN12MS demonstrate that FLoRA surpasses fusion baselines in PSNR, SSIM, and LPIPS, demonstrating that multi-modal fusion within a teacher-guided latent space yields semantically faithful and physically consistent flood-water intelligence from spaceborne observations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces FLoRA, a cross-modal multi-task distillation network that reconstructs high-fidelity optical imagery from Sentinel-1 SAR while simultaneously segmenting flood water regions. It uses an optical teacher (RGB/NDVI-driven) to guide SAR features into a fusion latent space via multiscale windowed cross-attention and FiLM conditioning, with gated residuals and a feature distillation constraint. Dual decoders are trained with Charbonnier SSIM, edge FFT magnitude, and Dice-BCE hydrology-aware losses. The abstract reports superior PSNR, SSIM, and LPIPS on SEN1FLOODS11, DEEPFLOOD, and SEN12MS relative to fusion baselines.

Significance. If the performance gains prove robust under ablation, the work would offer a practical advance in all-weather flood mapping by showing that teacher-guided multi-modal fusion can produce both visually faithful optical reconstructions and hydrologically consistent segmentations. The combination of structural, spectral, and edge-aware losses tailored to flood semantics is a constructive design choice for remote-sensing applications.

major comments (2)
  1. Abstract: The central performance claim (FLoRA surpasses fusion baselines in PSNR, SSIM, and LPIPS) is presented without an ablation that removes the optical teacher and without baseline implementation details, error bars, or statistical tests. This prevents attribution of gains specifically to the fusion latent space and teacher guidance rather than to standard multi-task training.
  2. Abstract (method description): The claim that pyramidal features from the RGB/NDVI optical teacher provide unbiased, transferable guidance to SAR representations via multiscale cross-attention and FiLM lacks supporting evidence such as quantitative domain-shift metrics (e.g., MMD or feature-distribution divergence) or test-time results on purely SAR inputs. Without these, the assertion that the latent space yields 'semantically faithful and physically consistent' outputs remains unverified.
minor comments (1)
  1. Abstract: The parenthetical expansion of the FLoRA acronym ('Fusion Latent for Optical Reconstruction and Area Segmentation') is inconsistent with the title, which includes 'Flood Area Segmentation'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed review. The comments highlight opportunities to strengthen the attribution of results and the supporting evidence for our cross-modal claims. We address each point below and will incorporate the requested analyses and clarifications in the revised manuscript.

read point-by-point responses
  1. Referee: Abstract: The central performance claim (FLoRA surpasses fusion baselines in PSNR, SSIM, and LPIPS) is presented without an ablation that removes the optical teacher and without baseline implementation details, error bars, or statistical tests. This prevents attribution of gains specifically to the fusion latent space and teacher guidance rather than to standard multi-task training.

    Authors: We agree that the abstract and experimental section would benefit from explicit isolation of the teacher's contribution. In the revision we will add an ablation study that trains and evaluates a variant of FLoRA without the optical teacher (removing both the feature distillation loss and the pyramidal feature injection via cross-attention and FiLM). We will also expand the supplementary material with complete baseline implementation details, including network architectures, loss weights, and training schedules. All reported metrics will include standard deviations computed over three independent runs and will be accompanied by statistical significance tests (paired t-tests with p-values; a sketch of such a test follows these responses). These additions will allow readers to attribute performance differences more precisely to the proposed teacher-guided fusion latent space. revision: yes

  2. Referee: Abstract (method description): The claim that pyramidal features from the RGB/NDVI optical teacher provide unbiased, transferable guidance to SAR representations via multiscale cross-attention and FiLM lacks supporting evidence such as quantitative domain-shift metrics (e.g., MMD or feature-distribution divergence) or test-time results on purely SAR inputs. Without these, the assertion that the latent space yields 'semantically faithful and physically consistent' outputs remains unverified.

    Authors: We acknowledge that direct quantitative evidence of domain alignment would make the guidance claim more robust. We will therefore add a feature-alignment analysis that computes Maximum Mean Discrepancy (MMD) between (i) raw SAR encoder features and the fused latent representation and (ii) the fused representation and the corresponding optical-teacher features. In addition, because inference uses only SAR inputs by design, we will explicitly tabulate reconstruction and segmentation metrics on the held-out test portions of SEN1FLOODS11, DEEPFLOOD, and SEN12MS when the model receives exclusively SAR data (no optical inputs at test time). These results, together with the MMD values, will be presented in a new subsection of the experiments to substantiate the transferability and semantic consistency of the learned latent space. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical DL training outcomes, not derived quantities

full rationale

The paper describes a multi-task neural network (FLoRA) trained end-to-end on SEN1FLOODS11 and related datasets to perform SAR-to-optical reconstruction and flood segmentation. Performance is reported as measured PSNR/SSIM/LPIPS after optimization with standard losses (Charbonnier, Dice-BCE, feature distillation). No equations, first-principles derivations, or 'predictions' are presented that reduce the reported metrics to fitted parameters by construction, nor are any uniqueness theorems or ansatzes imported via self-citation to force the architecture. The central claim is an empirical demonstration of improved metrics under the proposed training regime; the results remain falsifiable by retraining or ablation and do not collapse to tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

Only abstract available; ledger is therefore limited to high-level assumptions visible in the summary. The central claim rests on standard supervised deep-learning training plus domain priors for optical data.

axioms (2)
  • domain assumption Optical imagery with RGB and NDVI provides reliable high-interpretability priors that can guide SAR feature learning
    Invoked in the description of the lightweight optical teacher and pyramidal feature guidance.
  • ad hoc to paper Multi-scale windowed cross attention and FiLM conditioning can align SAR and optical manifolds without introducing systematic artifacts
    Core mechanism of the fusion latent space; no independent justification supplied in abstract.
invented entities (1)
  • fusion latent space no independent evidence
    purpose: Shared representation that fuses SAR and optical features for both reconstruction and segmentation
    Central architectural construct introduced to enable the multi-task distillation; no external falsifiable prediction attached.

pith-pipeline@v0.9.0 · 5590 in / 1348 out tokens · 26186 ms · 2026-05-09T16:49:30.024907+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

43 extracted references · 1 canonical work page

  1. [1]

    Increasing stress on disaster-risk finance due to large floods,

    B. Jongman et al., “Increasing stress on disaster-risk finance due to large floods,” Nature Climate Change, vol. 4, no. 4, pp. 264–268, 2014

  2. [2]

    The need for a high-accuracy, open-access global DEM,

    G. J. P. Schumann and P. D. Bates, “The need for a high-accuracy, open-access global DEM,” Frontiers in Earth Science, vol. 6, p. 225, 2018

  3. [3]

    Sentinel-2: ESA’s optical high-resolution mission for GMES operational services,

    M. Drusch et al., “Sentinel-2: ESA’s optical high-resolution mission for GMES operational services,” Remote Sensing of Environment, vol. 120, pp. 25–36, 2012

  4. [4]

    Sentinel-1 InSAR coherence to detect floodwater in urban areas: Houston and Hurricane Harvey as a test case,

    M. Chini et al., “Sentinel-1 InSAR coherence to detect floodwater in urban areas: Houston and Hurricane Harvey as a test case,” Remote Sensing, vol. 11, no. 2, p. 107, 2019

  5. [5]

    GMES Sentinel-1 mission,

    R. Torres et al., “GMES Sentinel-1 mission,” Remote Sensing of Environment, vol. 120, pp. 9–24, 2012

  6. [6]

    Flood monitoring using multi-temporal COSMO-SkyMed data: Image segmentation and signature interpretation,

    L. Pulvirenti et al., “Flood monitoring using multi-temporal COSMO-SkyMed data: Image segmentation and signature interpretation,” Remote Sensing of Environment, vol. 115, no. 4, pp. 990–1002, 2011

  7. [7]

    Google Earth Engine: Planetary-scale geospatial analysis for everyone,

    N. Gorelick et al., “Google Earth Engine: Planetary-scale geospatial analysis for everyone,” Remote Sensing of Environment, vol. 202, pp. 18–27, 2017

  8. [8]

    Limitations and potential of satellite imagery to monitor environmental response to coastal flooding,

    E. Ramsey III et al., “Limitations and potential of satellite imagery to monitor environmental response to coastal flooding,” J. Coastal Res., vol. 28, no. 2, pp. 457–476, 2012

  9. [9]

    SAR image analysis techniques for flood area mapping—Literature survey,

    R. Manavalan, “SAR image analysis techniques for flood area mapping—Literature survey,” Earth Sci. Informatics, vol. 10, no. 1, pp. 1–14, 2017

  10. [10]

    Fusing of optical and synthetic aperture radar (SAR) remote sensing data: A systematic literature review (SLR),

    S. Mahyoub et al., “Fusing of optical and synthetic aperture radar (SAR) remote sensing data: A systematic literature review (SLR),” Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., vol. 42, pp. 127–138, 2019

  11. [11]

    Multi-Head Encoder-Decoder Deep Learning Architecture for Flood Segmentation and Mapping Through Multi-Sensor Data Fusion,

    M. Fawakherji and L. Hashemi-Beni, “Multi-Head Encoder-Decoder Deep Learning Architecture for Flood Segmentation and Mapping Through Multi-Sensor Data Fusion,” in IGARSS 2024 – IEEE Int. Geosci. Remote Sens. Symp., 2024, pp. 1191–1195

  12. [12]

    Flood detection and mapping through multi-resolution sensor fusion: integrating UAV optical imagery and satellite SAR data,

    M. Fawakherji and L. Hashemi-Beni, “Flood detection and mapping through multi-resolution sensor fusion: integrating UAV optical imagery and satellite SAR data,” Geomatics, Natural Hazards and Risk, vol. 16, no. 1, p. 2493225, 2025

  13. [14]

    Segmentation and visualization of flooded areas through Sentinel-1 images and U-Net,

    F. Pech-May et al., “Segmentation and visualization of flooded areas through Sentinel-1 images and U-Net,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 17, pp. 8996–9008, 2024

  14. [15]

    Unpaired image-to-image translation using cycle-consistent adversarial networks,

    J.-Y. Zhu et al., “Unpaired image-to-image translation using cycle-consistent adversarial networks,” in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 2223–2232

  15. [16]

    Image-to-image translation with conditional adversarial networks,

    P. Isola et al., “Image-to-image translation with conditional adversarial networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2017, pp. 1125–1134

  16. [17]

    U-Net: Convolutional networks for biomedical image segmentation,

    O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in MICCAI, 2015, pp. 234–241

  17. [18]

    Swin Transformer: Hierarchical vision transformer using shifted windows,

    Z. Liu et al., “Swin Transformer: Hierarchical vision transformer using shifted windows,” in Proc. IEEE/CVF Int. Conf. Comput. Vis., 2021, pp. 10012–10022

  18. [19]

    A crossmodal multiscale fusion network for semantic segmentation of remote sensing data,

    X. Ma, X. Zhang, and M.-O. Pun, “A crossmodal multiscale fusion network for semantic segmentation of remote sensing data,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 15, pp. 3463–3474, 2022

  19. [20]

    DeepFlood for inundated vegetation high-resolution dataset for accurate flood mapping and segmentation,

    M. Fawakherji et al., “DeepFlood for inundated vegetation high-resolution dataset for accurate flood mapping and segmentation,” Scientific Data, vol. 12, no. 1, p. 271, 2025

  20. [21]

    Sen1Floods11: A georeferenced dataset to train and test deep learning flood algorithms for Sentinel-1,

    D. Bonafilia et al., “Sen1Floods11: A georeferenced dataset to train and test deep learning flood algorithms for Sentinel-1,” in Proc. IEEE/CVF CVPR Workshops, 2020, pp. 210–211

  21. [22]

    SEN12MS – A curated dataset of georeferenced multi-spectral Sentinel-1/2 imagery for deep learning and data fusion,

    M. Schmitt, L. H. Hughes, C. Qiu, and X. X. Zhu, “SEN12MS – A curated dataset of georeferenced multi-spectral Sentinel-1/2 imagery for deep learning and data fusion,” arXiv:1906.07789, 2019

  22. [23]

    Global Assessment Report on Disaster Risk Reduction,

    J. McGlade et al., Global Assessment Report on Disaster Risk Reduction

  23. [24]

    Flood extent mapping: An integrated method using deep learning and region growing using UAV optical data,

    L. Hashemi-Beni and A. A. Gebrehiwot, “Flood extent mapping: An integrated method using deep learning and region growing using UAV optical data,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 2127–2135, 2021

  24. [25]

    The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features,

    S. K. McFeeters, “The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features,” Int. J. Remote Sens., vol. 17, no. 7, pp. 1425–1432, 1996

  25. [26]

    Analysis of dynamic thresholds for the normalized difference water index,

    L. Ji, L. Zhang, and B. Wylie, “Analysis of dynamic thresholds for the normalized difference water index,” Photogramm. Eng. Remote Sens., vol. 75, no. 11, pp. 1307–1317, 2009

  26. [27]

    Deep learning in remote sensing: A comprehensive review and list of resources,

    X. X. Zhu et al., “Deep learning in remote sensing: A comprehensive review and list of resources,” IEEE Geosci. Remote Sens. Mag., vol. 5, no. 4, pp. 8–36, 2017

  27. [28]

    SegNet: A deep convolutional encoder-decoder architecture for image segmentation,

    V. Badrinarayanan, A. Kendall, and R. Cipolla, “SegNet: A deep convolutional encoder-decoder architecture for image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 12, pp. 2481–2495, 2017

  28. [29]

    Road extraction by deep residual U-Net,

    Z. Zhang, Q. Liu, and Y. Wang, “Road extraction by deep residual U-Net,” IEEE Geosci. Remote Sens. Lett., vol. 15, no. 5, pp. 749–753, 2018

  29. [30]

    Instance segmentation of LiDAR data with vision transformer model in support inundation mapping under forest canopy environment,

    J. Yang et al., “Instance segmentation of LiDAR data with vision transformer model in support inundation mapping under forest canopy environment,” 2023

  30. [31]

    Inundated vegetation mapping using SAR data: A comparison of polarization configurations of UAVSAR L-band and Sentinel C-band,

    A. Salem and L. Hashemi-Beni, “Inundated vegetation mapping using SAR data: A comparison of polarization configurations of UAVSAR L-band and Sentinel C-band,” Remote Sensing, vol. 14, no. 24, p. 6374, 2022

  31. [32]

    Pixel level fusion techniques for SAR and optical images: A review,

    S. C. Kulkarni and P. P. Rege, “Pixel level fusion techniques for SAR and optical images: A review,” Information Fusion, vol. 59, pp. 13–29, 2020

  32. [33]

    Research on remote sensing image classification based on feature level fusion,

    L. Yuan and G. Zhu, “Research on remote sensing image classification based on feature level fusion,” Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., vol. 42, pp. 2185–2189, 2018

  33. [34]

    Advances in spectral-spatial classification of hyperspectral images,

    M. Fauvel et al., “Advances in spectral-spatial classification of hyperspectral images,” Proc. IEEE, vol. 101, no. 3, pp. 652–675, 2012

  34. [35]

    Cloud removal in Sentinel-2 imagery using a deep residual neural network and SAR-optical data fusion,

    A. Meraner et al., “Cloud removal in Sentinel-2 imagery using a deep residual neural network and SAR-optical data fusion,” ISPRS J. Photogramm. Remote Sens., vol. 166, pp. 333–346, 2020

  35. [36]

    Coupling model- and data-driven methods for remote sensing image restoration and fusion: Improving physical interpretability,

    H. Shen et al., “Coupling model- and data-driven methods for remote sensing image restoration and fusion: Improving physical interpretability,” IEEE Geosci. Remote Sens. Mag., vol. 10, no. 2, pp. 231–249, 2022

  36. [37]

    A novel multiscale adaptive binning phase congruency feature for SAR and optical image registration,

    J. Fan et al., “A novel multiscale adaptive binning phase congruency feature for SAR and optical image registration,” IEEE Trans. Geosci. Remote Sens., vol. 60, pp. 1–16, 2022

  37. [38]

    Efficient knowledge distillation for remote sensing image classification: a CNN-based approach,

    H. Song, C. Wei, and Z. Yong, “Efficient knowledge distillation for remote sensing image classification: a CNN-based approach,” Int. J. Web Inf. Syst., vol. 20, no. 2, pp. 129–158, 2024

  38. [39]

    A threshold selection method from gray-level histograms,

    N. Otsu, “A threshold selection method from gray-level histograms,” Automatica, vol. 11, no. 285–296, pp. 23–27, 1975

  39. [40]

    Based on Trans-Unet remote sensing image segmentation network model,

    J. Zheng et al., “Based on Trans-Unet remote sensing image segmentation network model,” in 2025 5th Int. Conf. Artif. Intell., Big Data Algorithms (CAIBDA), 2025, pp. 1425–1428

  40. [41]

    MT GAN: A SAR-to-optical image translation method for cloud removal,

    P. Wang et al., “MT GAN: A SAR-to-optical image translation method for cloud removal,” ISPRS J. Photogramm. Remote Sens., vol. 225, pp. 180–195, 2025

  41. [42]

    Reusing discriminators for encoding: Towards unsupervised image-to-image translation,

    R. Chen et al., “Reusing discriminators for encoding: Towards unsupervised image-to-image translation,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2020, pp. 8168–8177

  42. [43]

    TS2O: A token-based two-stage generative architecture for SAR-to-optical translation,

    J. Zhang et al., “TS2O: A token-based two-stage generative architecture for SAR-to-optical translation,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 18, pp. 26949–26960, 2025

  43. [44]

    MsResNet: Multi-scale edge-enhanced ResNet for RGB-T image segmentation,

    B. Adhikari et al., “MsResNet: Multi-scale edge-enhanced ResNet for RGB-T image segmentation,” in Int. Conf. Image Process. Vision Eng., 2025, pp. 59–75