pith. machine review for the scientific record.

arxiv: 2605.02137 · v1 · submitted 2026-05-04 · 💻 cs.CV · cs.AI


FLoRA: Fusion-Latent for Optical Reconstruction and Flood Area Segmentation via Cross-Modal Multi-Task Distillation Network


Pith reviewed 2026-05-09 16:49 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords flood mapping · SAR to optical translation · multi-task distillation · cross-modal fusion · flood segmentation · feature distillation · Sentinel-1 · remote sensing

The pith

Optical teacher guidance lets SAR data produce high-fidelity flood maps and RGB reconstructions in one network.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that distilling pyramidal features from a lightweight optical teacher into SAR representations creates a shared fusion latent space for two tasks at once: translating Sentinel-1 SAR into realistic RGB images and delineating flood water regions. This setup uses multiscale cross-attention and conditioning to align the modalities while gated residuals and a distillation loss keep the SAR features faithful to optical priors. A reader would care because SAR provides all-weather coverage but lacks easy visual interpretation, so turning it into both usable images and precise hydrologic maps supports faster disaster response without waiting for clear skies.

Core claim

FLoRA jointly reconstructs high-fidelity optical imagery and segments flood water regions from Sentinel-1 SAR by guiding SAR representations into a fusion latent space with pyramidal features from an optical teacher. The guidance occurs through multiscale windowed cross-attention and FiLM conditioning, with gated residuals to avoid overcorrection. Training combines Charbonnier and SSIM losses for structural fidelity and edge and FFT-magnitude losses for spectral realism with a Dice-BCE loss for hydrology-aware segmentation, plus a feature-distillation constraint that aligns the fused SAR features to the teacher's manifold. On SEN1FLOODS11, DEEPFLOOD, and SEN12MS, the method exceeds prior fusion baselines in PSNR, SSIM, and LPIPS.

What carries the argument

The teacher-guided fusion latent space formed by multiscale windowed cross-attention and FiLM conditioning between SAR inputs and optical pyramidal features, enforced by a distillation constraint.

If this is right

  • SAR data alone can generate RGB images with measurably higher structural and perceptual fidelity than standard fusion methods.
  • Flood boundaries become more precisely aligned to hydrologic edges through the combined Dice-BCE loss (a minimal sketch follows this list).
  • The same latent representation supports both visual reconstruction and segmentation without separate models.
  • Performance gains appear consistently across three distinct flood datasets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The distillation approach could reduce reliance on paired optical-SAR training data for other remote-sensing translation tasks.
  • Real-time flood monitoring becomes feasible in persistently cloudy regions if the lightweight teacher is replaced by a fixed prior at inference.
  • The gated residual design may transfer to cross-modal fusion for drought or wildfire mapping using similar teacher guidance.

Load-bearing premise

The optical teacher's features from RGB and NDVI provide unbiased, transferable guidance to SAR representations without domain-shift artifacts.

What would settle it

A new test set of flood events where FLoRA's PSNR, SSIM, LPIPS, or segmentation accuracy falls below simple SAR-only or non-distilled fusion baselines would falsify the value of the guided latent space.

Figures

Figures reproduced from arXiv: 2605.02137 by Jagrati Talreja, Leila Hashemi-Beni, Tewodros Syum Gebre.

Figure 1
Figure 1: Overview of the proposed FLoRA (Fusion-Latent for Optical Reconstruction and Flood-Water Area Segmentation) framework. view at source ↗
Figure 2
Figure 2: Architecture of the lightweight optical teacher pyramid. view at source ↗
Figure 3
Figure 3: Visual comparison of SAR-to-optical translation and flood-water segmentation on the SEN1FLOODS11 [21] dataset. Predicted masks are thresholded to binary. view at source ↗
Figure 4
Figure 4: Visual comparison of SAR-to-optical translation and flood-water segmentation on the SEN12MS [22] dataset. Predicted masks are thresholded to binary. view at source ↗
Figure 5
Figure 5: Visual comparison of SAR-to-optical translation and flood-water segmentation on the DEEPFLOOD [20] dataset. Predicted masks are thresholded to binary. view at source ↗
Figure 6
Figure 6: Model parameters vs. optical reconstruction performance. view at source ↗
Figure 7
Figure 7: Confusion matrices for FLoRA. view at source ↗
read the original abstract

Accurate flood water mapping is critical for disaster management, yet current methods struggle to fully exploit the potential of spaceborne imagery. Optical data offers high interpretability but is limited by environmental conditions, whereas SAR provides reliable all-weather coverage with reduced visual interpretability. FLoRA (Fusion Latent for Optical Reconstruction and Area Segmentation) is a cross-modal multi-task framework that jointly reconstructs high-fidelity optical imagery and segments flood water regions from Sentinel 1 SAR by fusing the complementary strengths of optical and SAR data. During training, a lightweight optical teacher (driven by RGB and NDVI priors) provides pyramidal features that guide SAR representations into a fusion latent space via multiscale windowed cross attention and FiLM conditioning, with gated residuals preventing overcorrection. This design enables multi-task learning across two complementary objectives: (a) SAR-to-optical translation for fine-grained RGB reconstruction and (b) flood water region segmentation for hydrologic interpretation. The dual decoders are optimized using Charbonnier SSIM for structural fidelity, edge FFT magnitude losses for spectral realism, and Dice BCE hydrology-aware edge alignment for precise flood water delineation. A feature distillation constraint further aligns fused SAR features with the optical teacher's manifold. Evaluations on SEN1FLOODS11, DEEPFLOOD, and SEN12MS demonstrate that FLoRA surpasses fusion baselines in PSNR, SSIM, and LPIPS, demonstrating that multi-modal fusion within a teacher-guided latent space yields semantically faithful and physically consistent flood-water intelligence from spaceborne observations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces FLoRA, a cross-modal multi-task distillation network that reconstructs high-fidelity optical imagery from Sentinel-1 SAR while simultaneously segmenting flood water regions. It uses an optical teacher (RGB/NDVI-driven) to guide SAR features into a fusion latent space via multiscale windowed cross-attention and FiLM conditioning, with gated residuals and a feature distillation constraint. Dual decoders are trained with Charbonnier SSIM, edge FFT magnitude, and Dice-BCE hydrology-aware losses. The abstract reports superior PSNR, SSIM, and LPIPS on SEN1FLOODS11, DEEPFLOOD, and SEN12MS relative to fusion baselines.

Significance. If the performance gains prove robust under ablation, the work would offer a practical advance in all-weather flood mapping by showing that teacher-guided multi-modal fusion can produce both visually faithful optical reconstructions and hydrologically consistent segmentations. The combination of structural, spectral, and edge-aware losses tailored to flood semantics is a constructive design choice for remote-sensing applications.

major comments (2)
  1. Abstract: The central performance claim (FLoRA surpasses fusion baselines in PSNR, SSIM, and LPIPS) is presented without an ablation that removes the optical teacher and without baseline implementation details, error bars, or statistical tests. This prevents attribution of gains specifically to the fusion latent space and teacher guidance rather than to standard multi-task training.
  2. Abstract (method description): The claim that pyramidal features from the RGB/NDVI optical teacher provide unbiased, transferable guidance to SAR representations via multiscale cross-attention and FiLM lacks supporting evidence such as quantitative domain-shift metrics (e.g., MMD or feature-distribution divergence) or test-time results on purely SAR inputs. Without these, the assertion that the latent space yields 'semantically faithful and physically consistent' outputs remains unverified.
minor comments (1)
  1. Abstract: The parenthetical expansion of the FLoRA acronym ('Fusion Latent for Optical Reconstruction and Area Segmentation') is inconsistent with the title, which includes 'Flood Area Segmentation'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed review. The comments highlight opportunities to strengthen the attribution of results and the supporting evidence for our cross-modal claims. We address each point below and will incorporate the requested analyses and clarifications in the revised manuscript.

read point-by-point responses
  1. Referee: Abstract: The central performance claim (FLoRA surpasses fusion baselines in PSNR, SSIM, and LPIPS) is presented without an ablation that removes the optical teacher and without baseline implementation details, error bars, or statistical tests. This prevents attribution of gains specifically to the fusion latent space and teacher guidance rather than to standard multi-task training.

    Authors: We agree that the abstract and experimental section would benefit from explicit isolation of the teacher's contribution. In the revision we will add an ablation study that trains and evaluates a variant of FLoRA without the optical teacher (removing both the feature distillation loss and the pyramidal feature injection via cross-attention and FiLM). We will also expand the supplementary material with complete baseline implementation details, including network architectures, loss weights, and training schedules. All reported metrics will include standard deviations computed over three independent runs and will be accompanied by statistical significance tests (paired t-tests with p-values; a sketch of such a test follows these responses). These additions will allow readers to attribute performance differences more precisely to the proposed teacher-guided fusion latent space. revision: yes

  2. Referee: Abstract (method description): The claim that pyramidal features from the RGB/NDVI optical teacher provide unbiased, transferable guidance to SAR representations via multiscale cross-attention and FiLM lacks supporting evidence such as quantitative domain-shift metrics (e.g., MMD or feature-distribution divergence) or test-time results on purely SAR inputs. Without these, the assertion that the latent space yields 'semantically faithful and physically consistent' outputs remains unverified.

    Authors: We acknowledge that direct quantitative evidence of domain alignment would make the guidance claim more robust. We will therefore add a feature-alignment analysis that computes Maximum Mean Discrepancy (MMD) between (i) raw SAR encoder features and the fused latent representation and (ii) the fused representation and the corresponding optical-teacher features. In addition, because inference uses only SAR inputs by design, we will explicitly tabulate reconstruction and segmentation metrics on the held-out test portions of SEN1FLOODS11, DEEPFLOOD, and SEN12MS when the model receives exclusively SAR data (no optical inputs at test time). These results, together with the MMD values, will be presented in a new subsection of the experiments to substantiate the transferability and semantic consistency of the learned latent space. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical DL training outcomes, not derived quantities

full rationale

The paper describes a multi-task neural network (FLoRA) trained end-to-end on SEN1FLOODS11 and related datasets to perform SAR-to-optical reconstruction and flood segmentation. Performance is reported as measured PSNR/SSIM/LPIPS after optimization with standard losses (Charbonnier, Dice-BCE, feature distillation). No equations, first-principles derivations, or 'predictions' are presented that reduce the reported metrics to fitted parameters by construction, nor are any uniqueness theorems or ansatzes imported via self-citation to force the architecture. The central claim is an empirical demonstration of improved metrics under the proposed training regime; the results remain falsifiable by retraining or ablation and do not collapse to tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

Only abstract available; ledger is therefore limited to high-level assumptions visible in the summary. The central claim rests on standard supervised deep-learning training plus domain priors for optical data.

axioms (2)
  • domain assumption Optical imagery with RGB and NDVI provides reliable high-interpretability priors that can guide SAR feature learning
    Invoked in the description of the lightweight optical teacher and pyramidal feature guidance.
  • ad hoc to paper Multi-scale windowed cross attention and FiLM conditioning can align SAR and optical manifolds without introducing systematic artifacts
    Core mechanism of the fusion latent space; no independent justification supplied in abstract.
invented entities (1)
  • fusion latent space no independent evidence
    purpose: Shared representation that fuses SAR and optical features for both reconstruction and segmentation
    Central architectural construct introduced to enable the multi-task distillation; no external falsifiable prediction attached.

pith-pipeline@v0.9.0 · 5590 in / 1348 out tokens · 26186 ms · 2026-05-09T16:49:30.024907+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

43 extracted references · 1 canonical work page

  1. [1]

    Increasing stress on disaster-risk finance due to large floods,

    B. Jongman et al., “Increasing stress on disaster-risk finance due to large floods,” Nature Climate Change, vol. 4, no. 4, pp. 264–268, 2014

  2. [2]

    The need for a high-accuracy, open-access global DEM,

    G. J. P. Schumann and P. D. Bates, “The need for a high-accuracy, open-access global DEM,” Frontiers in Earth Science, vol. 6, p. 225, 2018

  3. [3]

    Sentinel-2: ESA’s optical high-resolution mission for GMES operational services,

    M. Drusch et al., “Sentinel-2: ESA’s optical high-resolution mission for GMES operational services,” Remote Sensing of Environment, vol. 120, pp. 25–36, 2012

  4. [4]

    Sentinel-1 InSAR coherence to detect floodwater in urban areas: Houston and Hurricane Harvey as a test case,

    M. Chini et al., “Sentinel-1 InSAR coherence to detect floodwater in urban areas: Houston and Hurricane Harvey as a test case,” Remote Sensing, vol. 11, no. 2, p. 107, 2019

  5. [5]

    GMES Sentinel-1 mission,

    R. Torres et al., “GMES Sentinel-1 mission,” Remote Sensing of Environment, vol. 120, pp. 9–24, 2012

  6. [6]

    Flood monitoring using multi-temporal COSMO-SkyMed data: Image segmentation and signature interpretation,

    L. Pulvirenti et al., “Flood monitoring using multi-temporal COSMO-SkyMed data: Image segmentation and signature interpretation,” Remote Sensing of Environment, vol. 115, no. 4, pp. 990–1002, 2011

  7. [7]

    Google Earth Engine: Planetary-scale geospatial analysis for everyone,

    N. Gorelick et al., “Google Earth Engine: Planetary-scale geospatial analysis for everyone,” Remote Sensing of Environment, vol. 202, pp. 18–27, 2017

  8. [8]

    Limitations and potential of satellite imagery to monitor environmental response to coastal flooding,

    E. Ramsey III et al., “Limitations and potential of satellite imagery to monitor environmental response to coastal flooding,” J. Coastal Res., vol. 28, no. 2, pp. 457–476, 2012

  9. [9]

    SAR image analysis techniques for flood area mapping—Literature survey,

    R. Manavalan, “SAR image analysis techniques for flood area mapping—Literature survey,” Earth Sci. Informatics, vol. 10, no. 1, pp. 1–14, 2017

  10. [10]

    Fusing of optical and synthetic aperture radar (SAR) remote sensing data: A systematic literature review (SLR),

    S. Mahyoub et al., “Fusing of optical and synthetic aperture radar (SAR) remote sensing data: A systematic literature review (SLR),” Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., vol. 42, pp. 127–138, 2019

  11. [11]

    Multi-Head Encoder-Decoder Deep Learning Architecture for Flood Segmentation and Mapping Through Multi-Sensor Data Fusion,

    M. Fawakherji and L. Hashemi-Beni, “Multi-Head Encoder-Decoder Deep Learning Architecture for Flood Segmentation and Mapping Through Multi-Sensor Data Fusion,” in IGARSS 2024 – IEEE Int. Geosci. Remote Sens. Symp., 2024, pp. 1191–1195

  12. [12]

    Flood detection and mapping through multi-resolution sensor fusion: integrating UAV optical imagery and satellite SAR data,

    M. Fawakherji and L. Hashemi-Beni, “Flood detection and mapping through multi-resolution sensor fusion: integrating UAV optical imagery and satellite SAR data,” Geomatics, Natural Hazards and Risk, vol. 16, no. 1, p. 2493225, 2025

  13. [14]

    Segmentation and visualization of flooded areas through Sentinel-1 images and U-Net,

    F. Pech-May et al., “Segmentation and visualization of flooded areas through Sentinel-1 images and U-Net,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 17, pp. 8996–9008, 2024

  14. [15]

    Unpaired image-to-image translation using cycle-consistent adversarial networks,

    J.-Y. Zhu et al., “Unpaired image-to-image translation using cycle-consistent adversarial networks,” in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 2223–2232

  15. [16]

    Image-to-image translation with conditional adversarial networks,

    P. Isola et al., “Image-to-image translation with conditional adversarial networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2017, pp. 1125–1134

  16. [17]

    U-Net: Convolutional networks for biomedical image segmentation,

    O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in MICCAI, 2015, pp. 234–241

  17. [18]

    Swin Transformer: Hierarchical vision transformer using shifted windows,

    Z. Liu et al., “Swin Transformer: Hierarchical vision transformer using shifted windows,” in Proc. IEEE/CVF Int. Conf. Comput. Vis., 2021, pp. 10012–10022

  18. [19]

    A crossmodal multiscale fusion network for semantic segmentation of remote sensing data,

    X. Ma, X. Zhang, and M.-O. Pun, “A crossmodal multiscale fusion network for semantic segmentation of remote sensing data,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 15, pp. 3463–3474, 2022

  19. [20]

    DeepFlood for inundated vegetation high-resolution dataset for accurate flood mapping and segmentation,

    M. Fawakherji et al., “DeepFlood for inundated vegetation high-resolution dataset for accurate flood mapping and segmentation,” Scientific Data, vol. 12, no. 1, p. 271, 2025

  20. [21]

    Sen1Floods11: A georeferenced dataset to train and test deep learning flood algorithms for Sentinel-1,

    D. Bonafilia et al., “Sen1Floods11: A georeferenced dataset to train and test deep learning flood algorithms for Sentinel-1,” in Proc. IEEE/CVF CVPR Workshops, 2020, pp. 210–211

  21. [22]

    SEN12MS – A curated dataset of georeferenced multi-spectral Sentinel-1/2 imagery for deep learning and data fusion,

    M. Schmitt, L. H. Hughes, C. Qiu, and X. X. Zhu, “SEN12MS – A curated dataset of georeferenced multi-spectral Sentinel-1/2 imagery for deep learning and data fusion,” arXiv:1906.07789, 2019

  22. [23]

    Global Assessment Report on Disaster Risk Reduction,

    J. McGlade et al., Global Assessment Report on Disaster Risk Reduction

  23. [24]

    Flood extent mapping: An integrated method using deep learning and region growing using UAV optical data,

    L. Hashemi-Beni and A. A. Gebrehiwot, “Flood extent mapping: An integrated method using deep learning and region growing using UAV optical data,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 2127–2135, 2021

  24. [25]

    The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features,

    S. K. McFeeters, “The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features,” Int. J. Remote Sens., vol. 17, no. 7, pp. 1425–1432, 1996

  25. [26]

    Analysis of dynamic thresholds for the normalized difference water index,

    L. Ji, L. Zhang, and B. Wylie, “Analysis of dynamic thresholds for the normalized difference water index,” Photogramm. Eng. Remote Sens., vol. 75, no. 11, pp. 1307–1317, 2009

  26. [27]

    Deep learning in remote sensing: A comprehensive review and list of resources,

    X. X. Zhu et al., “Deep learning in remote sensing: A comprehensive review and list of resources,” IEEE Geosci. Remote Sens. Mag., vol. 5, no. 4, pp. 8–36, 2017

  27. [28]

    SegNet: A deep convolutional encoder-decoder architecture for image segmentation,

    V. Badrinarayanan, A. Kendall, and R. Cipolla, “SegNet: A deep convolutional encoder-decoder architecture for image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 12, pp. 2481–2495, 2017

  28. [29]

    Road extraction by deep residual U-Net,

    Z. Zhang, Q. Liu, and Y. Wang, “Road extraction by deep residual U-Net,” IEEE Geosci. Remote Sens. Lett., vol. 15, no. 5, pp. 749–753, 2018

  29. [30]

    Instance segmentation of LiDAR data with vision transformer model in support inundation mapping under forest canopy environment,

    J. Yang et al., “Instance segmentation of LiDAR data with vision transformer model in support inundation mapping under forest canopy environment,” 2023

  30. [31]

    Inundated vegetation mapping using SAR data: A comparison of polarization configurations of UAVSAR L-band and Sentinel C-band,

    A. Salem and L. Hashemi-Beni, “Inundated vegetation mapping using SAR data: A comparison of polarization configurations of UAVSAR L-band and Sentinel C-band,” Remote Sensing, vol. 14, no. 24, p. 6374, 2022

  31. [32]

    Pixel level fusion techniques for SAR and optical images: A review,

    S. C. Kulkarni and P. P. Rege, “Pixel level fusion techniques for SAR and optical images: A review,” Information Fusion, vol. 59, pp. 13–29, 2020

  32. [33]

    Research on remote sensing image classification based on feature level fusion,

    L. Yuan and G. Zhu, “Research on remote sensing image classification based on feature level fusion,” Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., vol. 42, pp. 2185–2189, 2018

  33. [34]

    Advances in spectral-spatial classification of hyperspectral images,

    M. Fauvel et al., “Advances in spectral-spatial classification of hyperspectral images,” Proc. IEEE, vol. 101, no. 3, pp. 652–675, 2012

  34. [35]

    Cloud removal in Sentinel-2 imagery using a deep residual neural network and SAR-optical data fusion,

    A. Meraner et al., “Cloud removal in Sentinel-2 imagery using a deep residual neural network and SAR-optical data fusion,” ISPRS J. Photogramm. Remote Sens., vol. 166, pp. 333–346, 2020

  35. [36]

    Coupling model- and data-driven methods for remote sensing image restoration and fusion: Improving physical interpretability,

    H. Shen et al., “Coupling model- and data-driven methods for remote sensing image restoration and fusion: Improving physical interpretability,” IEEE Geosci. Remote Sens. Mag., vol. 10, no. 2, pp. 231–249, 2022

  36. [37]

    A novel multiscale adaptive binning phase congruency feature for SAR and optical image registration,

    J. Fan et al., “A novel multiscale adaptive binning phase congruency feature for SAR and optical image registration,” IEEE Trans. Geosci. Remote Sens., vol. 60, pp. 1–16, 2022

  37. [38]

    Efficient knowledge distillation for remote sensing image classification: a CNN-based approach,

    H. Song, C. Wei, and Z. Yong, “Efficient knowledge distillation for remote sensing image classification: a CNN-based approach,” Int. J. Web Inf. Syst., vol. 20, no. 2, pp. 129–158, 2024

  38. [39]

    A threshold selection method from gray-level histograms,

    N. Otsu, “A threshold selection method from gray-level histograms,” Automatica, vol. 11, no. 285–296, pp. 23–27, 1975

  39. [40]

    Based on Trans-Unet remote sensing image segmentation network model,

    J. Zheng et al., “Based on Trans-Unet remote sensing image segmentation network model,” in 2025 5th Int. Conf. Artif. Intell., Big Data Algorithms (CAIBDA), 2025, pp. 1425–1428

  40. [41]

    MT GAN: A SAR-to-optical image translation method for cloud removal,

    P. Wang et al., “MT GAN: A SAR-to-optical image translation method for cloud removal,” ISPRS J. Photogramm. Remote Sens., vol. 225, pp. 180–195, 2025

  41. [42]

    Reusing discriminators for encoding: Towards unsupervised image-to-image translation,

    R. Chen et al., “Reusing discriminators for encoding: Towards unsupervised image-to-image translation,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2020, pp. 8168–8177

  42. [43]

    TS2O: A token-based two-stage generative architecture for SAR-to-optical translation,

    J. Zhang et al., “TS2O: A token-based two-stage generative architecture for SAR-to-optical translation,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 18, pp. 26949–26960, 2025

  43. [44]

    MsResNet: Multi-scale edge-enhanced ResNet for RGB-T image segmentation,

    B. Adhikari et al., “MsResNet: Multi-scale edge-enhanced ResNet for RGB-T image segmentation,” in Int. Conf. Image Process. Vision Eng., 2025, pp. 59–75