pith. machine review for the scientific record.

arxiv: 2604.22174 · v1 · submitted 2026-04-24 · 💻 cs.CV

Recognition: unknown

Unlocking Optical Prior: Spectrum-Guided Knowledge Transfer for SAR Generalized Category Discovery

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 12:32 UTC · model grok-4.3

classification 💻 cs.CV
keywords Generalized Category Discovery · SAR imagery · Optical prior transfer · Modal Discrepancy Curve · Cross-modal adaptation · Frequency-domain modeling · Domain adaptation · Large vision models

The pith

Modeling cross-modal spectral discrepancies with a Modal Discrepancy Curve allows effective transfer of optical priors to SAR imagery for generalized category discovery.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the challenge of generalized category discovery in SAR data, where labels are scarce and large vision models trained on optical images suffer from imaging incompatibility. It introduces the Modal Discrepancy Curve to represent these differences as a frequency-domain descriptor based on spectral energy distributions. The MDC-guided Cross-modal Prior Transfer framework then uses this curve during pre-training on paired optical-SAR data, with modules that tokenize the curve and refine features band-wise before contrastive alignment. The adapted representations improve single-modal SAR tasks downstream. A sympathetic reader cares because this offers a structured way to leverage abundant optical knowledge for radar analysis without extensive new labeling.
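The review describes the Modal Discrepancy Curve only at the level of "a frequency-domain descriptor based on spectral energy distributions." As a minimal sketch of that idea, assuming a radial band partition of the 2-D power spectrum and a log-ratio comparison between paired images (the partition and the log-ratio form are our illustrative assumptions, not the paper's definition):

```python
import numpy as np

def radial_energy_profile(img, n_bands=16):
    """Mean spectral energy in concentric frequency bands, low to high."""
    F = np.fft.fftshift(np.fft.fft2(img))
    energy = np.abs(F) ** 2
    h, w = img.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h / 2, xx - w / 2)
    r = r / r.max()  # normalize radius to [0, 1]
    edges = np.linspace(0.0, 1.0, n_bands + 1)
    profile = np.empty(n_bands)
    for b in range(n_bands):
        mask = (r >= edges[b]) & (r <= edges[b + 1])
        profile[b] = energy[mask].mean()
    return profile

def discrepancy_curve(optical, sar, n_bands=16):
    """Band-wise log-energy difference between a paired optical and SAR image."""
    return (np.log(radial_energy_profile(optical, n_bands) + 1e-12)
            - np.log(radial_energy_profile(sar, n_bands) + 1e-12))
```

A curve built this way is zero wherever the two modalities carry the same band energy and grows where one modality concentrates energy the other lacks, which is the kind of structured signal the framework reportedly turns into guidance.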

Core claim

The Modal Discrepancy Curve (MDC) models cross-modal discrepancy as a structured frequency-domain descriptor derived from spectral energy distributions. Leveraging this formulation, the MDC-guided Cross-modal Prior Transfer (MCPT) framework operates on paired optical-SAR data, where Adaptive Frequency Tokenization (AFT) converts the MDC into learnable tokens and Frequency-aware Expert Refinement (FER) performs band-wise discrepancy-aware feature refinement. Contrastive learning then aligns refined embeddings across modalities to internalize the adaptation pattern, yielding superior SAR feature representations for downstream single-modal SAR-GCD tasks.
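The claim names three stages: tokenize the curve (AFT), refine features band-wise (FER), then align across modalities contrastively. A toy sketch of that pipeline shape, with hypothetical forms for each stage (per-band embeddings, sigmoid gating, cosine alignment are our stand-ins, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
n_bands, d = 16, 32

# A discrepancy curve: one scalar per frequency band (stand-in for the MDC).
mdc = rng.normal(size=n_bands)

# AFT-like tokenization (hypothetical form): each band value is projected into
# a d-dimensional token via a shared projection plus a per-band embedding.
W_proj = rng.normal(scale=0.1, size=(1, d))
band_embed = rng.normal(scale=0.1, size=(n_bands, d))
tokens = mdc[:, None] @ W_proj + band_embed          # (n_bands, d)

# FER-like refinement (hypothetical form): tokens yield per-band gates that
# modulate band-split features of the SAR branch.
sar_feats = rng.normal(size=(n_bands, d))
gates = 1.0 / (1.0 + np.exp(-tokens.mean(axis=1, keepdims=True)))
refined = gates * sar_feats                          # discrepancy-aware features

# Contrastive alignment signal: cosine similarity between the pooled refined
# SAR embedding and its paired optical embedding.
opt_embed = rng.normal(size=d)
sar_embed = refined.mean(axis=0)
cos = sar_embed @ opt_embed / (np.linalg.norm(sar_embed) * np.linalg.norm(opt_embed))
```

The structural point the sketch preserves: the curve enters only through the tokens, so everything downstream of AFT inherits the frequency-domain inductive bias.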

What carries the argument

The Modal Discrepancy Curve (MDC), a frequency-domain descriptor of spectral energy differences between optical and SAR modalities, which provides the inductive bias to guide tokenization, refinement, and contrastive alignment in the MCPT pre-training framework.

If this is right

  • Superior SAR feature representations become available for single-modal generalized category discovery without labels.
  • State-of-the-art results appear across multiple mainstream SAR datasets.
  • Optical priors from large vision models adapt more effectively to SAR than with existing domain adaptation methods lacking imaging-characteristic bias.
  • Frequency-domain discrepancy modeling supplies a usable inductive bias that reflects physical imaging differences.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The spectrum-guided transfer pattern could extend to other label-scarce modalities such as infrared or multispectral remote sensing.
  • The approach may encourage similar frequency-based discrepancy models for cross-modal adaptation in medical imaging or autonomous driving sensors.
  • Testing whether the internalized adaptation holds on unpaired SAR data or in operational remote-sensing pipelines would be a direct next step.
  • Linking the Modal Discrepancy Curve more explicitly to SAR physical scattering properties could refine the method further.

Load-bearing premise

The Modal Discrepancy Curve derived from spectral energy distributions supplies an inductive bias that accurately captures the incompatibility between optical priors and SAR imaging so that transfer succeeds.

What would settle it

An ablation showing that SAR-GCD accuracy gains vanish when the MDC guidance is removed or replaced by random frequency tokens, or a benchmark showing that performance fails to exceed standard domain adaptation baselines on the same paired-data setup.
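The random-token control above has one subtle design requirement: the random curve should match the real curve's marginal statistics, so that only its band structure is destroyed. A hedged sketch of constructing such a control (names illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

mdc = np.sort(rng.normal(size=16))[::-1]   # stand-in: a structured, monotone curve
control = rng.permutation(mdc)             # same values, band structure destroyed

# Both conditions feed the identical tokenizer/refiner; any accuracy gap between
# runs trained on `mdc` vs `control` is then attributable to the curve's
# structure rather than to its scale or to extra pre-training capacity.
assert np.allclose(np.sort(mdc), np.sort(control))   # marginals preserved
```

Without matched marginals, a null result could simply mean the random tokens had the wrong magnitude, not that frequency guidance is unnecessary.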

Figures

Figures reproduced from arXiv: 2604.22174 by Jingyuan Xia, Ruikang Hu, Xu Lan, Ye Li, Zhejun Lu, Zhixiong Yang.

Figure 1: The Modal Discrepancy Curve (MDC) formulation process.
Figure 2: (a) The overall framework of the proposed MCPT. (b) Construction of the MDC and its discretization into spectral tokens via the AFT module. (c)
Figure 3: Confusion matrices on the long-tail FUSAR dataset with DINOv2.
Figure 4: Parameter tuning studies of the proposed method on the MSTAR dataset with DINOv2. (a) Weight coefficient
Figure 5: Visualization of features extracted by different methods on the MSTAR dataset with DINOv2. (a) GCD. (b) SimGCD. (c) InfoSieve. (d) CMS. (e)
Figure 6: Visualization of MDC for different scenes in the YESeg-OPT-SAR dataset. (a) Farmland. (b) Vegetation. (c) Water. (d) Ship.
Figure 7: Visualization of frequency token heatmaps in ship and building scenes.
Original abstract

Generalized Category Discovery (GCD) holds significant promise for the label-scarce Synthetic Aperture Radar (SAR) domain, yet its efficacy is severely constrained by the cross-modal incompatibility between the inherent optical prior of the Large Vision Models (LVMs) and SAR imagery. Existing domain adaptation methods often lack an inductive bias that reflects imaging characteristics, consequently failing to effectively transfer optical prior into the SAR domain. To address this issue, the Modal Discrepancy Curve (MDC) is introduced to model cross-modal discrepancy as a structured frequency-domain descriptor derived from spectral energy distributions. Leveraging this formulation, we propose the MDC-guided Cross-modal Prior Transfer (MCPT) framework, a pre-training paradigm that operates on paired optical-SAR data. Within this framework, Adaptive Frequency Tokenization (AFT) converts the MDC into learnable tokens, and Frequency-aware Expert Refinement (FER) performs band-wise discrepancy-aware feature refinement using these tokens. Based on the refined representations, contrastive learning aligns refined embeddings across modalities and internalizes the adaptation pattern. Ultimately, the superior SAR feature representation capability learned during paired pre-training is applied to downstream single-modal SAR-GCD tasks. Extensive experiments demonstrate state-of-the-art performance across multiple mainstream datasets, indicating that frequency-domain discrepancy modeling enables more effective adaptation of optical prior to SAR imagery.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper introduces the Modal Discrepancy Curve (MDC) derived from spectral energy distributions to model cross-modal incompatibility between optical large vision model priors and SAR imagery. It proposes the MDC-guided Cross-modal Prior Transfer (MCPT) framework, which uses Adaptive Frequency Tokenization (AFT) to convert MDC into learnable tokens and Frequency-aware Expert Refinement (FER) for band-wise feature refinement, followed by contrastive alignment on paired optical-SAR data. The adapted representations are then applied to downstream single-modal SAR Generalized Category Discovery, with claims of state-of-the-art performance on mainstream datasets.

Significance. If the results hold, the work offers a novel frequency-domain approach to bridge the optical-SAR modality gap for label-scarce GCD tasks, potentially advancing remote sensing applications by internalizing adaptation patterns via pre-training. The introduction of MDC as a structured descriptor, along with AFT and FER, provides a creative inductive bias that could generalize beyond the specific setting if its advantages over alternatives are demonstrated.

major comments (2)
  1. [Method] Method section (MDC formulation and justification): The paper positions the Modal Discrepancy Curve, derived from spectral energy distributions, as the key inductive bias enabling effective transfer via AFT and FER. However, no derivation or comparison is provided showing why this spectral-energy-based measure outperforms alternatives such as phase-based, wavelet-based, or learned spatial discrepancy measures. This is load-bearing for the central claim, as gains could arise from contrastive pre-training or paired data rather than the frequency-guided mechanism.
  2. [Experiments] Experiments section: The abstract asserts SOTA results across mainstream datasets, but the manuscript requires explicit quantitative tables, ablation studies isolating AFT and FER contributions, baseline comparisons, dataset details, and error bars to allow verification of the performance claims and the role of MDC.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript to provide stronger justification for the MDC and more complete experimental reporting.

Point-by-point responses
  1. Referee: [Method] Method section (MDC formulation and justification): The paper positions the Modal Discrepancy Curve, derived from spectral energy distributions, as the key inductive bias enabling effective transfer via AFT and FER. However, no derivation or comparison is provided showing why this spectral-energy-based measure outperforms alternatives such as phase-based, wavelet-based, or learned spatial discrepancy measures. This is load-bearing for the central claim, as gains could arise from contrastive pre-training or paired data rather than the frequency-guided mechanism.

    Authors: We agree that the original manuscript would benefit from an explicit derivation and direct comparisons. The MDC is motivated by the physical properties of SAR imaging, where spectral energy distributions reflect radar-specific scattering behaviors that differ from optical imagery. In the revision we will add a mathematical derivation of the MDC from spectral energy and include ablation studies comparing it to phase-based, wavelet-based, and learned spatial discrepancy measures. These additions will help isolate the contribution of the frequency-guided mechanism from contrastive pre-training and paired data alone. revision: yes

  2. Referee: [Experiments] Experiments section: The abstract asserts SOTA results across mainstream datasets, but the manuscript requires explicit quantitative tables, ablation studies isolating AFT and FER contributions, baseline comparisons, dataset details, and error bars to allow verification of the performance claims and the role of MDC.

    Authors: We concur that the experimental section requires expansion for full verifiability. The revised manuscript will incorporate explicit quantitative tables reporting SOTA results on the mainstream datasets, ablation studies that isolate the individual contributions of AFT and FER, comprehensive baseline comparisons, detailed dataset descriptions, and error bars computed over multiple runs. These changes will enable readers to assess both the performance claims and the specific role of the MDC-guided components. revision: yes

Circularity Check

0 steps flagged

No load-bearing circularity; the MDC and MCPT remain independent of self-definition or fitted inputs.

Full rationale

The provided abstract and context introduce the Modal Discrepancy Curve (MDC) as a new frequency-domain descriptor derived from spectral energy distributions, then build the MCPT framework around AFT tokenization, FER refinement, and contrastive alignment before downstream transfer. No equations, derivations, or self-citations appear that would reduce the claimed inductive bias, adaptation superiority, or SOTA performance to tautological inputs by construction. The central premise posits the utility of spectral-energy MDC without exhibiting a reduction to prior fitted parameters or self-referential definitions. This matches the reader's assessment that no abstract-level equations enable tautological reduction, yielding only minor (non-load-bearing) circularity risk at most.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 3 invented entities

The central claim rests on the domain assumption that frequency-domain spectral discrepancy can be turned into an effective inductive bias for optical-to-SAR transfer; three new procedural entities (MDC, AFT, FER) are introduced without independent falsifiable evidence beyond the SOTA claim.

axioms (1)
  • domain assumption Cross-modal incompatibility between optical LVM priors and SAR imagery can be modeled as a structured frequency-domain descriptor derived from spectral energy distributions.
    Invoked as the core modeling choice that existing adaptation methods lack.
invented entities (3)
  • Modal Discrepancy Curve (MDC) no independent evidence
    purpose: Structured frequency-domain descriptor of cross-modal discrepancy
    Newly defined in the paper to guide adaptation
  • Adaptive Frequency Tokenization (AFT) no independent evidence
    purpose: Conversion of MDC into learnable tokens for pre-training
    Component of the proposed MCPT framework
  • Frequency-aware Expert Refinement (FER) no independent evidence
    purpose: Band-wise discrepancy-aware feature refinement using the tokens
    Component of the proposed MCPT framework

pith-pipeline@v0.9.0 · 5549 in / 1362 out tokens · 62350 ms · 2026-05-08T12:32:40.672833+00:00 · methodology

