pith. machine review for the scientific record.

arxiv: 2604.22174 · v1 · submitted 2026-04-24 · 💻 cs.CV

Recognition: unknown

Unlocking Optical Prior: Spectrum-Guided Knowledge Transfer for SAR Generalized Category Discovery

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 12:32 UTC · model grok-4.3

classification 💻 cs.CV
keywords Generalized Category Discovery · SAR imagery · Optical prior transfer · Modal Discrepancy Curve · Cross-modal adaptation · Frequency-domain modeling · Domain adaptation · Large vision models

The pith

Modeling cross-modal spectral discrepancies with a Modal Discrepancy Curve allows effective transfer of optical priors to SAR imagery for generalized category discovery.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the challenge of generalized category discovery in SAR data, where labels are scarce and large vision models trained on optical images suffer from imaging incompatibility. It introduces the Modal Discrepancy Curve to represent these differences as a frequency-domain descriptor based on spectral energy distributions. The MDC-guided Cross-modal Prior Transfer framework then uses this curve during pre-training on paired optical-SAR data, with modules that tokenize the curve and refine features band-wise before contrastive alignment. The adapted representations improve single-modal SAR tasks downstream. A sympathetic reader cares because this offers a structured way to leverage abundant optical knowledge for radar analysis without extensive new labeling.
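The review describes the Modal Discrepancy Curve only at the level of "a frequency-domain descriptor based on spectral energy distributions." As a minimal sketch of that idea, assuming a radial band partition of the 2-D power spectrum and a log-ratio comparison between paired images (the partition and the log-ratio form are our illustrative assumptions, not the paper's definition):

```python
import numpy as np

def radial_energy_profile(img, n_bands=16):
    """Mean spectral energy in concentric frequency bands, low to high."""
    F = np.fft.fftshift(np.fft.fft2(img))
    energy = np.abs(F) ** 2
    h, w = img.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h / 2, xx - w / 2)
    r = r / r.max()  # normalize radius to [0, 1]
    edges = np.linspace(0.0, 1.0, n_bands + 1)
    profile = np.empty(n_bands)
    for b in range(n_bands):
        mask = (r >= edges[b]) & (r <= edges[b + 1])
        profile[b] = energy[mask].mean()
    return profile

def discrepancy_curve(optical, sar, n_bands=16):
    """Band-wise log-energy difference between a paired optical and SAR image."""
    return (np.log(radial_energy_profile(optical, n_bands) + 1e-12)
            - np.log(radial_energy_profile(sar, n_bands) + 1e-12))
```

A curve built this way is zero wherever the two modalities carry the same band energy and grows where one modality concentrates energy the other lacks, which is the kind of structured signal the framework reportedly turns into guidance.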

Core claim

The Modal Discrepancy Curve (MDC) models cross-modal discrepancy as a structured frequency-domain descriptor derived from spectral energy distributions. Leveraging this formulation, the MDC-guided Cross-modal Prior Transfer (MCPT) framework operates on paired optical-SAR data, where Adaptive Frequency Tokenization (AFT) converts the MDC into learnable tokens and Frequency-aware Expert Refinement (FER) performs band-wise discrepancy-aware feature refinement. Contrastive learning then aligns refined embeddings across modalities to internalize the adaptation pattern, yielding superior SAR feature representations for downstream single-modal SAR-GCD tasks.
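The claim names three stages: tokenize the curve (AFT), refine features band-wise (FER), then align across modalities contrastively. A toy sketch of that pipeline shape, with hypothetical forms for each stage (per-band embeddings, sigmoid gating, cosine alignment are our stand-ins, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
n_bands, d = 16, 32

# A discrepancy curve: one scalar per frequency band (stand-in for the MDC).
mdc = rng.normal(size=n_bands)

# AFT-like tokenization (hypothetical form): each band value is projected into
# a d-dimensional token via a shared projection plus a per-band embedding.
W_proj = rng.normal(scale=0.1, size=(1, d))
band_embed = rng.normal(scale=0.1, size=(n_bands, d))
tokens = mdc[:, None] @ W_proj + band_embed          # (n_bands, d)

# FER-like refinement (hypothetical form): tokens yield per-band gates that
# modulate band-split features of the SAR branch.
sar_feats = rng.normal(size=(n_bands, d))
gates = 1.0 / (1.0 + np.exp(-tokens.mean(axis=1, keepdims=True)))
refined = gates * sar_feats                          # discrepancy-aware features

# Contrastive alignment signal: cosine similarity between the pooled refined
# SAR embedding and its paired optical embedding.
opt_embed = rng.normal(size=d)
sar_embed = refined.mean(axis=0)
cos = sar_embed @ opt_embed / (np.linalg.norm(sar_embed) * np.linalg.norm(opt_embed))
```

The structural point the sketch preserves: the curve enters only through the tokens, so everything downstream of AFT inherits the frequency-domain inductive bias.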

What carries the argument

The Modal Discrepancy Curve (MDC), a frequency-domain descriptor of spectral energy differences between optical and SAR modalities, which provides the inductive bias to guide tokenization, refinement, and contrastive alignment in the MCPT pre-training framework.

If this is right

  • Superior SAR feature representations become available for single-modal generalized category discovery without labels.
  • State-of-the-art results appear across multiple mainstream SAR datasets.
  • Optical priors from large vision models adapt more effectively to SAR than with existing domain adaptation methods lacking imaging-characteristic bias.
  • Frequency-domain discrepancy modeling supplies a usable inductive bias that reflects physical imaging differences.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The spectrum-guided transfer pattern could extend to other label-scarce modalities such as infrared or multispectral remote sensing.
  • The approach may encourage similar frequency-based discrepancy models for cross-modal adaptation in medical imaging or autonomous driving sensors.
  • Testing whether the internalized adaptation holds on unpaired SAR data or in operational remote-sensing pipelines would be a direct next step.
  • Linking the Modal Discrepancy Curve more explicitly to SAR physical scattering properties could refine the method further.

Load-bearing premise

The Modal Discrepancy Curve derived from spectral energy distributions supplies an inductive bias that accurately captures the incompatibility between optical priors and SAR imaging so that transfer succeeds.

What would settle it

An ablation showing that SAR-GCD accuracy gains vanish when the MDC guidance is removed or replaced by random frequency tokens, or a benchmark showing that performance fails to exceed standard domain adaptation baselines on the same paired-data setup.
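The random-token control above has one subtle design requirement: the random curve should match the real curve's marginal statistics, so that only its band structure is destroyed. A hedged sketch of constructing such a control (names illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

mdc = np.sort(rng.normal(size=16))[::-1]   # stand-in: a structured, monotone curve
control = rng.permutation(mdc)             # same values, band structure destroyed

# Both conditions feed the identical tokenizer/refiner; any accuracy gap between
# runs trained on `mdc` vs `control` is then attributable to the curve's
# structure rather than to its scale or to extra pre-training capacity.
assert np.allclose(np.sort(mdc), np.sort(control))   # marginals preserved
```

Without matched marginals, a null result could simply mean the random tokens had the wrong magnitude, not that frequency guidance is unnecessary.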

Figures

Figures reproduced from arXiv: 2604.22174 by Jingyuan Xia, Ruikang Hu, Xu Lan, Ye Li, Zhejun Lu, Zhixiong Yang.

Figure 1: The Modal Discrepancy Curve (MDC) formulation process.
Figure 2: (a) The overall framework of the proposed MCPT. (b) Construction of the MDC and its discretization into spectral tokens via the AFT module. (c)
Figure 3: Confusion matrices on the long-tail FUSAR dataset with DINOv2.
Figure 4: Parameter tuning studies of the proposed method on the MSTAR dataset with DINOv2. (a) Weight coefficient
Figure 5: Visualization of features extracted by different methods on the MSTAR dataset with DINOv2. (a) GCD. (b) SimGCD. (c) InfoSieve. (d) CMS. (e)
Figure 6: Visualization of MDC for different scenes in the YESeg-OPT-SAR dataset. (a) Farmland. (b) Vegetation. (c) Water. (d) Ship.
Figure 7: Visualization of frequency token heatmaps in ship and building scenes.
Original abstract

Generalized Category Discovery (GCD) holds significant promise for the label-scarce Synthetic Aperture Radar (SAR) domain, yet its efficacy is severely constrained by the cross-modal incompatibility between the inherent optical prior of the Large Vision Models (LVMs) and SAR imagery. Existing domain adaptation methods often lack an inductive bias that reflects imaging characteristics, consequently failing to effectively transfer optical prior into the SAR domain. To address this issue, the Modal Discrepancy Curve (MDC) is introduced to model cross-modal discrepancy as a structured frequency-domain descriptor derived from spectral energy distributions. Leveraging this formulation, we propose the MDC-guided Cross-modal Prior Transfer (MCPT) framework, a pre-training paradigm that operates on paired optical-SAR data. Within this framework, Adaptive Frequency Tokenization (AFT) converts the MDC into learnable tokens, and Frequency-aware Expert Refinement (FER) performs band-wise discrepancy-aware feature refinement using these tokens. Based on the refined representations, contrastive learning aligns refined embeddings across modalities and internalizes the adaptation pattern. Ultimately, the superior SAR feature representation capability learned during paired pre-training is applied to downstream single-modal SAR-GCD tasks. Extensive experiments demonstrate state-of-the-art performance across multiple mainstream datasets, indicating that frequency-domain discrepancy modeling enables more effective adaptation of optical prior to SAR imagery.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper introduces the Modal Discrepancy Curve (MDC) derived from spectral energy distributions to model cross-modal incompatibility between optical large vision model priors and SAR imagery. It proposes the MDC-guided Cross-modal Prior Transfer (MCPT) framework, which uses Adaptive Frequency Tokenization (AFT) to convert MDC into learnable tokens and Frequency-aware Expert Refinement (FER) for band-wise feature refinement, followed by contrastive alignment on paired optical-SAR data. The adapted representations are then applied to downstream single-modal SAR Generalized Category Discovery, with claims of state-of-the-art performance on mainstream datasets.

Significance. If the results hold, the work offers a novel frequency-domain approach to bridge the optical-SAR modality gap for label-scarce GCD tasks, potentially advancing remote sensing applications by internalizing adaptation patterns via pre-training. The introduction of MDC as a structured descriptor, along with AFT and FER, provides a creative inductive bias that could generalize beyond the specific setting if its advantages over alternatives are demonstrated.

major comments (2)
  1. [Method] Method section (MDC formulation and justification): The paper positions the Modal Discrepancy Curve, derived from spectral energy distributions, as the key inductive bias enabling effective transfer via AFT and FER. However, no derivation or comparison is provided showing why this spectral-energy-based measure outperforms alternatives such as phase-based, wavelet-based, or learned spatial discrepancy measures. This is load-bearing for the central claim, as gains could arise from contrastive pre-training or paired data rather than the frequency-guided mechanism.
  2. [Experiments] Experiments section: The abstract asserts SOTA results across mainstream datasets, but the manuscript requires explicit quantitative tables, ablation studies isolating AFT and FER contributions, baseline comparisons, dataset details, and error bars to allow verification of the performance claims and the role of MDC.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript to provide stronger justification for the MDC and more complete experimental reporting.

Point-by-point responses
  1. Referee: [Method] Method section (MDC formulation and justification): The paper positions the Modal Discrepancy Curve, derived from spectral energy distributions, as the key inductive bias enabling effective transfer via AFT and FER. However, no derivation or comparison is provided showing why this spectral-energy-based measure outperforms alternatives such as phase-based, wavelet-based, or learned spatial discrepancy measures. This is load-bearing for the central claim, as gains could arise from contrastive pre-training or paired data rather than the frequency-guided mechanism.

    Authors: We agree that the original manuscript would benefit from an explicit derivation and direct comparisons. The MDC is motivated by the physical properties of SAR imaging, where spectral energy distributions reflect radar-specific scattering behaviors that differ from optical imagery. In the revision we will add a mathematical derivation of the MDC from spectral energy and include ablation studies comparing it to phase-based, wavelet-based, and learned spatial discrepancy measures. These additions will help isolate the contribution of the frequency-guided mechanism from contrastive pre-training and paired data alone. revision: yes

  2. Referee: [Experiments] Experiments section: The abstract asserts SOTA results across mainstream datasets, but the manuscript requires explicit quantitative tables, ablation studies isolating AFT and FER contributions, baseline comparisons, dataset details, and error bars to allow verification of the performance claims and the role of MDC.

    Authors: We concur that the experimental section requires expansion for full verifiability. The revised manuscript will incorporate explicit quantitative tables reporting SOTA results on the mainstream datasets, ablation studies that isolate the individual contributions of AFT and FER, comprehensive baseline comparisons, detailed dataset descriptions, and error bars computed over multiple runs. These changes will enable readers to assess both the performance claims and the specific role of the MDC-guided components. revision: yes

Circularity Check

0 steps flagged

No load-bearing circularity; the MDC and MCPT remain independent of self-definition or fitted inputs.

Full rationale

The provided abstract and context introduce the Modal Discrepancy Curve (MDC) as a new frequency-domain descriptor derived from spectral energy distributions, then build the MCPT framework around AFT tokenization, FER refinement, and contrastive alignment before downstream transfer. No equations, derivations, or self-citations appear that would reduce the claimed inductive bias, adaptation superiority, or SOTA performance to tautological inputs by construction. The central premise posits the utility of spectral-energy MDC without exhibiting a reduction to prior fitted parameters or self-referential definitions. This matches the reader's assessment that no abstract-level equations enable tautological reduction, yielding only minor (non-load-bearing) circularity risk at most.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 3 invented entities

The central claim rests on the domain assumption that frequency-domain spectral discrepancy can be turned into an effective inductive bias for optical-to-SAR transfer; three new procedural entities (MDC, AFT, FER) are introduced without independent falsifiable evidence beyond the SOTA claim.

axioms (1)
  • domain assumption Cross-modal incompatibility between optical LVM priors and SAR imagery can be modeled as a structured frequency-domain descriptor derived from spectral energy distributions.
    Invoked as the core modeling choice that existing adaptation methods lack.
invented entities (3)
  • Modal Discrepancy Curve (MDC) no independent evidence
    purpose: Structured frequency-domain descriptor of cross-modal discrepancy
    Newly defined in the paper to guide adaptation
  • Adaptive Frequency Tokenization (AFT) no independent evidence
    purpose: Conversion of MDC into learnable tokens for pre-training
    Component of the proposed MCPT framework
  • Frequency-aware Expert Refinement (FER) no independent evidence
    purpose: Band-wise discrepancy-aware feature refinement using the tokens
    Component of the proposed MCPT framework

pith-pipeline@v0.9.0 · 5549 in / 1362 out tokens · 62350 ms · 2026-05-08T12:32:40.672833+00:00 · methodology

