Physics-Driven Semantic Scattering Structure Understanding of Aircraft Target in SAR Images

Hao Shi; Liang Chen; Wei Li; Xiaogang Yu; Yifei Yin

arxiv: 2606.06847 · v1 · pith:3XNGKTS5new · submitted 2026-06-05 · 📡 eess.IV · cs.CV

Pith reviewed 2026-06-27 20:50 UTC · model grok-4.3

keywords: SAR · aircraft target · semantic scattering · structure understanding · physics-driven · keypoints · SAR images · topology reconstruction

The pith

Semantic scattering keypoints tied to aircraft parts recover complete structures in SAR images via physical priors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Semantic Scattering Structure Understanding as a new paradigm for interpreting aircraft targets in SAR images. It defines semantic scattering keypoints that link local electromagnetic responses to specific, physically meaningful aircraft components and adds visibility-aware attributes to keep weakly scattering but real parts. These keypoints form a stable structure guided by priors on scattering heterogeneity, rigid-body topology, and speckle uncertainty. The resulting S3U-SAR framework localizes the keypoints and builds the full representation, outperforming prior local scattering-center methods on a new benchmark. Traditional unordered centers often miss weak components and yield incomplete topology, so anchoring responses to semantics aims to fix that instability.

Core claim

We establish Semantic Scattering Structure Understanding as a new paradigm for SAR aircraft interpretation. Semantic scattering keypoints are defined to associate local electromagnetic responses with physically meaningful aircraft components, while visibility-aware attributes are introduced to retain weakly observable yet physically existing components. The keypoints are further organized into a stable semantic scattering structure. Building upon this, we propose S3U-SAR, a physics-driven framework to localize semantic scattering keypoints and construct the complete representation constrained by multi-dimensional physical priors containing scattering heterogeneity, rigid-body topology, and speckle uncertainty.

What carries the argument

Semantic scattering keypoints that associate local electromagnetic responses with physically meaningful aircraft components and are organized into a stable semantic scattering structure under multi-dimensional physical priors.
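
As a rough illustration of what such a representation might hold, the sketch below models keypoints with a visibility attribute and a list of rigid-body links. Everything in it is an assumption made for illustration: the component names, the two-level `Visibility` flag, the coordinates, and the edge list are hypothetical and are not taken from the paper, whose actual keypoint set and topology are defined in its own figures.

```python
from dataclasses import dataclass
from enum import IntEnum


class Visibility(IntEnum):
    """Hypothetical visibility levels; the paper's encoding may differ."""
    WEAK = 1    # component physically exists but scatters weakly
    STRONG = 2  # component produces a clear scattering response


@dataclass(frozen=True)
class Keypoint:
    name: str               # illustrative component label, not the paper's
    x: float                # column in image pixels
    y: float                # row in image pixels
    visibility: Visibility


@dataclass
class ScatteringStructure:
    keypoints: dict         # component name -> Keypoint
    edges: list             # (name_a, name_b) rigid-body links

    def weak_components(self):
        """Names of components kept despite a weak response."""
        return sorted(k.name for k in self.keypoints.values()
                      if k.visibility == Visibility.WEAK)

    def edges_are_valid(self):
        """True if every link refers to keypoints that are present."""
        return all(a in self.keypoints and b in self.keypoints
                   for a, b in self.edges)


# Hypothetical example: four components, one of them weakly scattering.
structure = ScatteringStructure(
    keypoints={kp.name: kp for kp in [
        Keypoint("nose", 64.0, 10.0, Visibility.STRONG),
        Keypoint("fuselage_center", 64.0, 60.0, Visibility.STRONG),
        Keypoint("left_wingtip", 12.0, 70.0, Visibility.WEAK),
        Keypoint("tail", 64.0, 120.0, Visibility.STRONG),
    ]},
    edges=[("nose", "fuselage_center"),
           ("fuselage_center", "left_wingtip"),
           ("fuselage_center", "tail")],
)
```

The point of the sketch is only that a weak component stays in the structure with a flag rather than being dropped, which is the behavior the summary attributes to visibility-aware attributes.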

If this is right

S3U-SAR achieves the best performance compared with baselines on the KP-SAR-Aircraft-1.0 benchmark.
Cross-category and cross-dataset evaluations confirm robustness and transferability of the semantic structure approach.
The complete representation retains physically existing weak-scattering components that local scattering-center methods miss.
A confidence-gated joint supervision strategy alleviates optimization conflicts during keypoint localization and structure construction.
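
The abstract does not say how the confidence gate works, so the following is only one plausible reading, written as a minimal sketch. It assumes a localization loss plus per-link structural losses, and it switches each structural term on only when the confidence of that link's endpoints clears a threshold. The function name, the threshold, and the weighting are all hypothetical.

```python
def gated_joint_loss(loc_loss, edge_losses, edge_confidences,
                     threshold=0.5, weight=1.0):
    """Localization loss plus structural terms gated by confidence.

    Assumed scheme (not from the paper): a link contributes to the
    structural loss only if its confidence is at least `threshold`,
    so unreliable keypoints do not pull the topology term around.
    """
    if not edge_losses:
        return loc_loss
    gated = [loss for loss, conf in zip(edge_losses, edge_confidences)
             if conf >= threshold]
    structural = sum(gated) / len(edge_losses)
    return loc_loss + weight * structural


# Two links: the first is confident and counts, the second is gated out.
total = gated_joint_loss(1.0, [0.5, 1.5], [0.9, 0.2])
```

Under this reading, the structural constraint has little effect early in training when confidences are low, and takes effect once localization stabilizes; whether the paper's strategy behaves this way cannot be confirmed from the abstract.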

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same semantic-keypoint approach could be tested on other rigid targets such as ships or ground vehicles if analogous physical priors are available.
The new KP-SAR-Aircraft-1.0 benchmark supplies a standard for comparing future component-level SAR interpretation methods.
Visibility-aware attributes open a route to handling variable observation angles or partial occlusions in operational SAR systems.

Load-bearing premise

Multi-dimensional physical priors on scattering heterogeneity, rigid-body topology and speckle uncertainty together with visibility-aware attributes are enough to recover complete topology without missing real weak-scattering parts or adding false structure.

What would settle it

A controlled SAR experiment in which S3U-SAR either omits a documented physically present weak-scattering component or fabricates a nonexistent structural link would falsify the claim that the priors suffice.
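
The two failure modes named here, an omitted component and a fabricated link, can be checked mechanically once a predicted structure and a documented ground truth are available. The sketch below shows one way to do that with set differences; the function name and the component labels are hypothetical, and links are treated as undirected.

```python
def audit_structure(pred_components, pred_links, true_components, true_links):
    """Return (omitted components, fabricated links) for one prediction."""
    def undirected(links):
        # Treat ("a", "b") and ("b", "a") as the same link.
        return {frozenset(link) for link in links}

    omitted = sorted(set(true_components) - set(pred_components))
    extra = undirected(pred_links) - undirected(true_links)
    fabricated = sorted(tuple(sorted(link)) for link in extra)
    return omitted, fabricated


# Hypothetical case: the weak wingtip is missed and a nose-tail link is invented.
omitted, fabricated = audit_structure(
    pred_components={"nose", "fuselage", "tail"},
    pred_links=[("fuselage", "nose"), ("fuselage", "tail"), ("nose", "tail")],
    true_components={"nose", "fuselage", "tail", "left_wingtip"},
    true_links=[("nose", "fuselage"), ("fuselage", "tail"),
                ("fuselage", "left_wingtip")],
)
```

A non-empty result in either list on a controlled acquisition would be the kind of counterexample this section describes.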

Figures

Figures reproduced from arXiv: 2606.06847 by Hao Shi, Liang Chen, Wei Li, Xiaogang Yu, Yifei Yin.

**Figure 2.** Definition of the selected semantic scattering keypoints, including …

**Figure 5.** Illustration of the physics-constrained structural topology defined on …

**Figure 4.** Examples of visibility-aware semantic scattering keypoints in SAR …

**Figure 6.** Overall framework of the proposed S3U-SAR, including high-resolution feature extraction, 10-channel heatmap prediction, spatial-aware Softmax, coordinate expectation, semantic scattering keypoint localization, and physics-aware joint supervision.
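
The Figure 6 caption lists a Softmax over heatmaps followed by a coordinate expectation. A common way to combine those two steps is soft-argmax (also called integral regression): normalize each heatmap into a probability map, then take the probability-weighted mean of the pixel coordinates. The sketch below shows that generic technique in plain Python. It is an assumption that S3U-SAR works this way; the "spatial-aware" part of the paper's Softmax is not described in the material here and is not modeled, and the `temperature` parameter is illustrative.

```python
import math


def soft_argmax_2d(heatmap, temperature=1.0):
    """Spatial softmax over a 2D score map, then the expected (x, y)."""
    peak = max(max(row) for row in heatmap)  # subtracted for numerical stability
    weights = [[math.exp((v - peak) / temperature) for v in row]
               for row in heatmap]
    total = sum(sum(row) for row in weights)
    # Expected column index (x) and row index (y) under the softmax weights.
    x = sum(w * j for row in weights for j, w in enumerate(row)) / total
    y = sum(w * i for i, row in enumerate(weights) for w in row) / total
    return x, y


# One heatmap per keypoint channel; two tiny channels shown here.
heatmaps = [
    [[0.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 0.0]],  # peak at center
    [[0.0, 0.0, 50.0], [0.0, 0.0, 0.0]],                  # peak at top right
]
coords = [soft_argmax_2d(h) for h in heatmaps]
```

Because the expectation is differentiable, coordinates obtained this way can be supervised directly, which is the usual reason for preferring it over a hard argmax on the heatmap.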

**Figure 7.** Illustration of the scattering-intensity heterogeneity-aware localization …

**Figure 8.** Illustration of the confidence-gated joint supervision strategy for …

**Figure 9.** Qualitative comparison of semantic scattering keypoint localization results of different methods. (a) GT. (b) ViTPose. (c) SimCC. (d) DiffusionPose.

**Figure 11.** Visualization of the topological constraint. (a) GT. (b) w/o …

**Figure 10.** Visualization of heatmap responses for scattering-intensity …

**Figure 12.** Training dynamics of the confidence-gated joint supervision strategy.

**Figure 14.** Qualitative results under cross-dataset evaluation. (a) A220. (b) A320.

Original abstract

Synthetic aperture radar (SAR) has become indispensable for target interpretation owing to its all-day and all-weather observation capability. In SAR target interpretation, electromagnetic scattering information provides a physically grounded cue beyond visual texture and has been widely exploited. However, existing methods remain dominated by local scattering center representations. Such unordered and component-agnostic representations are highly unstable for aircraft targets. As a result, physically existing components with weak scattering responses are often missed, resulting in an incompletely reconstructed topology structure. To address this limitation, we establish Semantic Scattering Structure Understanding as a new paradigm for SAR aircraft interpretation. Semantic scattering keypoints are defined to associate local electromagnetic responses with physically meaningful aircraft components, while visibility-aware attributes are introduced to retain weakly observable yet physically existing components. The keypoints are further organized into a stable semantic scattering structure. Building upon this, we propose S3U-SAR, a physics-driven framework to localize semantic scattering keypoints and construct the complete representation constrained by multi-dimensional physical priors containing scattering heterogeneity, rigid-body topology, and speckle uncertainty. A confidence-gated joint supervision strategy is further introduced to alleviate optimization conflicts. We construct KP-SAR-Aircraft-1.0, the first fine-grained benchmark for semantic scattering structure understanding. Extensive experiments demonstrate that S3U-SAR achieves the best performance compared with baselines. Cross-category and cross-dataset evaluations further verify its robustness and transferability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The abstract outlines a shift to semantic keypoints for SAR aircraft, but without methods or results the performance claims stay untested.

The core idea is to replace unordered local scattering centers with semantic keypoints tied to actual aircraft parts, plus visibility attributes that keep weak but real components in the picture. That directly targets the instability problem the abstract flags for aircraft targets.

What stands out as new is the framing of a full semantic scattering structure, the multi-prior constraints (heterogeneity, rigid topology, speckle), and the new KP-SAR-Aircraft-1.0 benchmark. The confidence-gated supervision is also a concrete mechanism they introduce to handle conflicting losses.

The paper does a clean job naming the practical failure mode—missed weak scatterers leading to broken topology—and linking it to electromagnetic priors that already exist in the SAR literature. That part feels grounded.

The soft spots are straightforward: the abstract gives no equations for how the priors are turned into constraints, no definition of the keypoints themselves, and no numbers or ablation tables. Without those, the claim that S3U-SAR beats baselines and transfers across datasets cannot be checked. The circularity risk is real until the full derivations and data splits appear.

This is aimed at the SAR target recognition community, especially groups already using component-aware or physics-informed models. A reader working on aircraft classification or structural reconstruction would get the most out of it.

It deserves a serious referee. The problem is real and the proposed direction is coherent on paper; the details will decide whether the priors actually deliver stable topology or just add another fitting layer. Send it out so the experiments and implementation can be examined.

Referee Report

3 major / 1 minor

Summary. The paper introduces Semantic Scattering Structure Understanding (S3U) as a new paradigm for aircraft target interpretation in SAR images. It defines semantic scattering keypoints to associate local electromagnetic responses with physically meaningful aircraft components and introduces visibility-aware attributes to retain weakly observable but physically existing components. These are organized into a stable semantic scattering structure. The proposed S3U-SAR framework localizes these keypoints and constructs the representation under constraints from multi-dimensional physical priors (scattering heterogeneity, rigid-body topology, speckle uncertainty) plus a confidence-gated joint supervision strategy. A new benchmark KP-SAR-Aircraft-1.0 is constructed, and the work claims that S3U-SAR outperforms baselines with verified robustness and transferability via cross-category and cross-dataset evaluations.

Significance. If the central claims hold with supporting derivations and experiments, the shift from unordered local scattering centers to semantically grounded and physically constrained keypoints could improve stability and completeness in SAR aircraft topology reconstruction. The creation of a dedicated fine-grained benchmark would also be a useful contribution to the SAR interpretation community.

major comments (3)

[Abstract] The central performance claim that 'S3U-SAR achieves the best performance compared with baselines' is unsupported by any metrics, tables, ablation studies, or error analysis in the provided manuscript text, which prevents assessment of whether the physical priors actually resolve the stated instability in local scattering center representations.
[Abstract] No equations, definitions, or implementation details are supplied for the semantic scattering keypoints, visibility-aware attributes, or the multi-dimensional physical priors (scattering heterogeneity, rigid-body topology, speckle uncertainty), making it impossible to evaluate whether these constraints are load-bearing, parameter-free, or free of circularity in recovering complete topology.
[Abstract] The construction of KP-SAR-Aircraft-1.0 is presented as enabling cross-category and cross-dataset evaluations, but no information is given on dataset size, annotation protocol, or how the benchmark avoids the very component-agnostic issues criticized in prior methods.

minor comments (1)

[Abstract] The acronym S3U-SAR is used without an explicit expansion on first appearance in the abstract.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed review and constructive comments on the abstract. We will revise the abstract to incorporate supporting details from the full manuscript. Point-by-point responses follow.

Referee: [Abstract] The central performance claim that 'S3U-SAR achieves the best performance compared with baselines' is unsupported by any metrics, tables, ablation studies, or error analysis in the provided manuscript text, which prevents assessment of whether the physical priors actually resolve the stated instability in local scattering center representations.

Authors: We agree the abstract would be strengthened by including key quantitative results. The full manuscript contains these in the Experiments section, with tables reporting performance metrics against baselines, ablation studies on the physical priors, and error analysis showing improved stability. We will revise the abstract to reference these outcomes and briefly note the demonstrated gains.
Revision: yes

Referee: [Abstract] No equations, definitions, or implementation details are supplied for the semantic scattering keypoints, visibility-aware attributes, or the multi-dimensional physical priors (scattering heterogeneity, rigid-body topology, speckle uncertainty), making it impossible to evaluate whether these constraints are load-bearing, parameter-free, or free of circularity in recovering complete topology.

Authors: The abstract is intentionally concise and omits equations, which are instead provided with full definitions in Section 3. To aid evaluation without expanding the abstract excessively, we will add one-sentence characterizations of the keypoints, visibility-aware attributes, and the three physical priors.
Revision: yes

Referee: [Abstract] The construction of KP-SAR-Aircraft-1.0 is presented as enabling cross-category and cross-dataset evaluations, but no information is given on dataset size, annotation protocol, or how the benchmark avoids the very component-agnostic issues criticized in prior methods.

Authors: The manuscript details the benchmark construction, size, annotation protocol, and its design to enforce component-level semantics in a dedicated section. We will revise the abstract to include a brief statement on dataset scale and how the annotation protocol directly mitigates component-agnostic limitations.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity

The supplied manuscript consists solely of an abstract with no equations, derivations, or explicit mathematical steps. The central claims introduce semantic scattering keypoints and physics-driven constraints (scattering heterogeneity, rigid-body topology, speckle uncertainty) as a new paradigm, but present no derivation chain that reduces predictions to fitted inputs or self-citations by construction. Without access to any load-bearing equations or self-referential definitions, the derivation cannot be shown to collapse into its own inputs; this is the expected honest non-finding when concrete technical content is absent.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 4 invented entities

Abstract introduces multiple new concepts without supplying definitions, validation, or external grounding; no free parameters, axioms, or independent evidence for invented entities are detailed.

invented entities (4)

Semantic scattering keypoints (no independent evidence)
purpose: Associate local electromagnetic responses with physically meaningful aircraft components
Defined in abstract as core of new paradigm; no independent evidence supplied

Visibility-aware attributes (no independent evidence)
purpose: Retain weakly observable yet physically existing components
Introduced in abstract to address missing weak responses; no external validation

S3U-SAR framework (no independent evidence)
purpose: Localize keypoints and construct complete representation using physical priors
Proposed method in abstract; no code or derivation details

KP-SAR-Aircraft-1.0 benchmark (no independent evidence)
purpose: First fine-grained benchmark for semantic scattering structure understanding
Constructed dataset claimed in abstract; availability and labeling protocol unspecified

Reference graph

Works this paper leans on

50 extracted references

[1]

Deep learning meets SAR: Concepts, models, pitfalls, and perspectives

Xiao Xiang Zhu, Shiyao Montazeri, Mohsin Ali, Yuansheng Hua, Yuanyuan Wang, Lichao Mou, Yilei Shi, Feng Xu, and Richard Bamler. Deep learning meets SAR: Concepts, models, pitfalls, and perspectives. IEEE Geoscience and Remote Sensing Magazine, 9(4):143–172, 2021

2021
[2]

Self-supervised despeckling based solely on SAR intensity images: A general strategy

Liang Chen, Yifei Yin, Hao Shi, Jingfei He, and Wei Li. Self-supervised despeckling based solely on SAR intensity images: A general strategy. ISPRS Journal of Photogrammetry and Remote Sensing, 231:854–873, 2026

2026
[3]

DOGAN: DINO-based optical-prior-driven GAN for SAR-to- optical image translation.IEEE Transactions on Geoscience and Remote Sensing, 63:1–16, 2025

Jingfei He, Liang Chen, Hao Shi, Yuhang Chen, Jingyi Yang, and Wei Li. DOGAN: DINO-based optical-prior-driven GAN for SAR-to- optical image translation.IEEE Transactions on Geoscience and Remote Sensing, 63:1–16, 2025

2025
[4]

Yifei Yin, Zhu Yang, Hao Shi, Fanyu Meng, and Wei Li. Ship detection transformer in SAR images based on key scattering points feature aggregation and context feature refinement.IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 18:17820– 17836, 2025

2025
[5]

Yongxiang Liu, Weijie Li, Li Liu, Jie Zhou, Bowen Peng, Yafei Song, Xuying Xiong, Wei Yang, Tianpeng Liu, Zhen Liu, and Xiang Li. ATRNet-STAR: A large dataset and benchmark toward remote sensing object recognition in the wild.IEEE Transactions on Pattern Analysis and Machine Intelligence, 48(6):6735–6753, 2026

2026
[6]

Re- inforcement learning for SAR target orientation inference with the differentiable SAR renderer.IEEE Transactions on Geoscience and Remote Sensing, 62:1–13, 2024

Yue Wang, Hongyu Jia, Shanshan Fu, Hong Lin, and Feng Xu. Re- inforcement learning for SAR target orientation inference with the differentiable SAR renderer.IEEE Transactions on Geoscience and Remote Sensing, 62:1–13, 2024

2024
[7]

PGMNet: A prototype-guided multimodal network for ship recognition in SAR images.IEEE Transactions on Geoscience and Remote Sensing, 63:1–17, 2025

Liang Chen, Jianhao Li, Honghu Zhong, Hao Shi, Zhu Yang, and Wei Li. PGMNet: A prototype-guided multimodal network for ship recognition in SAR images.IEEE Transactions on Geoscience and Remote Sensing, 63:1–17, 2025

2025
[8]

CV-SAR-Det: Target detection for SAR images via deep complex-valued network.IEEE Transactions on Aerospace and Electronic Systems, 60(6):8226–8238, 2024

Zhaocheng Wang, Ruonan Wang, Hailong Kang, Feng Luo, and Jun Ai. CV-SAR-Det: Target detection for SAR images via deep complex-valued network.IEEE Transactions on Aerospace and Electronic Systems, 60(6):8226–8238, 2024

2024
[9]

Potter, D.-M

Lee C. Potter, D.-M. Chiang, R. Carriere, and M. J. Gerry. A GTD-based parametric model for radar scattering.IEEE Transactions on Antennas and Propagation, 43(10):1058–1067, 1995. 13

1995
[10]

Potter and Randolph L

Lee C. Potter and Randolph L. Moses. Attributed scattering centers for SAR ATR.IEEE Transactions on Image Processing, 6(1):79–91, 1997

1997
[11]

Recognition of articulated and occluded objects.IEEE Transactions on Pattern Analysis and Machine Intelli- gence, 21(7):603–613, 1999

Graeme Jones and Bir Bhanu. Recognition of articulated and occluded objects.IEEE Transactions on Pattern Analysis and Machine Intelli- gence, 21(7):603–613, 1999

1999
[12]

Jinsong Zhang, Mengdao Xing, and Yiyuan Xie. FEC: A feature fusion framework for SAR target recognition based on electromagnetic scattering features and deep CNN features.IEEE Transactions on Geoscience and Remote Sensing, 59(3):2174–2187, 2021

2021
[13]

PAN: Part attention network integrating electromag- netic characteristics for interpretable SAR vehicle target recognition

Shao Feng, Kefeng Ji, Feng Wang, Lihua Zhang, Xiaoxiang Ma, and Gangyao Kuang. PAN: Part attention network integrating electromag- netic characteristics for interpretable SAR vehicle target recognition. IEEE Transactions on Geoscience and Remote Sensing, 61:1–17, 2023

2023
[14]

Zhang, X

Y . Zhang, X. Gao, J. Xia, W. Li, S. Zhang, L. Liu, and X. Li. ASC-SepNet: Enhancing robust SAR ground target recognition via attribute scattering center and separability dual-driven learning.IEEE Transactions on Aerospace and Electronic Systems, 61(5):11308–11324, 2025

2025
[15]

SARATR-X: Toward building a foundation model for SAR target recognition.IEEE Transactions on Image Processing, 34:869–884, 2025

Weijie Li, Wei Yang, Yuenan Hou, Li Liu, Yongxiang Liu, and Xiang Li. SARATR-X: Toward building a foundation model for SAR target recognition.IEEE Transactions on Image Processing, 34:869–884, 2025

2025
[16]

Scattering prompt tuning: A fine- tuned foundation model for SAR object recognition

Weikang Guo, Shan Li, and Jian Yang. Scattering prompt tuning: A fine- tuned foundation model for SAR object recognition. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 3056–3065, 2024

2024
[17]

Gerry, Lee C

Michael J. Gerry, Lee C. Potter, Inder J. Gupta, and Anton van der Merwe. A parametric model for synthetic aperture radar measurements. IEEE Transactions on Antennas and Propagation, 47(7):1179–1188, 1999

1999
[18]

EMI- Net: An end-to-end mechanism-driven interpretable network for SAR target recognition under EOCs.IEEE Transactions on Geoscience and Remote Sensing, 62:1–18, 2024

Leiyao Liao, Lan Du, Jian Chen, Zhuowei Cao, and Ke’er Zhou. EMI- Net: An end-to-end mechanism-driven interpretable network for SAR target recognition under EOCs.IEEE Transactions on Geoscience and Remote Sensing, 62:1–18, 2024

2024
[19]

A novel SAR target recognition method combining electromagnetic scattering information and GCN

Chen Li, Lan Du, Yi Li, and Jialun Song. A novel SAR target recognition method combining electromagnetic scattering information and GCN. IEEE Geoscience and Remote Sensing Letters, 19:1–5, 2022

2022
[20]

ST-Net: Scattering topology network for aircraft classification in high-resolution SAR images.IEEE Transactions on Geoscience and Remote Sensing, 61:1–17, 2023

Yuzhuo Kang, Zhirui Wang, Haoyu Zuo, Yidan Zhang, Zhujun Yang, Xian Sun, and Kun Fu. ST-Net: Scattering topology network for aircraft classification in high-resolution SAR images.IEEE Transactions on Geoscience and Remote Sensing, 61:1–17, 2023

2023
[21]

Lawrence Zitnick

Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Per- ona, Deva Ramanan, Piotr Doll ´ar, and C. Lawrence Zitnick. Microsoft COCO: Common objects in context. InProceedings of the European Conference on Computer Vision, pages 740–755, 2014

2014
[22]

Openpose: Realtime multi-person 2d pose estimation using part affinity fields.IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(1):172–186, 2021

Zhe Cao, Gines Hidalgo, Tomas Simon, Shih-En Wei, and Yaser Sheikh. Openpose: Realtime multi-person 2d pose estimation using part affinity fields.IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(1):172–186, 2021

2021
[23]

Simple baselines for human pose estimation and tracking

Bin Xiao, Haiping Wu, and Yichen Wei. Simple baselines for human pose estimation and tracking. InProceedings of the European Confer- ence on Computer Vision, pages 466–481, 2018

2018
[24]

2d human pose estimation: New benchmark and state of the art analysis

Mykhaylo Andriluka, Leonid Pishchulin, Peter Gehler, and Bernt Schiele. 2d human pose estimation: New benchmark and state of the art analysis. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3686–3693, 2014

2014
[25]

Deep high-resolution representation learning for human pose estimation

Ke Sun, Bin Xiao, Dong Liu, and Jingdong Wang. Deep high-resolution representation learning for human pose estimation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5693–5703, 2019

2019
[26]

Transpose: Keypoint localization via transformer

Sen Yang, Zhibin Quan, Mu Nie, and Wankou Yang. Transpose: Keypoint localization via transformer. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 11782–11792, 2021

2021
[27]

ViTPose++: Vision transformer for generic body pose estimation.IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(2):1212–1230, 2024

Yufei Xu, Jing Zhang, Qiming Zhang, and Dacheng Tao. ViTPose++: Vision transformer for generic body pose estimation.IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(2):1212–1230, 2024

2024
[28]

Spatial-aware regression for keypoint localization

Dongkai Wang and Shiliang Zhang. Spatial-aware regression for keypoint localization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 624–633, 2024

2024
[29]

DiffusionPose: Markov-optimized diffusion model for human pose estimation.Proceedings of the AAAI Conference on Artificial Intelligence, 40(12):10412–10420, 2026

Zhigang Wang, Zhenguang Liu, Shaojing Fan, Sifan Wu, and Yingying Jiao. DiffusionPose: Markov-optimized diffusion model for human pose estimation.Proceedings of the AAAI Conference on Artificial Intelligence, 40(12):10412–10420, 2026

2026
[30]

NanoHTNet: Nano human topology network for efficient 3d human pose estimation.IEEE Transactions on Image Processing, 2025

Jialun Cai, Mengyuan Liu, Hong Liu, Wenhao Li, and Shuheng Zhou. NanoHTNet: Nano human topology network for efficient 3d human pose estimation.IEEE Transactions on Image Processing, 2025

2025
[31]

SAR-AIRcraft-1.0: High-resolution SAR aircraft detection and recognition dataset.Journal of Radars, 12(4):906–922, 2023

Zhirui Wang, Yuzhuo Kang, Xuan Zeng, Yuelei Wang, Ting Zhang, and Xian Sun. SAR-AIRcraft-1.0: High-resolution SAR aircraft detection and recognition dataset.Journal of Radars, 12(4):906–922, 2023

2023
[32]

Xian Sun, Yixuan Lv, Zhirui Wang, and Kun Fu. SCAN: Scattering characteristics analysis network for few-shot aircraft classification in high-resolution SAR images.IEEE Transactions on Geoscience and Remote Sensing, 60:1–17, 2022

2022
[33]

Aircraft detection in SAR images based on peak feature fusion and adaptive deformable network.Remote Sensing, 14(23):6077, 2022

Xiayang Xiao, Hecheng Jia, Penghao Xiao, and Haipeng Wang. Aircraft detection in SAR images based on peak feature fusion and adaptive deformable network.Remote Sensing, 14(23):6077, 2022

2022
[34]

Research progress on aircraft detection and recognition in SAR imagery.Journal of Radars, 9(3):497– 513, 2020

Qian Guo, Haipeng Wang, and Feng Xu. Research progress on aircraft detection and recognition in SAR imagery.Journal of Radars, 9(3):497– 513, 2020

2020
[35]

Papathanassiou

Alberto Moreira, Pau Prats-Iraola, Marwan Younis, Gerhard Krieger, Irena Hajnsek, and Konstantinos P. Papathanassiou. A tutorial on syn- thetic aperture radar.IEEE Geoscience and Remote Sensing Magazine, 1(1):6–43, 2013

2013
[36]

Artech House, 1998

Chris Oliver and Shaun Quegan.Understanding Synthetic Aperture Radar Images. Artech House, 1998

1998
[37]

A tutorial on speckle reduction in synthetic aperture radar images.IEEE Geoscience and Remote Sensing Magazine, 1(3):6–35, 2013

Fabrizio Argenti, Andrea Lapini, Tiziano Bianchi, and Luciano Al- parone. A tutorial on speckle reduction in synthetic aperture radar images.IEEE Geoscience and Remote Sensing Magazine, 1(3):6–35, 2013

2013
[38]

Integral human pose regression

Xiao Sun, Bin Xiao, Fangyin Wei, Shuang Liang, and Yichen Wei. Integral human pose regression. InProceedings of the European Conference on Computer Vision, pages 529–545, 2018

2018
[39]

Semi-supervised learning by entropy minimization

Yves Grandvalet and Yoshua Bengio. Semi-supervised learning by entropy minimization. InAdvances in Neural Information Processing Systems, 2005

2005
[40]

Multi-task learning using uncertainty to weigh losses for scene geometry and semantics

Alex Kendall, Yarin Gal, and Roberto Cipolla. Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7482–7491, 2018

2018
[41]

Gradient surgery for multi-task learning

Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, and Chelsea Finn. Gradient surgery for multi-task learning. In Advances in Neural Information Processing Systems, volume 33, pages 5824–5836, 2020

2020
[42]

Kingma and Jimmy Ba

Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic op- timization. InProceedings of the International Conference on Learning Representations, 2015

2015
[43]

OpenMMLab pose estimation toolbox and benchmark

MMPose Contributors. OpenMMLab pose estimation toolbox and benchmark. https://github.com/open-mmlab/mmpose, 2020

2020
[44]

ViTPose: Simple vision transformer baselines for human pose estimation

Yufei Xu, Jing Zhang, Qiming Zhang, and Dacheng Tao. ViTPose: Simple vision transformer baselines for human pose estimation. In Advances in Neural Information Processing Systems, volume 35, pages 38571–38584, 2022

2022
[45]

SimCC: A simple coordinate classification perspective for human pose estimation

Yanjie Li, Sen Yang, Peidong Liu, Shoukui Zhang, Yunxiao Wang, Zhicheng Wang, Wankou Yang, and Shu-Tao Xia. SimCC: A simple coordinate classification perspective for human pose estimation. In Proceedings of the European Conference on Computer Vision, pages 89–106, 2022

2022
[46]

ProbPose: A probabilistic approach to 2d human pose estimation

Miroslav Purkrabek and Jiri Matas. ProbPose: A probabilistic approach to 2d human pose estimation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2025
[47]

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770– 778, 2016

2016
[48]

Very deep convolutional networks for large-scale image recognition

Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. InProceedings of the International Conference on Learning Representations, 2015

2015
[49] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In Proceedings of the International Conference on Learning Representati..., 2021
[50] Hanbo Sang, Tianrui Chen, Weiwei Guo, and Zenghui Zhang. Keypoint-based angle estimation for SAR images with Bayesian ambiguity optimization. IEEE Geoscience and Remote Sensing Letters, 22:1–5, 2025
