RPCASSM: Robust PCA State Space Model For Infrared Small Target Detection

Aohua Li; Jin Kuang; Pingping Liu; Qiuzhan Zhou; Tongshun Zhang; Yubing Lu

arxiv: 2606.01689 · v1 · pith:5YFNA43Gnew · submitted 2026-06-01 · 💻 cs.CV · cs.AI

RPCASSM: Robust PCA State Space Model For Infrared Small Target Detection

Pingping Liu , Aohua Li , Yubing Lu , Jin Kuang , Tongshun Zhang , Qiuzhan Zhou This is my paper

Pith reviewed 2026-06-28 15:12 UTC · model grok-4.3

classification 💻 cs.CV cs.AI

keywords infrared small target detectionstate space modelrobust PCAbackground modelingtarget modelingedge structurescanning mechanismcomputer vision

0 comments

The pith

RPCASSM adapts the robust PCA paradigm into state space modules that scan background and target regions separately according to their distinct spatial properties in infrared images.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that standard visual state space models fail to capture the precise edges of infrared small targets because those targets occupy few pixels and exhibit sparsity plus local highlight. To fix this, the authors build RPCASSM around the robust PCA decomposition idea, creating a background state space module that uses a spatial probe scanning mechanism to capture heterogeneous signals and a target state space module that uses a deformable prompt scanning mechanism to focus on sparse, highlighted regions. The two modules together are said to produce accurate edge modeling without departing from the state-space framework. Experiments on existing benchmark datasets are presented as evidence that the design works. A sympathetic reader would care because reliable infrared small-target detection matters for surveillance, security, and rescue tasks where current models lose boundary detail.

Core claim

RPCASSM is a network built on the robust PCA model paradigm that introduces a background state space module (BSSM) with a spatial probe scanning mechanism (SPCM) derived from background saliency of heterogeneous signals and a target state space module (TSSM) with a deformable prompt scanning mechanism (DPCM) derived from target sparsity and local highlight; together these modules solve the edge-modeling shortfall of mainstream vision state space models for infrared small targets.

What carries the argument

Background state space module (BSSM) with spatial probe scanning mechanism (SPCM) and target state space module (TSSM) with deformable prompt scanning mechanism (DPCM), both constructed from the spatial-domain properties of infrared small targets inside an RPCA-style separation.

If this is right

The separation of background and target scanning yields measurable gains in detection and segmentation accuracy on standard infrared small-target benchmarks.
The RPCA-inspired structure keeps the overall model inside the state-space family while adding domain-specific scanning rules.
The design directly targets the low-occupancy and edge-structure problems that current vision state space models leave unaddressed.
Public code release allows direct replication and extension on the reported datasets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the scanning mechanisms prove stable across different sensor resolutions, the same separation principle could be tested on other sparse-object detection tasks such as astronomy or medical imaging.
The approach leaves open whether the same RPCA-style split can be applied to video sequences where temporal consistency of small targets becomes an additional constraint.
A natural next measurement would be to quantify how much of the reported gain comes from the scanning rules versus the overall RPCA framing.

Load-bearing premise

The spatial probe and deformable prompt scanning mechanisms, derived from background and target spatial properties, will produce accurate edge modeling without introducing new artifacts or needing extensive extra tuning.

What would settle it

A controlled comparison on the same benchmark datasets in which edge-precision metrics (such as boundary F-score or pixel-level IoU on target contours) show no statistically significant gain over a standard vision state space model baseline.

Figures

Figures reproduced from arXiv: 2606.01689 by Aohua Li, Jin Kuang, Pingping Liu, Qiuzhan Zhou, Tongshun Zhang, Yubing Lu.

**Figure 2.** Figure 2: The overall structure of the RPCASSM. The network is composed of [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: The structure of the SPCM in the BSSM. Inspired by this, the field of infrared small target detection (ISTD) has begun to actively explore the application potential of SSM, trying to combine its powerful sequence modeling ability with infrared imaging characteristics. MiM-ISTD[15] achieves the collaborative unification of global context awareness and local detail focus by constructing the internal and ex… view at source ↗

**Figure 4.** Figure 4: The structure of the DPCM in the TSSM Background State Space Module (BSSM). The proposed module is based on the heterogeneous characteristics of saliency between infrared small target and background. By effectively segmenting heterogeneous information on the spatial axis, the refined modeling of state space relations and the enhancement of long-distance dependence are realized. As shown in [PITH_FULL_IMA… view at source ↗

**Figure 5.** Figure 5: Visual comparison of detection results on sample images from IRSTD-1K (first three rows) and NUDT-SIRST (last three rows). Yellow and red [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗

**Figure 6.** Figure 6: 3D visualizations of the saliency maps produced by different methods on the six test images are presented as the counterparts to Fig. 5. [PITH_FULL_IMAGE:figures/full_fig_p008_6.png] view at source ↗

**Figure 7.** Figure 7: ROC curves of different methods in NUDT-SIRST and IRSTD-1K. [PITH_FULL_IMAGE:figures/full_fig_p008_7.png] view at source ↗

**Figure 8.** Figure 8: Stage-wise target heatmaps of the RPCASSM model, demonstrating the heterogeneous signal characteristics of the target along the single-axis direction. [PITH_FULL_IMAGE:figures/full_fig_p009_8.png] view at source ↗

read the original abstract

The detection and segmentation of infrared small targets have important application significance in the fields of surveillance and security, maritime rescue and so on. Due to the low occupancy of these targets in long-distance imaging, the mainstream visual state space model is inefficient and difficult to accurately model the target edge. The existing infrared state space models do not deviate from the mainstream visual state space structure framework from the structural properties of infrared small targets. In order to solve this problem, this paper proposes the RPCASSM network based on the model paradigm of robust principal component analysis(RPCA), which aims to design the background state space module(BSSM) and the target state space module(TSSM) by the nature of the infrared small target in the spatial domain. The BSSM aims to use the saliency of spatial heterogeneous signals to design a spatial probe scanning mechanism(SPCM) to model background information. The TSSM designs a deformable prompt scanning mechanism(DPCM) by using the sparsity and local highlight of the target to focus on the deformable space of the target for state space modeling. According to the above design, we effectively solve the problem that the existing mainstream vision state space model is difficult to accurately model the edge structure of infrared small target. Experimental results on the existing benchmark data sets prove the effectiveness of the RPCASSM design. Our code will be made public at \href{https://github.com/PepperCS/RPCASSM}{RPCASSM}.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper splits an SSM into RPCA-style background and target modules with two new scanning mechanisms for IR small targets, but the abstract gives no numbers or ablations to back the edge-modeling claim.

read the letter

The core idea is a state space model that borrows the RPCA separation into background and target parts, then adds a spatial probe scan for the background module and a deformable prompt scan for the target module. Both scans are motivated by the spatial traits of infrared small targets—saliency for background, sparsity and local brightness for targets.

This pairing looks like a genuine extension rather than a straight re-use of existing vision SSMs. The authors correctly flag that standard Mamba-style models struggle with the edge structure of these tiny, low-contrast objects, and they try to fix it by baking domain properties into the scanning operators.

The main weakness is that the abstract states the design solves the edge problem and works on benchmarks, yet supplies no tables, no quantitative scores, no error bars, and no ablation that isolates the two scanning mechanisms. Without those, the claim that the scans produce accurate edges without new artifacts stays an assumption. The stress-test note is right on this point: aggregate detection scores alone do not show a causal link.

If the full paper contains proper comparisons against recent IR detectors and SSM variants, plus direct edge metrics and ablations, the work is worth a look for the small-target detection crowd. Right now the evidence is too thin to judge.

I would bring it to a reading group only if someone has already read the experiments section. I would not cite it yet. A serious editor should send it to review rather than desk-reject, because the architectural choice is specific enough that referees can check whether the claimed gains actually appear in the data.

Referee Report

2 major / 1 minor

Summary. The paper proposes RPCASSM, a network based on the robust principal component analysis (RPCA) paradigm for infrared small target detection and segmentation. It introduces a Background State Space Module (BSSM) employing a Spatial Probe Scanning Mechanism (SPCM) to model background via spatial saliency, and a Target State Space Module (TSSM) using a Deformable Prompt Scanning Mechanism (DPCM) to focus on sparse, locally highlighted targets. The central claim is that these modules, derived from infrared small target spatial properties, solve the edge-modeling deficiencies of standard vision state space models. Effectiveness is asserted through experiments on existing benchmark datasets, with code to be released publicly.

Significance. If the experimental claims hold, the work offers a targeted adaptation of state space models to infrared small target characteristics via RPCA-inspired decomposition, which could improve edge fidelity in low-occupancy detection tasks. The explicit public code release is a positive contribution for reproducibility in the computer vision community.

major comments (2)

[Experimental Results] Experimental section: only aggregate detection/segmentation scores on benchmarks are reported; no ablation studies isolate the SPCM or DPCM contributions, and no direct edge-specific metrics (e.g., boundary precision, Hausdorff distance, or edge IoU) are provided to substantiate the claim of improved edge modeling.
[Method] Method section (BSSM/TSSM descriptions): the design of SPCM and DPCM is motivated by spatial-domain heuristics (saliency, sparsity, local highlight), but no analysis or visualization demonstrates that these mechanisms avoid introducing new artifacts or require post-hoc tuning, leaving the causal link to accurate edge modeling unverified.

minor comments (1)

[Abstract] The abstract states that experiments 'prove the effectiveness' without referencing specific tables, figures, or quantitative improvements; this phrasing should be softened to 'demonstrate' pending detailed results.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major comment below and note the planned revisions.

read point-by-point responses

Referee: [Experimental Results] Experimental section: only aggregate detection/segmentation scores on benchmarks are reported; no ablation studies isolate the SPCM or DPCM contributions, and no direct edge-specific metrics (e.g., boundary precision, Hausdorff distance, or edge IoU) are provided to substantiate the claim of improved edge modeling.

Authors: We agree that the current experimental section reports only aggregate metrics. To substantiate the edge-modeling claim, the revised manuscript will add ablation studies isolating SPCM and DPCM contributions together with edge-specific metrics such as boundary precision and Hausdorff distance. revision: yes
Referee: [Method] Method section (BSSM/TSSM descriptions): the design of SPCM and DPCM is motivated by spatial-domain heuristics (saliency, sparsity, local highlight), but no analysis or visualization demonstrates that these mechanisms avoid introducing new artifacts or require post-hoc tuning, leaving the causal link to accurate edge modeling unverified.

Authors: The SPCM and DPCM designs are derived directly from the spatial properties stated in the method section. To strengthen verification of the causal link, the revised manuscript will incorporate visualizations and analysis showing the mechanisms' effects on edge modeling and confirming absence of new artifacts. revision: yes

Circularity Check

0 steps flagged

No significant circularity; design is heuristic and externally validated

full rationale

The paper motivates BSSM (via SPCM) and TSSM (via DPCM) from explicit spatial-domain properties of IR small targets (saliency, sparsity, local highlight) inside an RPCA-inspired framework, then reports aggregate detection/segmentation results on standard benchmarks as evidence of effectiveness. No equations, fitted parameters, or self-citations are exhibited that would make any performance claim or edge-modeling assertion reduce to the inputs by construction. The derivation chain therefore remains self-contained and non-circular.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 2 invented entities

Review performed on abstract only; full model equations, training details, and any fitted hyperparameters are unavailable, so ledger entries are limited to those inferable from the high-level description.

axioms (1)

domain assumption Infrared small targets exhibit sparsity and local highlight in the spatial domain while background signals are heterogeneous.
Invoked to justify the design of TSSM and BSSM.

invented entities (2)

Background State Space Module (BSSM) with Spatial Probe Scanning Mechanism (SPCM) no independent evidence
purpose: Model background information using saliency of spatial heterogeneous signals
New module introduced in the paper; no independent evidence provided in abstract.
Target State Space Module (TSSM) with Deformable Prompt Scanning Mechanism (DPCM) no independent evidence
purpose: Focus on deformable space of the target for state space modeling using sparsity and local highlight
New module introduced in the paper; no independent evidence provided in abstract.

pith-pipeline@v0.9.1-grok · 5805 in / 1280 out tokens · 23276 ms · 2026-06-28T15:12:34.883867+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

43 extracted references · 7 canonical work pages · 2 internal anchors

[1]

Single- frame infrared small-target detection: A survey,

M. Zhao, W. Li, L. Li, J. Hu, P. Ma, and R. Tao, “Single- frame infrared small-target detection: A survey,”IEEE Geoscience and Remote Sensing Magazine, vol. 10, no. 2, pp. 87–119, 2022

2022
[2]

Classification of small boats in infrared images for maritime surveillance,

M. Teutsch and W. Kr ¨uger, “Classification of small boats in infrared images for maritime surveillance,” in2010 International WaterSide Security Conference, 2010, pp. 1–7

2010
[3]

Ascnet: Asymmetric sampling correction network for infrared image destriping,

S. Yuan, H. Qin, X. Yan, S. Yang, S. Yang, N. Akhtar, and H. Zhou, “Ascnet: Asymmetric sampling correction network for infrared image destriping,”IEEE Transac- tions on Geoscience and Remote Sensing, 2025

2025
[4]

Max-mean and max-median filters for detection of small targets,

S. D. Deshpande, M. H. Er, R. Venkateswarlu, and P. Chan, “Max-mean and max-median filters for detection of small targets,” inSignal and Data Processing of Small Targets 1999, vol. 3809. SPIE, 1999, pp. 74–83

1999
[5]

Detection of dim targets in digital infrared imagery by morphological image process- ing,

J.-F. Rivest and R. Fortin, “Detection of dim targets in digital infrared imagery by morphological image process- ing,”Optical Engineering, vol. 35, no. 7, pp. 1886–1893, 1996

1996
[6]

A local contrast method for small infrared target detection,

C. P. Chen, H. Li, Y . Wei, T. Xia, and Y . Y . Tang, “A local contrast method for small infrared target detection,”IEEE transactions on geoscience and remote sensing, vol. 52, no. 1, pp. 574–581, 2013

2013
[7]

A local contrast method for infrared small-target detection utilizing a tri-layer window,

J. Han, S. Moradi, I. Faramarzi, C. Liu, H. Zhang, and Q. Zhao, “A local contrast method for infrared small-target detection utilizing a tri-layer window,”IEEE Geoscience and Remote Sensing Letters, vol. 17, no. 10, pp. 1822–1826, 2019

2019
[8]

Infrared patch-image model for small target detection in a single image,

C. Gao, D. Meng, Y . Yang, Y . Wang, X. Zhou, and A. G. Hauptmann, “Infrared patch-image model for small target detection in a single image,”IEEE transactions on image processing, vol. 22, no. 12, pp. 4996–5009, 2013

2013
[9]

Infrared small target detection via low-rank tensor completion with top-hat regularization,

H. Zhu, S. Liu, L. Deng, Y . Li, and F. Xiao, “Infrared small target detection via low-rank tensor completion with top-hat regularization,”IEEE Transactions on Geo- science and Remote Sensing, vol. 58, no. 2, pp. 1004– 1016, 2019

2019
[10]

Dense nested attention network for infrared small target detection,

B. Li, C. Xiao, L. Wang, Y . Wang, Z. Lin, M. Li, W. An, and Y . Guo, “Dense nested attention network for infrared small target detection,”IEEE Transactions on Image Processing, vol. 32, pp. 1745–1758, 2022

2022
[11]

Asymmetric contextual modulation for infrared small target detec- tion,

Y . Dai, Y . Wu, F. Zhou, and K. Barnard, “Asymmetric contextual modulation for infrared small target detec- tion,” inProceedings of the IEEE/CVF winter conference on applications of computer vision, 2021, pp. 950–959

2021
[12]

Attentional local contrast networks for infrared small target detection,

——, “Attentional local contrast networks for infrared small target detection,”IEEE transactions on geoscience and remote sensing, vol. 59, no. 11, pp. 9813–9824, 2021

2021
[13]

Sctransnet: Spatial-channel cross transformer network for infrared small target detection,

S. Yuan, H. Qin, X. Yan, N. Akhtar, and A. Mian, “Sctransnet: Spatial-channel cross transformer network for infrared small target detection,”IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–15, 2024

2024
[14]

Lsdssms: Infrared small target detection network based on low- rank sparse decomposition state space models,

Y . Lu, P. Liu, A. Li, Q. Zhou, and K. Zhang, “Lsdssms: Infrared small target detection network based on low- rank sparse decomposition state space models,”IEEE Transactions on Geoscience and Remote Sensing, 2025

2025
[15]

Mim-istd: Mamba-in-mamba for effi- cient infrared small target detection,

T. Chen, Z. Ye, Z. Tan, T. Gong, Y . Wu, Q. Chu, B. Liu, N. Yu, and J. Ye, “Mim-istd: Mamba-in-mamba for effi- cient infrared small target detection,”IEEE Transactions on Geoscience and Remote Sensing, 2024

2024
[16]

Mamba: Linear-time sequence mod- eling with selective state spaces,

A. Gu and T. Dao, “Mamba: Linear-time sequence mod- eling with selective state spaces,” inFirst Conference on Language Modeling, 2024

2024
[17]

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

T. Dao and A. Gu, “Transformers are ssms: Generalized models and efficient algorithms through structured state space duality,”arXiv preprint arXiv:2405.21060, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[18]

Vmamba: Visual state space model,

Y . Liu, Y . Tian, Y . Zhao, H. Yu, L. Xie, Y . Wang, Q. Ye, J. Jiao, and Y . Liu, “Vmamba: Visual state space model,”Advances in neural information processing systems, vol. 37, pp. 103 031–103 063, 2024

2024
[19]

Cwnet: Causal wavelet network for low-light image enhancement,

T. Zhang, P. Liu, Y . Lu, M. Cai, Z. Zhang, Z. Zhang, and Q. Zhou, “Cwnet: Causal wavelet network for low-light image enhancement,”arXiv preprint arXiv:2507.10689, 2025

work page arXiv 2025
[20]

Bsmamba: Brightness and semantic modeling for long- range interaction in low-light image enhancement,

T. Zhang, P. Liu, M. Cai, Z. Zhang, Y . Lu, and Q. Zhou, “Bsmamba: Brightness and semantic modeling for long- range interaction in low-light image enhancement,”arXiv preprint arXiv:2506.18346, 2025

work page arXiv 2025
[21]

Irmamba: Pixel difference mamba with layer restoration for infrared small target detection,

M. Zhang, X. Li, F. Gao, and J. Guo, “Irmamba: Pixel difference mamba with layer restoration for infrared small target detection,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 9, 2025, pp. 10 003–10 011

2025
[22]

Smile: Spatial-spectral mamba interactive learning for infrared small target de- tection,

Y . Li, L. Wang, and S. Chen, “Smile: Spatial-spectral mamba interactive learning for infrared small target de- tection,”IEEE Transactions on Geoscience and Remote Sensing, 2025

2025
[23]

Rp- canet: Deep unfolding rpca based infrared small target detection,

F. Wu, T. Zhang, L. Li, Y . Huang, and Z. Peng, “Rp- canet: Deep unfolding rpca based infrared small target detection,” inProceedings of the IEEE/CVF Winter Con- ference on Applications of Computer Vision, 2024, pp. 4809–4818

2024
[24]

Point-to-point regression: Accurate infrared small target detection with single- point annotation,

R. Ni, J. Wu, Z. Qiu, L. Chen, C. Luo, F. Huang, Q. Liu, B. Wang, Y . Li, and Y . Li, “Point-to-point regression: Accurate infrared small target detection with single- point annotation,”IEEE Transactions on Geoscience and Remote Sensing, 2025

2025
[25]

Mapping degeneration meets label evolution: Learning infrared small target detection with single point supervision,

X. Ying, L. Liu, Y . Wang, R. Li, N. Chen, Z. Lin, W. Sheng, and S. Zhou, “Mapping degeneration meets label evolution: Learning infrared small target detection with single point supervision,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 15 528–15 538

2023
[26]

Background modeling in the fourier domain for maritime infrared target detec- JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 11 tion,

A. Zhou, W. Xie, and J. Pei, “Background modeling in the fourier domain for maritime infrared target detec- JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 11 tion,”IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 8, pp. 2634–2649, 2020

2021
[27]

Dynamic high-frequency convolution for infrared small target detection,

R. Li, C. Xiao, Q. Yin, W. An, N. Chen, X. Ying, M. Li, and Y . Wang, “Dynamic high-frequency convolution for infrared small target detection,”IEEE Transactions on Circuits and Systems for Video Technology, pp. 1–1, 2026

2026
[28]

Irsam: Advancing segment anything model for infrared small target detection,

M. Zhang, Y . Wang, J. Guo, Y . Li, X. Gao, and J. Zhang, “Irsam: Advancing segment anything model for infrared small target detection,” inEuropean Conference on Com- puter Vision. Springer, 2024, pp. 233–249

2024
[29]

Text-irstd: Leveraging semantic text to promote infrared small target detection in complex scenes,

F. Huang, S. Zheng, Z. Qiu, H. Liu, H. Bai, and L. Chen, “Text-irstd: Leveraging semantic text to promote infrared small target detection in complex scenes,”arXiv preprint arXiv:2503.07249, 2025

work page arXiv 2025
[30]

Rethinking evaluation of infrared small target detection.CoRR, abs/2509.16888, 2025

Y . Pang, X. Zhao, L. Zhang, H. Lu, G. E. Fakhri, X. Liu, and S. Lu, “Rethinking evaluation of infrared small target detection,”arXiv preprint arXiv:2509.16888, 2025

work page arXiv 2025
[31]

Miss detection vs. false alarm: Adversarial learning for small object segmentation in infrared images,

H. Wang, L. Zhou, and L. Wang, “Miss detection vs. false alarm: Adversarial learning for small object segmentation in infrared images,” inProceedings of the IEEE/CVF international conference on computer vision, 2019, pp. 8509–8518

2019
[32]

Isnet: Shape matters for infrared small target detection,

M. Zhang, R. Zhang, Y . Yang, H. Bai, J. Zhang, and J. Guo, “Isnet: Shape matters for infrared small target detection,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 877–886

2022
[33]

Infrared small target detection with scale and location sensitivity,

Q. Liu, R. Liu, B. Zheng, H. Wang, and Y . Fu, “Infrared small target detection with scale and location sensitivity,” inProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition, 2024, pp. 17 490– 17 499

2024
[34]

Pinwheel-shaped convolution and scale-based dynamic loss for infrared small target detection,

J. Yang, S. Liu, J. Wu, X. Su, N. Hai, and X. Huang, “Pinwheel-shaped convolution and scale-based dynamic loss for infrared small target detection,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 9, 2025, pp. 9202–9210

2025
[35]

Mtu-net: Multilevel transunet for space-based infrared tiny ship detection,

T. Wu, B. Li, Y . Luo, Y . Wang, C. Xiao, T. Liu, J. Yang, W. An, and Y . Guo, “Mtu-net: Multilevel transunet for space-based infrared tiny ship detection,”IEEE Transac- tions on Geoscience and Remote Sensing, vol. 61, pp. 1–15, 2023

2023
[36]

Drpca-net: Make robust pca great again for infrared small target detection,

Z. Xiong, F. Zhou, F. Wu, S. Yuan, M. Fu, Z. Peng, J. Yang, and Y . Dai, “Drpca-net: Make robust pca great again for infrared small target detection,”IEEE Transac- tions on Geoscience and Remote Sensing, 2025

2025
[37]

Attention is all you need,

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,”Advances in neural informa- tion processing systems, vol. 30, 2017

2017
[38]

Deep residual learning for image recognition,

K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” inProceedings of the IEEE conference on computer vision and pattern recog- nition, 2016, pp. 770–778

2016
[39]

KAN: Kolmogorov-Arnold Networks

Z. Liu, Y . Wang, S. Vaidya, F. Ruehle, J. Halver- son, M. Solja ˇci´c, T. Y . Hou, and M. Tegmark, “Kan: Kolmogorov-arnold networks,”arXiv preprint arXiv:2404.19756, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[40]

Deformable convolutional networks,

J. Dai, H. Qi, Y . Xiong, Y . Li, G. Zhang, H. Hu, and Y . Wei, “Deformable convolutional networks,” in Proceedings of the IEEE international conference on computer vision, 2017, pp. 764–773

2017
[41]

Agpcnet: Attention-guided pyramid context networks for infrared small target detection,

T. Zhang, S. Cao, T. Pu, and Z. Peng, “Agpcnet: Attention-guided pyramid context networks for infrared small target detection,”arXiv preprint arXiv:2111.03580, 2021

work page arXiv 2021
[42]

Uiu-net: U-net in u-net for infrared small object detection,

X. Wu, D. Hong, and J. Chanussot, “Uiu-net: U-net in u-net for infrared small object detection,”IEEE Transac- tions on Image Processing, vol. 32, pp. 364–376, 2022. VI. BIOGRAPHYSECTION Pingping Liureceived M.S. and Ph.D. degrees from College of Computer Science and Technology, Jilin University, China, in 2004 and 2009, respec- tively. She is currently a ...

2022
[43]

degree in College of Computer Science and Technology, Jilin University, China

He is currently pursuing his Ph.D. degree in College of Computer Science and Technology, Jilin University, China. His research interests include infrared small target detection, tracking, and image segmentation. JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 12 Jin Kuangwas born in 2001. He received the B.S. degree from Xiangnan University in 2...

2021

[1] [1]

Single- frame infrared small-target detection: A survey,

M. Zhao, W. Li, L. Li, J. Hu, P. Ma, and R. Tao, “Single- frame infrared small-target detection: A survey,”IEEE Geoscience and Remote Sensing Magazine, vol. 10, no. 2, pp. 87–119, 2022

2022

[2] [2]

Classification of small boats in infrared images for maritime surveillance,

M. Teutsch and W. Kr ¨uger, “Classification of small boats in infrared images for maritime surveillance,” in2010 International WaterSide Security Conference, 2010, pp. 1–7

2010

[3] [3]

Ascnet: Asymmetric sampling correction network for infrared image destriping,

S. Yuan, H. Qin, X. Yan, S. Yang, S. Yang, N. Akhtar, and H. Zhou, “Ascnet: Asymmetric sampling correction network for infrared image destriping,”IEEE Transac- tions on Geoscience and Remote Sensing, 2025

2025

[4] [4]

Max-mean and max-median filters for detection of small targets,

S. D. Deshpande, M. H. Er, R. Venkateswarlu, and P. Chan, “Max-mean and max-median filters for detection of small targets,” inSignal and Data Processing of Small Targets 1999, vol. 3809. SPIE, 1999, pp. 74–83

1999

[5] [5]

Detection of dim targets in digital infrared imagery by morphological image process- ing,

J.-F. Rivest and R. Fortin, “Detection of dim targets in digital infrared imagery by morphological image process- ing,”Optical Engineering, vol. 35, no. 7, pp. 1886–1893, 1996

1996

[6] [6]

A local contrast method for small infrared target detection,

C. P. Chen, H. Li, Y . Wei, T. Xia, and Y . Y . Tang, “A local contrast method for small infrared target detection,”IEEE transactions on geoscience and remote sensing, vol. 52, no. 1, pp. 574–581, 2013

2013

[7] [7]

A local contrast method for infrared small-target detection utilizing a tri-layer window,

J. Han, S. Moradi, I. Faramarzi, C. Liu, H. Zhang, and Q. Zhao, “A local contrast method for infrared small-target detection utilizing a tri-layer window,”IEEE Geoscience and Remote Sensing Letters, vol. 17, no. 10, pp. 1822–1826, 2019

2019

[8] [8]

Infrared patch-image model for small target detection in a single image,

C. Gao, D. Meng, Y . Yang, Y . Wang, X. Zhou, and A. G. Hauptmann, “Infrared patch-image model for small target detection in a single image,”IEEE transactions on image processing, vol. 22, no. 12, pp. 4996–5009, 2013

2013

[9] [9]

Infrared small target detection via low-rank tensor completion with top-hat regularization,

H. Zhu, S. Liu, L. Deng, Y . Li, and F. Xiao, “Infrared small target detection via low-rank tensor completion with top-hat regularization,”IEEE Transactions on Geo- science and Remote Sensing, vol. 58, no. 2, pp. 1004– 1016, 2019

2019

[10] [10]

Dense nested attention network for infrared small target detection,

B. Li, C. Xiao, L. Wang, Y . Wang, Z. Lin, M. Li, W. An, and Y . Guo, “Dense nested attention network for infrared small target detection,”IEEE Transactions on Image Processing, vol. 32, pp. 1745–1758, 2022

2022

[11] [11]

Asymmetric contextual modulation for infrared small target detec- tion,

Y . Dai, Y . Wu, F. Zhou, and K. Barnard, “Asymmetric contextual modulation for infrared small target detec- tion,” inProceedings of the IEEE/CVF winter conference on applications of computer vision, 2021, pp. 950–959

2021

[12] [12]

Attentional local contrast networks for infrared small target detection,

——, “Attentional local contrast networks for infrared small target detection,”IEEE transactions on geoscience and remote sensing, vol. 59, no. 11, pp. 9813–9824, 2021

2021

[13] [13]

Sctransnet: Spatial-channel cross transformer network for infrared small target detection,

S. Yuan, H. Qin, X. Yan, N. Akhtar, and A. Mian, “Sctransnet: Spatial-channel cross transformer network for infrared small target detection,”IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–15, 2024

2024

[14] [14]

Lsdssms: Infrared small target detection network based on low- rank sparse decomposition state space models,

Y . Lu, P. Liu, A. Li, Q. Zhou, and K. Zhang, “Lsdssms: Infrared small target detection network based on low- rank sparse decomposition state space models,”IEEE Transactions on Geoscience and Remote Sensing, 2025

2025

[15] [15]

Mim-istd: Mamba-in-mamba for effi- cient infrared small target detection,

T. Chen, Z. Ye, Z. Tan, T. Gong, Y . Wu, Q. Chu, B. Liu, N. Yu, and J. Ye, “Mim-istd: Mamba-in-mamba for effi- cient infrared small target detection,”IEEE Transactions on Geoscience and Remote Sensing, 2024

2024

[16] [16]

Mamba: Linear-time sequence mod- eling with selective state spaces,

A. Gu and T. Dao, “Mamba: Linear-time sequence mod- eling with selective state spaces,” inFirst Conference on Language Modeling, 2024

2024

[17] [17]

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

T. Dao and A. Gu, “Transformers are ssms: Generalized models and efficient algorithms through structured state space duality,”arXiv preprint arXiv:2405.21060, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024

[18] [18]

Vmamba: Visual state space model,

Y . Liu, Y . Tian, Y . Zhao, H. Yu, L. Xie, Y . Wang, Q. Ye, J. Jiao, and Y . Liu, “Vmamba: Visual state space model,”Advances in neural information processing systems, vol. 37, pp. 103 031–103 063, 2024

2024

[19] [19]

Cwnet: Causal wavelet network for low-light image enhancement,

T. Zhang, P. Liu, Y . Lu, M. Cai, Z. Zhang, Z. Zhang, and Q. Zhou, “Cwnet: Causal wavelet network for low-light image enhancement,”arXiv preprint arXiv:2507.10689, 2025

work page arXiv 2025

[20] [20]

Bsmamba: Brightness and semantic modeling for long- range interaction in low-light image enhancement,

T. Zhang, P. Liu, M. Cai, Z. Zhang, Y . Lu, and Q. Zhou, “Bsmamba: Brightness and semantic modeling for long- range interaction in low-light image enhancement,”arXiv preprint arXiv:2506.18346, 2025

work page arXiv 2025

[21] [21]

Irmamba: Pixel difference mamba with layer restoration for infrared small target detection,

M. Zhang, X. Li, F. Gao, and J. Guo, “Irmamba: Pixel difference mamba with layer restoration for infrared small target detection,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 9, 2025, pp. 10 003–10 011

2025

[22] [22]

Smile: Spatial-spectral mamba interactive learning for infrared small target de- tection,

Y . Li, L. Wang, and S. Chen, “Smile: Spatial-spectral mamba interactive learning for infrared small target de- tection,”IEEE Transactions on Geoscience and Remote Sensing, 2025

2025

[23] [23]

Rp- canet: Deep unfolding rpca based infrared small target detection,

F. Wu, T. Zhang, L. Li, Y . Huang, and Z. Peng, “Rp- canet: Deep unfolding rpca based infrared small target detection,” inProceedings of the IEEE/CVF Winter Con- ference on Applications of Computer Vision, 2024, pp. 4809–4818

2024

[24] [24]

Point-to-point regression: Accurate infrared small target detection with single- point annotation,

R. Ni, J. Wu, Z. Qiu, L. Chen, C. Luo, F. Huang, Q. Liu, B. Wang, Y . Li, and Y . Li, “Point-to-point regression: Accurate infrared small target detection with single- point annotation,”IEEE Transactions on Geoscience and Remote Sensing, 2025

2025

[25] [25]

Mapping degeneration meets label evolution: Learning infrared small target detection with single point supervision,

X. Ying, L. Liu, Y . Wang, R. Li, N. Chen, Z. Lin, W. Sheng, and S. Zhou, “Mapping degeneration meets label evolution: Learning infrared small target detection with single point supervision,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 15 528–15 538

2023

[26] [26]

Background modeling in the fourier domain for maritime infrared target detec- JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 11 tion,

A. Zhou, W. Xie, and J. Pei, “Background modeling in the fourier domain for maritime infrared target detec- JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 11 tion,”IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 8, pp. 2634–2649, 2020

2021

[27] [27]

Dynamic high-frequency convolution for infrared small target detection,

R. Li, C. Xiao, Q. Yin, W. An, N. Chen, X. Ying, M. Li, and Y . Wang, “Dynamic high-frequency convolution for infrared small target detection,”IEEE Transactions on Circuits and Systems for Video Technology, pp. 1–1, 2026

2026

[28] [28]

Irsam: Advancing segment anything model for infrared small target detection,

M. Zhang, Y . Wang, J. Guo, Y . Li, X. Gao, and J. Zhang, “Irsam: Advancing segment anything model for infrared small target detection,” inEuropean Conference on Com- puter Vision. Springer, 2024, pp. 233–249

2024

[29] [29]

Text-irstd: Leveraging semantic text to promote infrared small target detection in complex scenes,

F. Huang, S. Zheng, Z. Qiu, H. Liu, H. Bai, and L. Chen, “Text-irstd: Leveraging semantic text to promote infrared small target detection in complex scenes,”arXiv preprint arXiv:2503.07249, 2025

work page arXiv 2025

[30] [30]

Rethinking evaluation of infrared small target detection.CoRR, abs/2509.16888, 2025

Y . Pang, X. Zhao, L. Zhang, H. Lu, G. E. Fakhri, X. Liu, and S. Lu, “Rethinking evaluation of infrared small target detection,”arXiv preprint arXiv:2509.16888, 2025

work page arXiv 2025

[31] [31]

Miss detection vs. false alarm: Adversarial learning for small object segmentation in infrared images,

H. Wang, L. Zhou, and L. Wang, “Miss detection vs. false alarm: Adversarial learning for small object segmentation in infrared images,” inProceedings of the IEEE/CVF international conference on computer vision, 2019, pp. 8509–8518

2019

[32] [32]

Isnet: Shape matters for infrared small target detection,

M. Zhang, R. Zhang, Y . Yang, H. Bai, J. Zhang, and J. Guo, “Isnet: Shape matters for infrared small target detection,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 877–886

2022

[33] [33]

Infrared small target detection with scale and location sensitivity,

Q. Liu, R. Liu, B. Zheng, H. Wang, and Y . Fu, “Infrared small target detection with scale and location sensitivity,” inProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition, 2024, pp. 17 490– 17 499

2024

[34] [34]

Pinwheel-shaped convolution and scale-based dynamic loss for infrared small target detection,

J. Yang, S. Liu, J. Wu, X. Su, N. Hai, and X. Huang, “Pinwheel-shaped convolution and scale-based dynamic loss for infrared small target detection,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 9, 2025, pp. 9202–9210

2025

[35] [35]

Mtu-net: Multilevel transunet for space-based infrared tiny ship detection,

T. Wu, B. Li, Y . Luo, Y . Wang, C. Xiao, T. Liu, J. Yang, W. An, and Y . Guo, “Mtu-net: Multilevel transunet for space-based infrared tiny ship detection,”IEEE Transac- tions on Geoscience and Remote Sensing, vol. 61, pp. 1–15, 2023

2023

[36] [36]

Drpca-net: Make robust pca great again for infrared small target detection,

Z. Xiong, F. Zhou, F. Wu, S. Yuan, M. Fu, Z. Peng, J. Yang, and Y . Dai, “Drpca-net: Make robust pca great again for infrared small target detection,”IEEE Transac- tions on Geoscience and Remote Sensing, 2025

2025

[37] [37]

Attention is all you need,

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,”Advances in neural informa- tion processing systems, vol. 30, 2017

2017

[38] [38]

Deep residual learning for image recognition,

K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” inProceedings of the IEEE conference on computer vision and pattern recog- nition, 2016, pp. 770–778

2016

[39] [39]

KAN: Kolmogorov-Arnold Networks

Z. Liu, Y . Wang, S. Vaidya, F. Ruehle, J. Halver- son, M. Solja ˇci´c, T. Y . Hou, and M. Tegmark, “Kan: Kolmogorov-arnold networks,”arXiv preprint arXiv:2404.19756, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024

[40] [40]

Deformable convolutional networks,

J. Dai, H. Qi, Y . Xiong, Y . Li, G. Zhang, H. Hu, and Y . Wei, “Deformable convolutional networks,” in Proceedings of the IEEE international conference on computer vision, 2017, pp. 764–773

2017

[41] [41]

Agpcnet: Attention-guided pyramid context networks for infrared small target detection,

T. Zhang, S. Cao, T. Pu, and Z. Peng, “Agpcnet: Attention-guided pyramid context networks for infrared small target detection,”arXiv preprint arXiv:2111.03580, 2021

work page arXiv 2021

[42] [42]

Uiu-net: U-net in u-net for infrared small object detection,

X. Wu, D. Hong, and J. Chanussot, “Uiu-net: U-net in u-net for infrared small object detection,”IEEE Transac- tions on Image Processing, vol. 32, pp. 364–376, 2022. VI. BIOGRAPHYSECTION Pingping Liureceived M.S. and Ph.D. degrees from College of Computer Science and Technology, Jilin University, China, in 2004 and 2009, respec- tively. She is currently a ...

2022

[43] [43]

degree in College of Computer Science and Technology, Jilin University, China

He is currently pursuing his Ph.D. degree in College of Computer Science and Technology, Jilin University, China. His research interests include infrared small target detection, tracking, and image segmentation. JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 12 Jin Kuangwas born in 2001. He received the B.S. degree from Xiangnan University in 2...

2021