HyperTea: A Hypergraph-based Temporal Enhancement and Alignment Network for Moving Infrared Small Target Detection

Jie Tang; Weihua Gao; Wenlong Niu; Xiaodong Peng; Yun Li; Zhaoyuan Qi

arxiv: 2508.10678 · v2 · pith:JZ3PHMO4new · submitted 2025-08-14 · 💻 cs.CV

Zhaoyuan Qi, Weihua Gao, Wenlong Niu, Jie Tang, Yun Li, Xiaodong Peng

Pith reviewed 2026-05-21 22:47 UTC · model grok-4.3

classification 💻 cs.CV

keywords moving infrared small target detection · hypergraph neural networks · temporal enhancement · feature alignment · CNN-RNN integration · MIRSTD · spatiotemporal correlations

The pith

HyperTea integrates hypergraphs with CNNs and RNNs to model high-order spatiotemporal correlations for moving infrared small target detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces HyperTea to tackle the challenges of small size, weak intensity, and complex motions in moving infrared small target detection. It combines global temporal enhancement through semantic aggregation, local motion pattern capture between frames, and cross-scale alignment to improve feature representation at multiple timescales. A sympathetic reader would care because existing methods rely on low-order correlations that often fail in practical surveillance or defense scenarios, and this approach claims to deliver state-of-the-art results on standard benchmarks by capturing richer relations via hypergraphs.

Core claim

HyperTea is, to the authors' knowledge, the first network to fuse CNNs for spatial features, RNNs for sequential context, and hypergraph neural networks for high-order spatiotemporal correlations in MIRSTD. The architecture uses a global temporal enhancement module to aggregate and propagate semantic context across the sequence, a local temporal enhancement module to model motion between adjacent frames, and a temporal alignment module to correct cross-scale feature misalignment. The authors report that this combination achieves state-of-the-art detection performance on the DAUB and IRDST datasets.

What carries the argument

HyperTea architecture with global temporal enhancement module (GTEM) for semantic aggregation and propagation, local temporal enhancement module (LTEM) for adjacent-frame motion patterns, and temporal alignment module (TAM) for cross-scale correction, all built on hypergraph neural networks to model high-order feature correlations.
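To make "high-order correlation" concrete, here is a minimal sketch of one round of hypergraph message passing in plain Python. It is not the authors' implementation: the function name `hypergraph_message_pass`, the unweighted mean aggregation, and the toy data are all assumptions chosen for illustration, and a real HGNN layer would add learnable weights and a nonlinearity. The point it shows is that a single hyperedge can tie three or more feature nodes together at once, which a pairwise graph edge cannot.

```python
from typing import List


def hypergraph_message_pass(features: List[List[float]],
                            hyperedges: List[List[int]]) -> List[List[float]]:
    """One round of vertex -> hyperedge -> vertex mean aggregation (illustrative only)."""
    dim = len(features[0])
    # Stage 1: each hyperedge averages the features of all its member vertices.
    edge_feats = [
        [sum(features[v][d] for v in members) / len(members) for d in range(dim)]
        for members in hyperedges
    ]
    # Stage 2: each vertex averages the features of the hyperedges it belongs to.
    out = []
    for v in range(len(features)):
        incident = [e for e, members in enumerate(hyperedges) if v in members]
        if not incident:
            out.append(list(features[v]))  # isolated vertex keeps its own feature
            continue
        out.append([sum(edge_feats[e][d] for e in incident) / len(incident)
                    for d in range(dim)])
    return out


# Toy example (hypothetical data): four feature nodes, the first hyperedge joins three of them.
feats = [[1.0], [2.0], [3.0], [10.0]]
edges = [[0, 1, 2], [2, 3]]
updated = hypergraph_message_pass(feats, edges)
print(updated)
```

Node 2 sits in both hyperedges, so its updated feature mixes information from all four nodes after a single round. With pairwise edges, the same mixing would need either more edges or more propagation steps.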

If this is right

Detection accuracy improves for targets with irregular trajectories by explicitly modeling relations beyond pairwise frame connections.
Multi-timescale feature enhancement reduces missed detections in low-signal infrared video.
Cross-scale alignment mitigates errors when combining global and local temporal information.
The combined CNN-RNN-HGNN pipeline sets a new performance baseline on existing MIRSTD benchmarks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar hypergraph temporal modules could be tested on visible-light small-object tracking to check if the high-order benefit transfers across modalities.
If the alignment module proves critical, it might be adapted as a lightweight plug-in for other multi-scale video networks.
Real-world deployment would require checking whether the added hypergraph computation remains feasible under strict latency constraints typical of infrared sensors.

Load-bearing premise

High-order spatiotemporal correlations from hypergraphs applied to CNN-extracted features will consistently outperform lower-order temporal models when handling complex motion patterns of small infrared targets.

What would settle it

If HyperTea's detection precision or recall on a new infrared sequence dataset with highly erratic, non-smooth target motion fell below that of a strong RNN-only or graph-convolution baseline, the core performance claim would be falsified.
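The falsification test above comes down to comparing precision and recall between two detectors on the same data. The sketch below shows that comparison under stated assumptions: the helper names `precision_recall` and `claim_falsified` are hypothetical, the counts are placeholders invented for illustration (not results from the paper or any dataset), and a real evaluation would also have to fix the matching rule that decides what counts as a true positive.

```python
from typing import Tuple


def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN). Returns 0.0 when undefined."""
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


def claim_falsified(candidate_counts: Tuple[int, int, int],
                    baseline_counts: Tuple[int, int, int]) -> bool:
    """True if the candidate falls below the baseline on precision or on recall."""
    cand_p, cand_r = precision_recall(*candidate_counts)
    base_p, base_r = precision_recall(*baseline_counts)
    return cand_p < base_p or cand_r < base_r


# Placeholder (TP, FP, FN) counts, made up purely to exercise the functions.
candidate = (80, 20, 20)
baseline = (90, 10, 10)
print(precision_recall(*candidate))
print(claim_falsified(candidate, baseline))
```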

Figures

Figures reproduced from arXiv: 2508.10678 by Jie Tang, Weihua Gao, Wenlong Niu, Xiaodong Peng, Yun Li, Zhaoyuan Qi.

**Figure 1.** Overview of the proposed framework HyperTea. Our HyperTea consists of a backbone and three key modules: the …

**Figure 2.** Simplified workflow of our HyperTea. It contains the …

**Figure 3.** Details of our proposed CSAM.

… them can be captured, rather than being confined solely to the current temporal scale. In addition, with the aim of preserving more original information, we also incorporate residual blocks to embed keyframe features at different temporal scales into the query results $R$:

$$R = \mathrm{LN}\bigl(G_T + \mathrm{UP}\bigl(\mathrm{LN}(\mathrm{Conv}_{1\times 1}(\mathrm{Attn}) + \hat{L}_{st})\bigr)\bigr) \tag{14}$$

where LN is layer normalization, UP denotes the reconstruc…

**Figure 4.** Visualization comparisons of 14 methods on IRDST, with 72/266.bmp. GT is ground truth. Red and blue boxes represent …

**Figure 5.** Visualization comparisons of 14 methods on IRDST, with 6/14.bmp. GT is ground truth. Red and blue boxes represent …

**Figure 6.** Visualization comparisons of 14 methods on DAUB, with 21/433.bmp. GT is ground truth. Red and blue boxes represent …

**Figure 7.** PR curves of 16 representative detection methods on …

**Figure 8.** Effects of time window size T on HyperTea.

Original abstract

In practical application scenarios, moving infrared small target detection (MIRSTD) remains highly challenging due to the target's small size, weak intensity, and complex motion pattern. Existing methods typically only model low-order correlations between feature nodes and perform feature extraction and enhancement within a single temporal scale. Although hypergraphs have been widely used for high-order correlation learning, they have received limited attention in MIRSTD. To explore the potential of hypergraphs and enhance multi-timescale feature representation, we propose HyperTea, which integrates global and local temporal perspectives to effectively model high-order spatiotemporal correlations of features. HyperTea consists of three modules: the global temporal enhancement module (GTEM) realizes global temporal context enhancement through semantic aggregation and propagation; the local temporal enhancement module (LTEM) is designed to capture local motion patterns between adjacent frames and then enhance local temporal context; additionally, we further develop a temporal alignment module (TAM) to address potential cross-scale feature misalignment. To our best knowledge, HyperTea is the first work to integrate convolutional neural networks (CNNs), recurrent neural networks (RNNs), and hypergraph neural networks (HGNNs) for MIRSTD, significantly improving detection performance. Experiments on DAUB and IRDST demonstrate its state-of-the-art (SOTA) performance. Our source codes are available at https://github.com/Lurenjia-LRJ/HyperTea.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes HyperTea, a network for moving infrared small target detection (MIRSTD) that integrates CNN feature extraction, RNN-based temporal modeling, and HGNNs for high-order spatiotemporal correlations. It introduces three modules: GTEM for global temporal context enhancement via semantic aggregation and propagation, LTEM to capture local motion patterns between adjacent frames, and TAM to address cross-scale feature misalignment. The authors claim this is the first such CNN-RNN-HGNN integration for the task and report SOTA performance on the DAUB and IRDST datasets, with code released.

Significance. If the hypergraph components demonstrably outperform strong low-order temporal baselines on the same backbone, the work could advance MIRSTD by showing the value of high-order correlation modeling for complex target motions. The open-source code is a clear strength for reproducibility.

major comments (2)

[Experiments] Experimental section / ablation studies: no controls replace the hypergraph modules (GTEM/LTEM) with strong low-order alternatives such as multi-head self-attention or advanced multi-scale RNN variants on the identical CNN backbone. Without these, gains cannot be attributed specifically to high-order hyperedge modeling rather than general temporal enhancement, undermining the central motivation.
[Method] §3.2 and §3.3 (GTEM and LTEM descriptions): the claim that hypergraphs reliably capture high-order correlations outperforming lower-order modeling is asserted in the motivation but not isolated via targeted ablations or comparisons; this is load-bearing for the novelty of the HGNN integration.

minor comments (2)

[Abstract] Abstract: reports SOTA without any numerical margins, dataset-specific metrics, or baseline comparisons; adding one sentence with key improvements (e.g., mAP or Pd/Fa deltas) would improve clarity.
[Method] Hypergraph construction: details on how hyperedges are formed from CNN features (including any temporal scale hyperparameters) are insufficiently specified for exact reproduction.
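For context on what a full specification of hyperedge construction would need to pin down, the sketch below shows one common recipe: a distance-threshold (epsilon-ball) hypergraph, in which each feature node spawns a hyperedge containing every node within a fixed radius. This is an assumption made for illustration, not the construction used in the paper. The function names, the Euclidean metric, and the value of `epsilon` are all hypothetical.

```python
import math
from typing import List


def epsilon_ball_hyperedges(features: List[List[float]],
                            epsilon: float) -> List[List[int]]:
    """Build one hyperedge per node: the node itself plus all nodes within `epsilon`."""
    hyperedges = []
    for center in features:
        members = [j for j, other in enumerate(features)
                   if math.dist(center, other) <= epsilon]
        hyperedges.append(members)
    return hyperedges


def incidence_matrix(num_nodes: int, hyperedges: List[List[int]]) -> List[List[int]]:
    """H[v][e] = 1 if node v belongs to hyperedge e, else 0."""
    return [[1 if v in members else 0 for members in hyperedges]
            for v in range(num_nodes)]


# Toy feature vectors (hypothetical): two nearby nodes and one distant node.
nodes = [[0.0, 0.0], [0.5, 0.0], [3.0, 0.0]]
edges = epsilon_ball_hyperedges(nodes, epsilon=1.0)
H = incidence_matrix(len(nodes), edges)
print(edges)
print(H)
```

Even this simple recipe exposes several choices a reader would need for exact reproduction: the distance metric, the threshold, whether nodes are pooled across frames or scales before thresholding, and whether duplicate hyperedges are merged.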

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and insightful comments. We address each major comment below and will revise the manuscript to strengthen the experimental validation of the hypergraph components.

Referee: [Experiments] Experimental section / ablation studies: no controls replace the hypergraph modules (GTEM/LTEM) with strong low-order alternatives such as multi-head self-attention or advanced multi-scale RNN variants on the identical CNN backbone. Without these, gains cannot be attributed specifically to high-order hyperedge modeling rather than general temporal enhancement, undermining the central motivation.

Authors: We agree that the current ablations do not fully isolate the contribution of high-order hyperedge modeling. In the revised version we will add new experiments that replace the GTEM and LTEM hypergraph modules with strong low-order baselines (multi-head self-attention and advanced multi-scale RNN variants) while keeping the identical CNN backbone and all other components fixed. These results will be reported in an expanded ablation table.
revision: yes

Referee: [Method] §3.2 and §3.3 (GTEM and LTEM descriptions): the claim that hypergraphs reliably capture high-order correlations outperforming lower-order modeling is asserted in the motivation but not isolated via targeted ablations or comparisons; this is load-bearing for the novelty of the HGNN integration.

Authors: We acknowledge that the motivation section asserts the benefit of high-order modeling without dedicated isolation experiments. We will insert targeted ablation studies in the revision that directly compare hypergraph-based GTEM/LTEM against their low-order counterparts on the same backbone, thereby providing empirical support for the novelty claim of the CNN-RNN-HGNN integration.
revision: yes

Circularity Check

0 steps flagged

No circularity: architectural proposal evaluated on external datasets

The paper proposes HyperTea as an integration of CNNs, RNNs, and HGNNs with modules GTEM, LTEM, and TAM to model high-order spatiotemporal correlations for MIRSTD. Claims of novelty and SOTA performance rest on experiments using public external datasets (DAUB, IRDST) rather than any self-referential equations or fitted parameters. No derivation reduces reported metrics to inputs by construction, and no load-bearing self-citations or uniqueness theorems from prior author work are invoked in the provided text. The central contribution is an empirical architectural design whose validity is tested independently of its own definitions.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 3 invented entities

The method rests on standard deep-learning assumptions plus the domain-specific premise that hypergraphs are an effective inductive bias for infrared motion; no new physical entities are postulated.

free parameters (2)

hypergraph construction hyperparameters
Number of hyperedges, vertex grouping strategy, and aggregation weights are chosen during architecture design and training.
temporal scale parameters
Window sizes for global versus local modules and alignment offsets are selected to fit the target motion statistics.
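The temporal scale parameters can be pictured as two ways of indexing the same frame sequence: a longer window feeding the global module and adjacent-frame pairs feeding the local one. The sketch below illustrates that under stated assumptions: the helper names are hypothetical, and placing the keyframe at the end of each window is a guess made for illustration rather than something the source specifies.

```python
from typing import List, Tuple


def temporal_windows(num_frames: int, window: int) -> List[List[int]]:
    """Frame-index windows of length `window`, one per keyframe (assumed to be the last frame)."""
    return [list(range(t - window + 1, t + 1))
            for t in range(window - 1, num_frames)]


def adjacent_pairs(frame_ids: List[int]) -> List[Tuple[int, int]]:
    """Consecutive frame pairs inside one window, the granularity of local motion."""
    return list(zip(frame_ids, frame_ids[1:]))


# Hypothetical setting: a 6-frame clip with time window size T = 4.
windows = temporal_windows(num_frames=6, window=4)
print(windows)
print(adjacent_pairs(windows[0]))
```

A larger window gives the global module more context per keyframe but yields fewer usable keyframes per clip and more computation per window.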

axioms (2)

domain assumption: High-order correlations among feature nodes improve detection of complex motion patterns over pairwise modeling.
Invoked in the introduction when motivating hypergraphs for MIRSTD.
domain assumption: Cross-scale feature misalignment can be corrected by a dedicated alignment module without introducing new artifacts.
Stated as motivation for the TAM module.

invented entities (3)

Global Temporal Enhancement Module (GTEM) · no independent evidence
purpose: Semantic aggregation and propagation across the entire sequence
New architectural component introduced to realize global temporal context.
Local Temporal Enhancement Module (LTEM) · no independent evidence
purpose: Capture local motion patterns between adjacent frames
New architectural component for local temporal context.
Temporal Alignment Module (TAM) · no independent evidence
purpose: Address potential cross-scale feature misalignment
New component to keep multi-timescale features consistent.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

Relation between the paper passage and the cited Recognition theorem.

integrates CNNs, RNNs, and HGNNs ... GTEM ... LTEM ... TAM ... high-order spatiotemporal correlations

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

60 extracted references · 60 canonical work pages

[1]

Infrared small target segmentation networks: A survey,

R. Kou, C. Wang, Z. Peng, Z. Zhao, Y . Chen, J. Han, F. Huang, Y . Yu, and Q. Fu, “Infrared small target segmentation networks: A survey,” Pattern Recognition, vol. 143, p. 109788, Nov. 2023

work page 2023
[2]

A local contrast method for small infrared target detection,

C. L. P. Chen, H. Li, Y . Wei, T. Xia, and Y . Y . Tang, “A local contrast method for small infrared target detection,” IEEE Transactions on Geoscience and Remote Sensing , vol. 52, no. 1, pp. 574–581, Jan. 2014

work page 2014
[3]

The design of top-hat morphological filter and application to infrared target detection,

M. Zeng, J. Li, and Z. Peng, “The design of top-hat morphological filter and application to infrared target detection,” Infrared Physics & Technology, vol. 48, no. 1, pp. 67–76, Apr. 2006

work page 2006
[4]

Infrared patch-image model for small target detection in a single image,

C. Gao, D. Meng, Y . Yang, Y . Wang, X. Zhou, and A. G. Hauptmann, “Infrared patch-image model for small target detection in a single image,” IEEE Transactions on Image Processing , vol. 22, no. 12, pp. 4996–5009, Dec. 2013

work page 2013
[5]

Infrared small target detection based on partial sum of the tensor nuclear norm,

L. Zhang and Z. Peng, “Infrared small target detection based on partial sum of the tensor nuclear norm,” Remote Sensing, vol. 11, no. 4, p. 382, Jan. 2019

work page 2019
[6]

Dim small target detection and tracking: A novel method based on temporal energy selective scaling and trajectory association,

W. Gao, W. Niu, W. Lu, P. Wang, Z. Qi, X. Peng, and Z. Yang, “Dim small target detection and tracking: A novel method based on temporal energy selective scaling and trajectory association,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing , vol. 17, pp. 17 239–17 262

work page
[7]

Asymmetric contextual modulation for infrared small target detection,

Y . Dai, Y . Wu, F. Zhou, and K. Barnard, “Asymmetric contextual modulation for infrared small target detection,” in 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), Jan. 2021, pp. 949–958

work page 2021
[8]

Dense nested attention network for infrared small target detection,

B. Li, C. Xiao, L. Wang, Y . Wang, Z. Lin, M. Li, W. An, and Y . Guo, “Dense nested attention network for infrared small target detection,” IEEE Transactions on Image Processing, vol. 32, pp. 1745–1758, 2023

work page 2023
[9]

Sstnet: Sliced spatio- temporal network with cross-slice convlstm for moving infrared dim- small target detection,

S. Chen, L. Ji, J. Zhu, M. Ye, and X. Yao, “Sstnet: Sliced spatio- temporal network with cross-slice convlstm for moving infrared dim- small target detection,” IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–12, 2024

work page 2024
[10]

Deep residual learning for image recognition,

K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV , USA: IEEE, Jun. 2016, pp. 770– 778

work page 2016
[11]

Imagenet classification with deep convolutional neural networks,

A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,”Commun. ACM, vol. 60, no. 6, pp. 84–90, May 2017

work page 2017
[12]

End-to-end object detection with transformers,

N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, “End-to-end object detection with transformers,” Berlin, Heidelberg, Aug. 2020, pp. 213–229

work page 2020
[13]

An image is worth 16x16 words: Trans- formers for image recognition at scale,

A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16x16 words: Trans- formers for image recognition at scale,” in International Conference on Learning Representations, Oct. 2020

work page 2020
[14]

Video swin transformer,

Z. Liu, J. Ning, Y . Cao, Y . Wei, Z. Zhang, S. Lin, and H. Hu, “Video swin transformer,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , 2022, pp. 3202–3211

work page 2022
[15]

Convolutional lstm network: A machine learning approach for precipitation nowcasting,

X. SHI, Z. Chen, H. Wang, D.-Y . Yeung, W.-k. Wong, and W.-c. WOO, “Convolutional lstm network: A machine learning approach for precipitation nowcasting,” inAdvances in Neural Information Processing Systems, vol. 28, 2015

work page 2015
[16]

Uiu-net: U-net in u-net for infrared small object detection,

X. Wu, D. Hong, and J. Chanussot, “Uiu-net: U-net in u-net for infrared small object detection,”IEEE Transactions on Image Processing, vol. 32, pp. 364–376, 2023

work page 2023
[17]

Attention-guided pyramid context networks for detecting infrared small target under complex background,

T. Zhang, L. Li, S. Cao, T. Pu, and Z. Peng, “Attention-guided pyramid context networks for detecting infrared small target under complex background,” IEEE Transactions on Aerospace and Electronic Systems , vol. 59, no. 4, pp. 4250–4261, Aug. 2023

work page 2023
[18]

Sctransnet: Spatial- channel cross transformer network for infrared small target detection,

S. Yuan, H. Qin, X. Yan, N. Akhtar, and A. Mian, “Sctransnet: Spatial- channel cross transformer network for infrared small target detection,” IEEE Transactions on Geoscience and Remote Sensing , vol. 62, pp. 1– 15, 2024. 12

work page 2024
[19]

St- trans: Spatial-temporal transformer for infrared small target detection in sequential images,

X. Tong, Z. Zuo, S. Su, J. Wei, X. Sun, P. Wu, and Z. Zhao, “St- trans: Spatial-temporal transformer for infrared small target detection in sequential images,” IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–19, 2024

work page 2024
[20]

Ir-transdet: Infrared dim and small target detection with ir-transformer,

J. Lin, S. Li, L. Zhang, X. Yang, B. Yan, and Z. Meng, “Ir-transdet: Infrared dim and small target detection with ir-transformer,” IEEE Transactions on Geoscience and Remote Sensing , vol. 61, pp. 1–13, 2023

work page 2023
[21]

Toward dense moving infrared small target detection: New datasets and baseline,

S. Chen, L. Ji, S. Zhu, M. Ye, H. Ren, and Y . Sang, “Toward dense moving infrared small target detection: New datasets and baseline,”IEEE Transactions on Geoscience and Remote Sensing , vol. 62, pp. 1–13, 2024

work page 2024
[22]

Hgnn+: General hypergraph neural networks,

Y . Gao, Y . Feng, S. Ji, and R. Ji, “Hgnn+: General hypergraph neural networks,” IEEE Transactions on Pattern Analysis and Machine Intelli- gence, vol. 45, no. 3, pp. 3181–3199, Mar. 2023

work page 2023
[23]

Hyper-yolo: When visual object detection meets hypergraph computation,

Y . Feng, J. Huang, S. Du, S. Ying, J.-H. Yong, Y . Li, G. Ding, R. Ji, and Y . Gao, “Hyper-yolo: When visual object detection meets hypergraph computation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1–14, 2025

work page 2025
[24]

Hypergraph learning: Methods and practices,

Y . Gao, Z. Zhang, H. Lin, X. Zhao, S. Du, and C. Zou, “Hypergraph learning: Methods and practices,”IEEE Transactions on Pattern Analysis and Machine Intelligence , vol. 44, no. 5, pp. 2548–2566, 2022

work page 2022
[25]

Lbsn2vec++: Het- erogeneous hypergraph embedding for location-based social networks,

D. Yang, B. Qu, J. Yang, and P. Cudr ´e-Mauroux, “Lbsn2vec++: Het- erogeneous hypergraph embedding for location-based social networks,” IEEE Transactions on Knowledge and Data Engineering , vol. 34, no. 4, pp. 1843–1855, 2022

work page 2022
[26]

Hypergraph factorization for multi-tissue gene expression imputation,

R. Vi ˜nas, C. K. Joshi, D. Georgiev, P. Lin, B. Dumitrascu, E. R. Gamazon, and P. Li `o, “Hypergraph factorization for multi-tissue gene expression imputation,” Nature Machine Intelligence , vol. 5, no. 7, pp. 739–753, Jul. 2023

work page 2023
[27]

Multi-hypergraph learning-based brain functional connectivity analysis in fmri data,

L. Xiao, J. Wang, P. H. Kassani, Y . Zhang, Y . Bai, J. M. Stephen, T. W. Wilson, V . D. Calhoun, and Y .-P. Wang, “Multi-hypergraph learning-based brain functional connectivity analysis in fmri data,” IEEE Transactions on Medical Imaging , vol. 39, no. 5, pp. 1746–1758, May 2020

work page 2020
[28]

Max-mean and max-median filters for detection of small targets,

S. D. Deshpande, M. H. Er, R. Venkateswarlu, and P. Chan, “Max-mean and max-median filters for detection of small targets,” in SPIE’s Interna- tional Symposium on Optical Science, Engineering, and Instrumentation, Denver, CO, Oct. 1999, pp. 74–83

work page 1999
[29]

Infrared small target detection utilizing the multiscale relative local contrast measure,

J. Han, K. Liang, B. Zhou, X. Zhu, J. Zhao, and L. Zhao, “Infrared small target detection utilizing the multiscale relative local contrast measure,” IEEE Geoscience and Remote Sensing Letters , vol. 15, no. 4, pp. 612– 616, Apr. 2018

work page 2018
[30]

Infrared small target detection based on the weighted strengthened local contrast measure,

J. Han, S. Moradi, I. Faramarzi, H. Zhang, Q. Zhao, X. Zhang, and N. Li, “Infrared small target detection based on the weighted strengthened local contrast measure,” IEEE Geoscience and Remote Sensing Letters , vol. 18, no. 9, pp. 1670–1674, Sep. 2021

work page 2021
[31]

A local contrast method for infrared small-target detection utilizing a tri-layer window,

J. Han, S. Moradi, I. Faramarzi, C. Liu, H. Zhang, and Q. Zhao, “A local contrast method for infrared small-target detection utilizing a tri-layer window,”IEEE Geoscience and Remote Sensing Letters , vol. 17, no. 10, pp. 1822–1826, Oct. 2020

work page 2020
[32]

U-net: Convolutional networks for biomedical image segmentation,

O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 , Cham, 2015, pp. 234– 241

work page 2015
[33]

Isnet: Shape matters for infrared small target detection,

M. Zhang, R. Zhang, Y . Yang, H. Bai, J. Zhang, and J. Guo, “Isnet: Shape matters for infrared small target detection,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , Jun. 2022, pp. 867–876

work page 2022
[34]

Receptive-field and direction induced attention network for infrared dim small target detection with a large-scale dataset irdst,

H. Sun, J. Bai, F. Yang, and X. Bai, “Receptive-field and direction induced attention network for infrared dim small target detection with a large-scale dataset irdst,” IEEE Transactions on Geoscience and Remote Sensing, vol. 61, pp. 1–13, 2023

work page 2023
[35]

Rkformer: Runge-kutta transformer with random-connection attention for infrared small target detection,

M. Zhang, H. Bai, J. Zhang, R. Zhang, C. Wang, J. Guo, and X. Gao, “Rkformer: Runge-kutta transformer with random-connection attention for infrared small target detection,” in Proceedings of the 30th ACM International Conference on Multimedia , New York, NY , USA, Oct. 2022, pp. 1730–1738

work page 2022
[36]

Abmnet: Coupling transformer with cnn based on adams-bashforth-moulton method for infrared small target detection,

T. Chen, Q. Chu, Z. Tan, B. Liu, and N. Yu, “Abmnet: Coupling transformer with cnn based on adams-bashforth-moulton method for infrared small target detection,” in 2023 IEEE International Conference on Multimedia and Expo (ICME) , Brisbane, Australia, Jul. 2023, pp. 1901–1906

work page 2023
[37]

Monte carlo linear clustering with single-point supervision is enough for infrared small target detection,

B. Li, Y . Wang, L. Wang, F. Zhang, T. Liu, Z. Lin, W. An, and Y . Guo, “Monte carlo linear clustering with single-point supervision is enough for infrared small target detection,” in 2023 IEEE/CVF International Conference on Computer Vision (ICCV) , Oct. 2023, pp. 1009–1019

work page 2023
[38]

Mapping degeneration meets label evolution: Learning in- frared small target detection with single point supervision,

X. Ying, L. Liu, Y . Wang, R. Li, N. Chen, Z. Lin, W. Sheng, and S. Zhou, “Mapping degeneration meets label evolution: Learning in- frared small target detection with single point supervision,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, Jun. 2023, pp. 15 528–15 538

work page 2023
[39]

Label evolution based on local contrast measure for single-point supervised infrared small-target detection,

D. Yang, H. Zhang, Y . Li, and Z. Jiang, “Label evolution based on local contrast measure for single-point supervised infrared small-target detection,” IEEE Transactions on Geoscience and Remote Sensing , vol. 62, pp. 1–12, 2024

work page 2024
[40]

A level set annotation framework with single-point supervision for infrared small target detection,

H. Li, J. Yang, Y . Xu, and R. Wang, “A level set annotation framework with single-point supervision for infrared small target detection,” IEEE Signal Processing Letters , vol. 31, pp. 451–455, 2024

work page 2024
[41]

Mcgc: A multiscale chain growth clustering algorithm for generating infrared small target mask under single-point supervision,

R. Kou, C. Wang, Q. Fu, Z. Li, Y . Luo, B. Li, W. Li, and Z. Peng, “Mcgc: A multiscale chain growth clustering algorithm for generating infrared small target mask under single-point supervision,” IEEE Transactions on Geoscience and Remote Sensing , vol. 62, pp. 1–12, 2024

work page 2024
[42]

Point-to-point regression: Accurate infrared small target detection with single-point annotation,

R. Ni, J. Wu, Z. Qiu, L. Chen, C. Luo, F. Huang, Q. Liu, B. Wang, Y . Li, and Y . Li, “Point-to-point regression: Accurate infrared small target detection with single-point annotation,” IEEE Transactions on Geoscience and Remote Sensing , vol. 63, pp. 1–19, 2025

work page 2025
[43]

Sirst-5k: Exploring massive negatives synthesis with self-supervised learning for robust infrared small target detection,

Y . Lu, Y . Lin, H. Wu, X. Xian, Y . Shi, and L. Lin, “Sirst-5k: Exploring massive negatives synthesis with self-supervised learning for robust infrared small target detection,” IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–11, 2024

work page 2024
[44]

Mim-istd: Mamba-in-mamba for efficient infrared small- target detection,

T. Chen, Z. Ye, Z. Tan, T. Gong, Y . Wu, Q. Chu, B. Liu, N. Yu, and J. Ye, “Mim-istd: Mamba-in-mamba for efficient infrared small- target detection,”IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–13, 2024

work page 2024
[45]

Irsam: Advancing segment anything model for infrared small target detection,

M. Zhang, Y . Wang, J. Guo, Y . Li, X. Gao, and J. Zhang, “Irsam: Advancing segment anything model for infrared small target detection,” in ECCV 2024, Cham, 2025, vol. 15125, pp. 233–249

work page 2024
[46]

Stdmanet: Spatio-temporal differential multiscale attention network for small mov- ing infrared target detection,

P. Yan, R. Hou, X. Duan, C. Yue, X. Wang, and X. Cao, “Stdmanet: Spatio-temporal differential multiscale attention network for small mov- ing infrared target detection,” IEEE Transactions on Geoscience and Remote Sensing, vol. 61, pp. 1–16, 2023

work page 2023
[47]

Direction- coded temporal u-shape module for multiframe infrared small target de- tection,

R. Li, W. An, C. Xiao, B. Li, Y . Wang, M. Li, and Y . Guo, “Direction- coded temporal u-shape module for multiframe infrared small target de- tection,” IEEE Transactions on Neural Networks and Learning Systems , pp. 1–14, 2025

work page 2025
[48]

Tmp: Temporal motion perception with spatial auxiliary enhancement for moving infrared dim- small target detection,

S. Zhu, L. Ji, J. Zhu, S. Chen, and W. Duan, “Tmp: Temporal motion perception with spatial auxiliary enhancement for moving infrared dim- small target detection,” Expert Systems with Applications , vol. 255, p. 124731, Dec. 2024

work page 2024
[49]

Triple-domain feature learning with frequency-aware memory enhancement for moving in- frared small target detection,

W. Duan, L. Ji, S. Chen, S. Zhu, and M. Ye, “Triple-domain feature learning with frequency-aware memory enhancement for moving in- frared small target detection,” IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–14, 2024

work page 2024
[50]

Semi-supervised multiview prototype learning with motion reconstruction for moving infrared small target detection,

W. Duan, L. Ji, J. Huang, S. Chen, S. Peng, S. Zhu, and M. Ye, “Semi-supervised multiview prototype learning with motion reconstruction for moving infrared small target detection,” IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1–15, 2025

[51]

Motion prior knowledge learning with homogeneous language descriptions for moving infrared small target detection,

S. Chen, L. Ji, W. Duan, S. Peng, and M. Ye, “Motion prior knowledge learning with homogeneous language descriptions for moving infrared small target detection,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 2, pp. 2186–2194, Apr. 2025

[52]

Yolox: Exceeding yolo series in 2021,

Z. Ge, S. Liu, F. Wang, Z. Li, and J. Sun, “Yolox: Exceeding yolo series in 2021,” Aug. 2021

[53]

Predrnn: A recurrent neural network for spatiotemporal predictive learning,

Y. Wang, H. Wu, J. Zhang, Z. Gao, J. Wang, P. S. Yu, and M. Long, “Predrnn: A recurrent neural network for spatiotemporal predictive learning,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 2, pp. 2208–2225, Feb. 2023

[54]

Swinlstm: Improving spatiotemporal prediction accuracy using swin transformer and lstm,

S. Tang, C. Li, P. Zhang, and R. Tang, “Swinlstm: Improving spatiotemporal prediction accuracy using swin transformer and lstm,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 13 470–13 479

[55]

Attentional local contrast networks for infrared small target detection,

Y. Dai, Y. Wu, F. Zhou, and K. Barnard, “Attentional local contrast networks for infrared small target detection,” IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 11, pp. 9813–9824, Nov. 2021

[56]

Mtu-net: Multilevel transunet for space-based infrared tiny ship detection,

T. Wu, B. Li, Y. Luo, Y. Wang, C. Xiao, T. Liu, J. Yang, W. An, and Y. Guo, “Mtu-net: Multilevel transunet for space-based infrared tiny ship detection,” IEEE Transactions on Geoscience and Remote Sensing, vol. 61, pp. 1–15, 2023

[57]

Infrared small target detection with scale and location sensitivity,

Q. Liu, R. Liu, B. Zheng, H. Wang, and Y. Fu, “Infrared small target detection with scale and location sensitivity,” in 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 17 490–17 499

[58]

Rpcanet: Deep unfolding rpca based infrared small target detection,

F. Wu, T. Zhang, L. Li, Y. Huang, and Z. Peng, “Rpcanet: Deep unfolding rpca based infrared small target detection,” in 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 4797–4806

[59]

Pick of the bunch: Detecting infrared small targets beyond hit-miss trade-offs via selective rank-aware attention,

Y. Dai, P. Pan, Y. Qian, Y. Li, X. Li, J. Yang, and H. Wang, “Pick of the bunch: Detecting infrared small targets beyond hit-miss trade-offs via selective rank-aware attention,” IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–15, 2024

[60]

A dataset for infrared image dim-small aircraft target detection and tracking under ground / air background,

B. Hui, Z. Song, H. Fan, P. Zhong, W. Hu, X. Zhang, J. Lin, H. Su, W. Jin, Y. Zhang, and Y. Bai, “A dataset for infrared image dim-small aircraft target detection and tracking under ground / air background,” Oct. 2019

