Revisiting the Scale Loss Function and Gaussian-Shape Convolution for Infrared Small Target Detection

Hao Li; Man Fung Zhuo

arxiv: 2604.09991 · v1 · submitted 2026-04-11 · 💻 cs.CV


Pith reviewed 2026-05-10 16:24 UTC · model grok-4.3

classification 💻 cs.CV

keywords infrared small target detection · scale loss function · Gaussian convolution · monotonic gradients · spatial attention · rotated pinwheel mask · target detection


The pith

A diff-based scale loss and Gaussian-shaped convolution improve infrared small target detection by stabilizing training and matching target profiles.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to fix unstable training from scale losses that lack monotonic gradients and weak spatial focus from standard convolutions that ignore how small targets concentrate intensity in infrared images. It replaces the loss with one based on signed area differences between predicted and true masks to guarantee consistent gradient directions, and replaces kernels with Gaussian-shaped ones that learn a scale parameter while using a rotated pinwheel mask to adapt to target direction. A reader would care because reliable detection of tiny infrared signals matters for surveillance, tracking, and warning systems. The authors test the combination on three public datasets and report gains in overlap and detection rates over prior work.

Core claim

The authors claim that weighting predictions by the signed area difference between the predicted mask and ground truth produces strictly monotonic gradients for stable convergence, unlike earlier scale losses. They further claim that Gaussian-shaped convolution with a learnable scale parameter, combined with a rotated pinwheel mask aligned through a straight-through estimator, better captures the center-concentrated intensity profile of infrared small targets than generic kernels, yielding higher mIoU, Pd, and lower Fa on IRSTD-1k, NUDT-SIRST, and SIRST-UAVB.
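The three metrics named in the claim can be made concrete with a small sketch. This page does not define them, so the code assumes definitions commonly used in the infrared small-target literature: IoU as intersection over union of binary masks (mIoU would average it over images), Pd as the fraction of ground-truth targets that are detected, and Fa as the fraction of image pixels that are false positives. The function names and the simplified "any overlapping pixel counts as a detection" rule are illustrative assumptions, not the authors' evaluation code.

```python
from collections import deque

def components(mask):
    """Return the 8-connected components of a binary mask as lists of (row, col)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, blob = deque([(r, c)]), []
                while queue:
                    y, x = queue.popleft()
                    blob.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and mask[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                blobs.append(blob)
    return blobs

def iou(pred, target):
    """Intersection over union of two binary masks of equal shape."""
    pairs = [(p, t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    inter = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    return inter / union if union else 1.0

def detection_stats(pred, target):
    """Return (Pd, Fa) for one image under the simplified matching rule (assumption)."""
    targets = components(target)
    hit = sum(1 for blob in targets if any(pred[y][x] for y, x in blob))
    false_pixels = sum(1 for pr, tr in zip(pred, target)
                       for p, t in zip(pr, tr) if p and not t)
    pd = hit / len(targets) if targets else 1.0
    return pd, false_pixels / (len(pred) * len(pred[0]))

# Toy 4 x 4 example: one two-pixel target, one correct pixel, one stray pixel.
target = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
pred = [[0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 1]]
print(iou(pred, target), detection_stats(pred, target))
```

Published benchmarks usually match predictions to targets with a centroid-distance threshold rather than pixel overlap, so numbers from this sketch are not comparable to the paper's tables.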

What carries the argument

The diff-based scale loss, which weights by signed area difference to enforce monotonic gradients, together with Gaussian-shaped convolution that uses a learnable scale and a rotated pinwheel mask for orientation alignment.
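To illustrate the Gaussian-shaped prior, the sketch below builds a normalized 2D Gaussian kernel whose width is set by a single scale parameter `sigma`. The 7 × 7 size comes from the Figure 4 caption; the normalization, the function name, and the two sample values of `sigma` are assumptions made for illustration. In the paper the scale is learned during training rather than chosen by hand, and the kernel is further combined with the rotated pinwheel mask, which is not modeled here.

```python
import math

def gaussian_kernel(size=7, sigma=1.0):
    """Return a size x size Gaussian kernel (list of rows) that sums to 1."""
    half = size // 2
    raw = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
            for x in range(size)]
           for y in range(size)]
    total = sum(map(sum, raw))
    return [[v / total for v in row] for row in raw]

# A small sigma concentrates weight at the centre; a large sigma spreads it out.
narrow = gaussian_kernel(sigma=0.8)
wide = gaussian_kernel(sigma=2.5)
print(f"centre weight, sigma=0.8: {narrow[3][3]:.3f}")
print(f"centre weight, sigma=2.5: {wide[3][3]:.3f}")
```

Treating `sigma` as a trainable value lets the attention footprint shrink toward point-like targets or widen for more extended ones without learning all 49 weights independently.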

Load-bearing premise

The signed area difference between any predicted and ground-truth mask always produces strictly monotonic gradients, and the intensity distribution of infrared small targets is adequately captured by a center-concentrated Gaussian profile.

What would settle it

A training run on any of the three datasets in which the proposed loss produces non-monotonic gradients for some mask configurations, or in which the Gaussian kernel method shows no improvement over baselines on targets whose intensity profiles deviate from a Gaussian shape.

Figures

Figures reproduced from arXiv: 2604.09991 by Hao Li, Man Fung Zhuo.

**Figure 1.** Comparison between our monotonic diff-based scale loss and conventional non-monotonic scale losses. The proposed diff-based loss exhibits strict monotonicity with respect to scale deviation, stably penalizing mismatches between predicted and ground-truth target areas and naturally guiding optimization toward the optimal center, thus eliminating unstable gradients and ensuring stable convergence during trai…

**Figure 2.** Illustration of the Gaussian-like intensity distribution inherent to IRSTs. As shown, small targets exhibit a distinct center-concentrated, smoothly decaying grayscale pattern, which motivates the design of Gaussian-shaped spatial attention or convolution to align with this natural imaging characteristic, rather than using generic fully learned receptive fields.

**Figure 3.** Illustration of the directional morphological diversity of IRSTs. As shown, IRSTs exhibit varied directional structures (horizontal, vertical, or oblique point-like/elongated shapes), which motivates the use of learnable rotated pinwheel masks to adaptively align spatial attention with target orientation, complementing Gaussian-shaped convolution to better match the diverse directional properties of small …

**Figure 4.** Overview of the proposed framework. The U-Net encoder–decoder backbone applies channel attention and Gaussian-shaped spatial attention within each residual block. The 7 × 7 spatial attention kernel is constructed by combining a Gaussian prior (learnable σ) with a learnable rotated pinwheel mask whose orientation θ is optimized via a straight-through estimator, as illustrated in the upper branch. The final …

**Figure 5.** Visualization of the four scale weighting functions. Each row corresponds to one variant (Diff-based, Var-based, Mobius, Var-denominator). Left: contour map over the (Ap, At) plane. Middle: 3D surface. Right: anti-diagonal cross-section with Ap + At = 10, showing the weight value as Ap varies. Only the Diff-based weight decays strictly and monotonically away from Ap = At, while the Var-based weight is non-mo…

**Figure 6.** Qualitative segmentation comparison. Each row shows one test scene. Zoomed insets highlight the target region. Green pixels indicate true positives, red pixels indicate false positives, and yellow pixels indicate false negatives. L1-GP-Rotated consistently produces clean segmentation masks with fewer false alarms and missed detections compared to competing methods.

**Figure 7.** 3D prediction heatmap comparison. Each bar represents a detected blob: green bars are true detections at the correct location, red bars are false alarms. L1-GP-Rotated yields a single clean green bar aligned with the ground-truth target, demonstrating significantly better false alarm suppression.
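The monotonicity property described in the Figure 5 caption can be probed numerically. The sketch below walks the anti-diagonal Ap + At = 10 from that caption and tests whether a weight decays strictly as Ap moves away from At. Both weighting functions are hypothetical stand-ins, since this page does not reproduce the paper's formulas: `diff_weight` depends only on the absolute area difference, and `bumpy_weight` adds an oscillation purely as a negative control.

```python
import math

def diff_weight(ap, at):
    """Hypothetical diff-based weight: 1 at ap == at, decaying with |ap - at|."""
    return 1.0 - abs(ap - at) / (ap + at)

def bumpy_weight(ap, at):
    """Hypothetical non-monotonic weight, used only as a negative control."""
    d = abs(ap - at)
    return math.exp(-d / 5.0) * (1.0 + 0.5 * math.sin(2.0 * d))

def strictly_decays_from_match(weight, total=10.0, steps=200):
    """Check strict decay along ap + at = total as the area gap grows.

    Only the side ap >= at is sampled; both example weights are symmetric
    in their arguments, so the other side behaves identically.
    """
    values = []
    for i in range(steps + 1):
        gap = total * i / (steps + 1)   # area gap in [0, total)
        ap = (total + gap) / 2.0
        values.append(weight(ap, total - ap))
    return all(a > b for a, b in zip(values, values[1:]))

print("diff_weight strictly decays:", strictly_decays_from_match(diff_weight))
print("bumpy_weight strictly decays:", strictly_decays_from_match(bumpy_weight))
```

A sampled check like this can only falsify monotonicity, not prove it; a proof would need the derivative of the actual weighting function.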

Original abstract

Infrared small target detection still faces two persistent challenges: training instability from non-monotonic scale loss functions, and inadequate spatial attention due to generic convolution kernels that ignore the physical imaging characteristics of small targets. In this paper, we revisit both aspects. For the loss side, we propose a \emph{diff-based scale loss} that weights predictions according to the signed area difference between the predicted mask and the ground truth, yielding strictly monotonic gradients and stable convergence. We further analyze a family of four scale loss variants to understand how their geometric properties affect detection behavior. For the spatial side, we introduce \emph{Gaussian-shaped convolution} with a learnable scale parameter to match the center-concentrated intensity profile of infrared small targets, and augment it with a \emph{rotated pinwheel mask} that adaptively aligns the kernel with target orientation via a straight-through estimator. Extensive experiments on IRSTD-1k, NUDT-SIRST, and SIRST-UAVB demonstrate consistent improvements in $mIoU$, $P_d$, and $F_a$ over state-of-the-art methods. We release our anonymous code and pretrained models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The diff-based loss and Gaussian pinwheel kernel give measurable gains on standard IR small-target benchmarks, but the monotonic-gradient claim rests on an assumption that may not survive all discrete mask cases.


The paper's core moves are a diff-based scale loss that weights by signed area difference between prediction and ground truth, plus Gaussian-shaped convolution with a learnable scale and a rotated pinwheel mask passed through a straight-through estimator. Both are presented as fixes for non-monotonic training and for kernels that ignore the center-heavy profile of infrared targets. The authors also walk through four geometric variants of the scale loss. On IRSTD-1k, NUDT-SIRST, and SIRST-UAVB the method reports better mIoU, Pd, and Fa than recent baselines, and the authors release code and models, which helps reproducibility. That combination of targeted engineering and public artifacts is the useful part.

The soft spot is the monotonicity guarantee. The loss is said to produce strictly monotonic gradients for any mask configuration, but the stress-test concern is real: on a discrete grid, partial overlaps or non-convex predictions can make the signed difference non-monotonic with respect to overlap quality, so the gradient can plateau or reverse even as IoU improves. The paper analyzes the four variants geometrically but does not appear to include exhaustive enumeration or counter-example search over mask topologies, leaving the “any configuration” claim as an assumption rather than a verified property. No derivation of the gradient behavior is supplied in the abstract, and the full text would need to show that the discrete case is handled.

This is a narrow but operationally relevant task, so the empirical gains matter more than a first-principles derivation would. The work is for people already working on infrared small-target detection who need incremental reliability improvements rather than a new theoretical framework. It is coherent on its own terms and shows honest engagement with the usual baselines, so it deserves a serious referee even if the monotonicity point requires tightening or qualification in revision.
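The exhaustive enumeration that the note finds missing is cheap on small grids. As a minimal sketch of what such a search could look like, the code below enumerates every binary prediction on a 3 × 3 grid and collects those whose signed area difference from the target is zero even though the overlap is imperfect. It examines the area-difference term in isolation, which is an assumption made for illustration; the paper's full loss may combine that term with an overlap term that removes these cases. The grid size, target layout, and function names are hypothetical.

```python
from itertools import product

def area(mask):
    """Number of foreground pixels in a flattened binary mask."""
    return sum(mask)

def iou(pred, target):
    """Intersection over union of two flattened binary masks."""
    inter = sum(p & t for p, t in zip(pred, target))
    union = sum(p | t for p, t in zip(pred, target))
    return inter / union if union else 1.0

def area_blind_spots(target):
    """Return every prediction with zero signed area difference but IoU < 1."""
    hits = []
    for pred in product((0, 1), repeat=len(target)):
        signed_diff = area(pred) - area(target)
        if signed_diff == 0 and iou(pred, target) < 1.0:
            hits.append(pred)
    return hits

# 3 x 3 grid flattened row by row: centre pixel plus its right neighbour.
target = (0, 0, 0,
          0, 1, 1,
          0, 0, 0)
spots = area_blind_spots(target)
print(len(spots), "predictions have zero area difference but imperfect overlap")
print("worst IoU among them:", min(iou(s, target) for s in spots))
```

The same loop structure extends to checking gradient sign or loss ordering against IoU once the actual loss definition is plugged in, although the number of masks grows as 2^n with the pixel count.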

Referee Report

2 major / 2 minor

Summary. The paper claims that a diff-based scale loss using signed area difference between predicted and ground-truth masks produces strictly monotonic gradients for stable training in infrared small target detection; it geometrically analyzes four scale-loss variants, introduces Gaussian-shaped convolution with a learnable scale parameter plus a rotated pinwheel mask aligned via straight-through estimator to better match target intensity profiles, and reports consistent gains in mIoU, Pd, and Fa over SOTA on IRSTD-1k, NUDT-SIRST, and SIRST-UAVB.

Significance. If the monotonicity property and Gaussian-profile assumption hold, the approach could stabilize training and improve spatial attention for small-target tasks in surveillance and remote sensing; the multi-dataset evaluation and release of code/pretrained models are positive for reproducibility and allow direct comparison.

major comments (2)

[diff-based scale loss and variant analysis] The central claim that signed-area-difference weighting yields strictly monotonic gradients for any mask configuration (abstract and loss-function section) is load-bearing for the stability and performance assertions, yet the geometric analysis of the four variants supplies no exhaustive enumeration, counter-example search, or discrete-grid verification; partial overlaps, boundary pixels, or non-convex predictions can produce non-monotonic loss values or gradient reversals even as IoU improves.
[experimental results] Table or ablation results (experimental section) do not isolate the incremental contribution of the diff-based loss versus the Gaussian convolution and pinwheel mask; without such breakdowns it is difficult to confirm that the reported gains on the three datasets are attributable to the proposed mechanisms rather than dataset-specific tuning or baseline differences.

minor comments (2)

[Gaussian-shaped convolution] The precise formulation of the rotated pinwheel mask and its straight-through estimator integration would benefit from an explicit equation or algorithm box to aid implementation.
[method overview] Notation for the learnable scale parameter and signed area difference should be introduced with a single consistent symbol set rather than varying across text and figures.
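For readers unfamiliar with the straight-through estimator raised in the first minor comment, the sketch below shows the general technique on a single angle. The forward pass snaps a continuous orientation to a discrete set of angles, and the backward pass treats that snapping as the identity so a gradient still reaches the continuous parameter. The 45-degree step, the squared-error objective, and all names are assumptions chosen for illustration; this page does not give the paper's actual mask construction or loss.

```python
def snap(theta, step=45.0):
    """Forward pass: quantize an angle in degrees to the nearest multiple of step."""
    return step * round(theta / step)

def ste_grad(theta, target, step=45.0):
    """Backward pass under the straight-through assumption d snap / d theta = 1.

    The loss is (snap(theta) - target) ** 2. Its true gradient with respect to
    theta is zero almost everywhere, so the gradient with respect to the
    snapped value is passed through unchanged.
    """
    return 2.0 * (snap(theta, step) - target)

# Plain gradient descent on the continuous angle using the surrogate gradient.
theta, target, lr = 10.0, 90.0, 0.05
for _ in range(100):
    theta -= lr * ste_grad(theta, target)

print("continuous angle:", theta, "snapped angle:", snap(theta))
```

Without the straight-through substitution the update would be zero at every step and the orientation would never move from its initial value.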

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and the recommendation for major revision. We address each major comment point by point below, providing clarifications and committing to revisions where appropriate to strengthen the manuscript.


Referee: [diff-based scale loss and variant analysis] The central claim that signed-area-difference weighting yields strictly monotonic gradients for any mask configuration (abstract and loss-function section) is load-bearing for the stability and performance assertions, yet the geometric analysis of the four variants supplies no exhaustive enumeration, counter-example search, or discrete-grid verification; partial overlaps, boundary pixels, or non-convex predictions can produce non-monotonic loss values or gradient reversals even as IoU improves.

Authors: We appreciate the referee's careful scrutiny of the monotonicity claim, which is indeed central to our contribution. The geometric analysis in the loss-function section shows that the signed area difference produces monotonic gradients with respect to the scale parameter for the considered variants. To rigorously address potential issues in discrete settings and complex overlaps, we will augment the analysis with discrete-grid verifications, exhaustive checks on small mask configurations, and a search for counter-examples in the revised manuscript. This will either confirm the property or allow us to qualify the claim appropriately.
Revision: partial

Referee: [experimental results] Table or ablation results (experimental section) do not isolate the incremental contribution of the diff-based loss versus the Gaussian convolution and pinwheel mask; without such breakdowns it is difficult to confirm that the reported gains on the three datasets are attributable to the proposed mechanisms rather than dataset-specific tuning or baseline differences.

Authors: We agree that additional ablation studies would better isolate the contributions of each component. In the revised manuscript, we will include new tables and experiments that ablate the diff-based scale loss, the Gaussian-shaped convolution, and the rotated pinwheel mask individually across the three benchmarks. This will provide clear evidence of their incremental impacts on mIoU, Pd, and Fa.
Revision: yes

Circularity Check

0 steps flagged

No circularity: proposals are new modules validated empirically, not reductions to inputs by construction


The paper proposes a diff-based scale loss (weighted by signed area difference) and Gaussian-shaped convolution (with learnable scale and rotated pinwheel mask via straight-through estimator) as solutions to stated challenges. These are motivated by geometric analysis and physical imaging assumptions rather than derived from prior equations or self-citations that presuppose the results. Performance claims rest on experiments across three datasets (IRSTD-1k, NUDT-SIRST, SIRST-UAVB) showing gains in mIoU, Pd, and Fa, with no load-bearing step where a 'prediction' or uniqueness theorem reduces tautologically to fitted parameters or author prior work. The monotonic gradient assertion follows directly from the loss definition without circular redefinition.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 2 invented entities

The approach introduces two new algorithmic components whose justification rests on domain observations rather than derivation from first principles; the learnable scale is the only explicit free parameter.

free parameters (1)

learnable scale parameter
Width of the Gaussian kernel is optimized during training rather than fixed a priori.

axioms (1)

domain assumption: Infrared small targets exhibit a center-concentrated intensity profile that can be approximated by a Gaussian.
Invoked to motivate the choice of kernel shape.

invented entities (2)

diff-based scale loss (no independent evidence)
purpose: Provide strictly monotonic gradients by weighting according to signed area difference.
Newly defined loss function.

rotated pinwheel mask (no independent evidence)
purpose: Adaptively align the convolution kernel with target orientation.
New masking mechanism using straight-through estimator.


