Layer-Guided UAV Tracking: Enhancing Efficiency and Occlusion Robustness
Pith reviewed 2026-05-15 22:21 UTC · model grok-4.3
The pith
LGTrack combines dynamic layer selection with lightweight GGCA and SGLA modules to track UAV objects at 258.7 FPS while keeping 82.8 percent precision.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
LGTrack is a unified UAV tracking framework that integrates dynamic layer selection, the lightweight Global-Grouped Coordinate Attention (GGCA) module for global context with minimal overhead, and the Similarity-Guided Layer Adaptation (SGLA) module for robust representation learning. The combination reaches a state-of-the-art real-time speed of 258.7 FPS on the UAVDT dataset while keeping a competitive 82.8 percent precision; the paper evaluates the framework on three benchmark datasets.
What carries the argument
Dynamic layer selection, supported by the GGCA module for efficient global feature enhancement and the SGLA module for similarity-based adaptation. Per the abstract, SGLA takes the place of knowledge distillation, and the framework as a whole targets occlusion robustness.
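To make the layer-selection idea concrete, here is a minimal sketch of one possible similarity-guided rule: run layers in order and stop as soon as a layer's output is almost identical (by cosine similarity) to its input. This is an illustrative assumption, not LGTrack's actual mechanism; the function names, the early-exit policy, and the 0.99 threshold are all hypothetical, and the released code should be consulted for the real design.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def forward_with_early_exit(
    features: Vector,
    layers: Sequence[Layer],
    exit_threshold: float = 0.99,  # hypothetical value, not taken from the paper
) -> Tuple[Vector, List[int]]:
    """Run layers in order and stop once a layer barely changes its input.

    Returns the final features and the indices of the layers that were run.
    """
    executed: List[int] = []
    for index, layer in enumerate(layers):
        output = layer(features)
        executed.append(index)
        similarity = cosine_similarity(features, output)
        features = output
        if similarity >= exit_threshold:
            break  # remaining layers are treated as redundant for this input
    return features, executed


if __name__ == "__main__":
    toy_layers = [
        lambda v: [v[0] + 1.0, v[1]],  # changes the feature direction
        lambda v: list(v),             # identity: triggers the early exit
        lambda v: [-x for x in v],     # never reached in this toy run
    ]
    final_features, layers_run = forward_with_early_exit([0.0, 1.0], toy_layers)
    print(final_features, layers_run)
```

The point of the sketch is only that skipping layers trades a cheap similarity check for the cost of the skipped computation; whether LGTrack decides per frame, per sequence, or at training time is not stated on this page.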
If this is right
- Real-time tracking becomes feasible on low-power UAV platforms without sacrificing much accuracy.
- Occlusion handling improves through the similarity-guided adaptation that avoids full distillation overhead.
- The framework maintains competitive precision across multiple UAV tracking benchmarks.
- Inference speed reaches 258.7 FPS on UAVDT while using only the proposed lightweight modules.
Where Pith is reading between the lines
- The layer-selection idea could be tested on other real-time vision tasks such as drone-based surveillance or autonomous navigation.
- Replacing distillation with SGLA might simplify training pipelines for similar lightweight trackers.
- The approach may extend to video object tracking outside UAV settings if the same layer-guidance logic holds.
- Hardware-specific speed measurements would clarify whether the reported FPS transfers to different embedded processors.
Load-bearing premise
The GGCA and SGLA modules actually deliver the stated speed gains and occlusion handling without hidden accuracy costs that appear only in full tests.
What would settle it
Re-running the released code on UAVDT under the paper's occlusion test protocol and obtaining either under 200 FPS or under 70 percent precision on the same hardware.
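The check described above can be sketched as follows. This is a hedged outline rather than the paper's evaluation code: `precision_at`, `measure_fps`, and `claim_is_challenged` are hypothetical helper names, the 20-pixel center-error threshold is the common tracking convention and is assumed here rather than confirmed for this paper, and the 200 FPS and 70 percent cut-offs are the ones stated above.

```python
import math
import time
from typing import Callable, Iterable, Sequence, Tuple

Point = Tuple[float, float]


def precision_at(
    predicted: Sequence[Point],
    truth: Sequence[Point],
    threshold_px: float = 20.0,  # assumed convention; the paper's protocol may differ
) -> float:
    """Fraction of frames whose predicted center is within threshold_px of the truth."""
    if len(predicted) != len(truth) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(
        1
        for (px, py), (tx, ty) in zip(predicted, truth)
        if math.hypot(px - tx, py - ty) <= threshold_px
    )
    return hits / len(predicted)


def measure_fps(track_frame: Callable[[object], object], frames: Iterable[object]) -> float:
    """Average wall-clock frames per second of track_frame over the given frames."""
    count = 0
    start = time.perf_counter()
    for frame in frames:
        track_frame(frame)
        count += 1
    elapsed = time.perf_counter() - start
    if count == 0:
        raise ValueError("no frames were processed")
    return count / max(elapsed, 1e-12)  # guard against a zero-length interval


def claim_is_challenged(
    fps: float,
    precision: float,
    min_fps: float = 200.0,
    min_precision: float = 0.70,
) -> bool:
    """True if either measurement falls below the cut-offs named in the text."""
    return fps < min_fps or precision < min_precision
```

In practice `track_frame` would wrap the released tracker's per-frame update, and the timing should exclude data loading and include a warm-up pass so that the FPS figure is comparable to the reported one.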
Original abstract
Visual object tracking (VOT) plays a pivotal role in unmanned aerial vehicle (UAV) applications. Addressing the trade-off between accuracy and efficiency, especially under challenging conditions like unpredictable occlusion, remains a significant challenge. This paper introduces LGTrack, a unified UAV tracking framework that integrates dynamic layer selection, efficient feature enhancement, and robust representation learning for occlusions. By employing a novel lightweight Global-Grouped Coordinate Attention (GGCA) module, LGTrack captures long-range dependencies and global contexts, enhancing feature discriminability with minimal computational overhead. Additionally, a lightweight Similarity-Guided Layer Adaptation (SGLA) module replaces knowledge distillation, achieving an optimal balance between tracking precision and inference efficiency. Experiments on three datasets demonstrate LGTrack's state-of-the-art real-time speed (258.7 FPS on UAVDT) while maintaining competitive tracking accuracy (82.8% precision). Code is available at https://github.com/XiaoMoc/LGTrack
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces LGTrack, a unified UAV tracking framework combining dynamic layer selection, a lightweight Global-Grouped Coordinate Attention (GGCA) module to capture long-range dependencies with low overhead, and a Similarity-Guided Layer Adaptation (SGLA) module that replaces knowledge distillation for occlusion robustness. Experiments on three datasets report state-of-the-art real-time performance of 258.7 FPS and 82.8% precision on UAVDT, with code released at https://github.com/XiaoMoc/LGTrack.
Significance. If the reported speed-accuracy trade-off holds under the described conditions, the work is significant for real-time UAV applications where occlusion handling and efficiency are critical. The lightweight design of GGCA and SGLA, together with public code release, supports reproducibility and practical adoption; the internal consistency of the module descriptions and experimental coverage across datasets strengthens the contribution.
minor comments (3)
- Abstract: the performance claims would be easier to assess if the abstract briefly named the main baselines against which 258.7 FPS and 82.8% precision are compared.
- §3 (Method): the interaction between dynamic layer selection and the SGLA module could be illustrated with a single diagram or pseudocode line to clarify the forward pass.
- Table 1 or equivalent results section: report standard deviations or multiple runs for the FPS and precision numbers to quantify variability.
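The variability reporting requested in the last comment could look like the sketch below. It assumes only that FPS or precision has been measured over several independent runs; the helper names are hypothetical and the numbers in the usage lines are made-up placeholders, not measurements from the paper.

```python
import statistics
from typing import Sequence, Tuple


def summarize_runs(values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) over repeated runs."""
    if len(values) < 2:
        raise ValueError("need at least two runs to estimate variability")
    return statistics.mean(values), statistics.stdev(values)


def format_summary(name: str, values: Sequence[float], unit: str = "") -> str:
    """Render a 'mean ± std' line suitable for a results table."""
    mean, std = summarize_runs(values)
    return f"{name}: {mean:.1f} ± {std:.1f}{unit} (n={len(values)})"


if __name__ == "__main__":
    placeholder_fps = [100.0, 110.0, 90.0]  # illustrative values only
    print(format_summary("FPS", placeholder_fps, " FPS"))
```

The sample standard deviation (`statistics.stdev`) is used rather than the population form because a handful of runs is a sample of the tracker's behavior, not the whole of it.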
Simulated Author's Rebuttal
We thank the referee for the positive summary, significance assessment, and recommendation for minor revision. We appreciate the recognition of LGTrack's real-time performance, lightweight modules, and reproducibility via public code release.
Circularity Check
No significant circularity detected
The paper presents LGTrack as an engineering framework combining dynamic layer selection with two lightweight modules (GGCA and SGLA) whose designs are described directly in the text rather than derived from prior results. No equations, uniqueness theorems, fitted parameters renamed as predictions, or self-citation chains appear that would reduce any claimed performance gain to an input by construction. The reported FPS and precision figures are empirical measurements on standard benchmarks, not outputs of a closed-form derivation. The argument is therefore self-contained and externally falsifiable via the released code and datasets.