SCT-MOT: Enhancing Air-to-Air Multiple UAVs Tracking with Swarm-Coupled Motion and Trajectory Guidance

Defu Lin; Ren Jin; Shaoming He; Siqing Cheng; Tao Song; Zhaochen Chu

arxiv: 2604.06883 · v1 · submitted 2026-04-08 · 💻 cs.CV

Zhaochen Chu, Tao Song, Ren Jin, Shaoming He, Defu Lin, Siqing Cheng

Pith reviewed 2026-05-10 18:59 UTC · model grok-4.3

classification 💻 cs.CV

keywords UAV tracking · multiple object tracking · swarm motion · trajectory prediction · feature fusion · air-to-air tracking · motion modeling

The pith

SCT-MOT tracks multiple UAVs in swarms more accurately by modeling their coupled motions and guiding visual features with predicted trajectories.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper seeks to improve tracking of small UAVs flying in groups from the air, where individual motions are interdependent and objects are hard to see clearly. It does so by introducing a method that predicts future positions while considering how all drones in the swarm move together, rather than separately. It also fuses those predictions back into the current image features to keep track of identities over time. If successful, this would mean fewer broken paths and fewer mistaken identity changes in challenging swarm videos, which matters for applications like drone monitoring or coordination.

Core claim

The authors establish that a Swarm Motion-Aware Trajectory Prediction module, which processes the swarm's historical trajectories and posture-aware appearance features together, forecasts nonlinear group trajectories more accurately, and that integrating these forecasts via a Trajectory-Guided Spatio-Temporal Feature Fusion module with current frame features strengthens temporal consistency for weak objects, leading to overall better tracking performance.

What carries the argument

Swarm Motion-Aware Trajectory Prediction (SMTP) that jointly models historical trajectories and posture-aware appearance features from a swarm-level perspective to forecast coupled group motions.

Load-bearing premise

That treating the UAVs as a coupled swarm system rather than independent objects will produce better motion forecasts and tracking consistency.

What would settle it

An experiment showing that trajectory prediction accuracy does not improve when using swarm-level modeling compared to per-object modeling on the AIRMOT dataset would disprove the core benefit of the SMTP module.
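One way to run that check is to score both predictors with displacement errors. The sketch below assumes the common average and final displacement errors (ADE/FDE) as the accuracy measure; the abstract does not name the prediction metric the authors use, and the function name and toy trajectories are hypothetical.

```python
import math

def ade_fde(pred, gt):
    """Average and final displacement error for one swarm.

    pred, gt: lists of per-UAV trajectories, each a list of (x, y) points
    over the same prediction horizon.
    """
    all_dists = []
    final_dists = []
    for p_traj, g_traj in zip(pred, gt):
        dists = [math.dist(p, g) for p, g in zip(p_traj, g_traj)]
        all_dists.extend(dists)
        final_dists.append(dists[-1])
    ade = sum(all_dists) / len(all_dists)
    fde = sum(final_dists) / len(final_dists)
    return ade, fde

# Toy data (hypothetical): two UAVs, three future steps each.
gt = [[(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1), (2, 1)]]
per_object = [[(0, 0), (1, 1), (2, 2)], [(0, 1), (1, 2), (2, 3)]]
swarm_level = [[(0, 0), (1, 0), (2, 1)], [(0, 1), (1, 1), (2, 2)]]

ade_ind, fde_ind = ade_fde(per_object, gt)
ade_swarm, fde_swarm = ade_fde(swarm_level, gt)
```

If the swarm-level errors were not consistently lower than the per-object errors on held-out AIRMOT sequences, the premise behind SMTP would be in doubt.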

Figures

Figures reproduced from arXiv: 2604.06883 by Defu Lin, Ren Jin, Shaoming He, Siqing Cheng, Tao Song, Zhaochen Chu.

**Figure 2.** The overall architecture of the SCT-MOT framework. The swarm motion aware trajectory prediction, trajectory-guided spatio-temporal…

**Figure 3.** The overall architecture of the SMTP module. This module includes: a temporal-posture attention mechanism, a global-local…

**Figure 4.** The architecture of the TG-STFF module. $F^{m}_{t'} \in \mathbb{R}^{D_l \times C_l}$ represents the expanded feature map of the current frame, and $F^{m}_{t_{pred}} \in \mathbb{R}^{D_l \times C_l}$ denotes the Gaussian-distributed predicted feature map. We use a cross-attention module to fuse them to generate the final spatio-temporal feature map. collected as: $M_t = \{F^{1}_{t_{fuse}}, \cdots, F^{m}_{t_{fuse}}\}$ (16) Since the fused features preserve the same spatial and channel di…

**Figure 5.** Trajectory prediction comparison between different modules

**Figure 6.** Visualization of feature fusion in the TG-STFF module. From left to right: ground-truth frame, raw feature map, fused feature map

**Figure 7.** Performance comparison of SCT-MOT with existing MOT methods on AIRMOT and UAVSwarm datasets, different objects are…
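The Figure 4 caption describes a Gaussian-distributed predicted feature map fused with the current-frame features through cross-attention. A minimal sketch of that idea follows, assuming NumPy, a single attention head with no learned projections, a residual sum as the fusion step, and a hypothetical 8×8×16 feature map. It illustrates the general technique and is not the authors' TG-STFF implementation.

```python
import numpy as np

def gaussian_map(h, w, center, sigma):
    """2D Gaussian weights peaked at the predicted (row, col) location."""
    rows, cols = np.mgrid[0:h, 0:w]
    d2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def cross_attention(query, key_value):
    """Single-head scaled dot-product cross-attention, no learned weights."""
    scores = query @ key_value.T / np.sqrt(query.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ key_value

rng = np.random.default_rng(0)
h, w, c = 8, 8, 16                       # hypothetical feature-map size
current = rng.normal(size=(h, w, c))     # current-frame features
history = rng.normal(size=(h, w, c))     # historical visual cue (assumed input)

# Weight the historical features by a Gaussian centered at the predicted position.
prior = gaussian_map(h, w, center=(3, 5), sigma=1.5)
predicted = history * prior[..., None]

# Flatten to D_l x C_l tokens (D_l = h * w); current-frame tokens are the queries.
q = current.reshape(h * w, c)
kv = predicted.reshape(h * w, c)
fused = (q + cross_attention(q, kv)).reshape(h, w, c)
```

The fused map keeps the spatial and channel dimensions of the input, which is consistent with the caption's remark that the fused features preserve them.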

Original abstract

Air-to-air tracking of swarm UAVs presents significant challenges due to the complex nonlinear group motion and weak visual cues for small objects, which often cause detection failures, trajectory fragmentation, and identity switches. Although existing methods have attempted to improve performance by incorporating trajectory prediction, they model each object independently, neglecting the swarm-level motion dependencies. Their limited integration between motion prediction and appearance representation also weakens the spatio-temporal consistency required for tracking in visually ambiguous and cluttered environments, making it difficult to maintain coherent trajectories and reliable associations. To address these challenges, we propose SCT-MOT, a tracking framework that integrates Swarm-Coupled motion modeling and Trajectory-guided feature fusion. First, we develop a Swarm Motion-Aware Trajectory Prediction (SMTP) module that jointly models historical trajectories and posture-aware appearance features from a swarm-level perspective, enabling more accurate forecasting of the nonlinear, coupled group trajectories. Second, we design a Trajectory-Guided Spatio-Temporal Feature Fusion (TG-STFF) module that aligns predicted positions with historical visual cues and deeply integrates them with current frame features, enhancing temporal consistency and spatial discriminability for weak objects. Extensive experiments on three public air-to-air swarm UAV tracking datasets, including AIRMOT, MOT-FLY, and UAVSwarm, demonstrate that SMTP achieves more accurate trajectory forecasts and yields a 1.21% IDF1 improvement over the state-of-the-art trajectory prediction module EqMotion when integrated into the same MOT framework. Overall, our SCT-MOT consistently achieves superior accuracy and robustness compared to state-of-the-art trackers across multiple metrics under complex swarm scenarios.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SCT-MOT adds swarm-level motion coupling to UAV tracking and shows small consistent gains backed by ablations and direct baselines.

This paper's core move is to stop treating each UAV as an independent track and instead model the group's coupled nonlinear motion explicitly. The SMTP module pulls historical trajectories and posture-aware appearance features together at the swarm level to forecast where the whole formation is heading. TG-STFF then aligns those predictions with past visual cues and merges them into current-frame features so weak or small detections stay connected over time. That combination is presented as the new piece relative to prior independent-object predictors like EqMotion.

The experiments back it up with ablations that isolate each module, a head-to-head swap of SMTP into an otherwise fixed tracker that yields the reported 1.21% IDF1 lift, and results across AIRMOT, MOT-FLY, and UAVSwarm that hold on multiple metrics. The gains are steady rather than dramatic, which matches the narrow problem of air-to-air swarms where group dynamics and tiny visual signals are the main pain points.

The soft spots are proportionate. The absolute improvement is modest, so real-world value hinges on whether those extra points matter in deployment. The approach assumes swarm coupling is strong and predictable enough to exploit and that posture features add usable signal; those assumptions test out on the chosen datasets but would need checking in more varied lighting or formation changes. No load-bearing gaps in baselines or internal contradictions appear.

This is useful reading for anyone working on aerial multi-object tracking or swarm applications. A broader MOT researcher might borrow the coupling idea but probably won't port the full pipeline. It deserves a serious referee because the claims are scoped tightly, the comparisons are direct, and the evidence is structured enough to let others verify or extend it. I would send it out for review.

Referee Report

0 major / 3 minor

Summary. The paper proposes SCT-MOT, a tracking framework for air-to-air multiple UAV swarms that integrates two new modules: Swarm Motion-Aware Trajectory Prediction (SMTP), which jointly models historical trajectories and posture-aware appearance features from a swarm-level perspective to forecast nonlinear coupled group motions, and Trajectory-Guided Spatio-Temporal Feature Fusion (TG-STFF), which aligns predicted positions with historical visual cues to improve temporal consistency and spatial discriminability for weak objects. Experiments on AIRMOT, MOT-FLY, and UAVSwarm datasets report consistent superiority over state-of-the-art trackers across multiple metrics, with SMTP yielding a 1.21% IDF1 improvement over EqMotion when substituted into the same MOT pipeline.
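For readers unfamiliar with the headline metric: IDF1 is the F1 score over identity-consistent matches, computed as 2·IDTP / (2·IDTP + IDFP + IDFN). The sketch below uses hypothetical counts, not the paper's numbers, to show how a gain in IDF1 points is obtained. The summary does not say whether the reported 1.21% is an absolute or a relative difference; the sketch computes the absolute difference in points.

```python
def idf1(idtp, idfp, idfn):
    """IDF1 = 2*IDTP / (2*IDTP + IDFP + IDFN)."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0

# Hypothetical identity-level counts for two trackers on the same sequence.
baseline = idf1(idtp=800, idfp=150, idfn=200)
improved = idf1(idtp=812, idfp=140, idfn=188)

# Absolute gain in IDF1 percentage points.
gain_points = 100 * (improved - baseline)
```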

Significance. If the reported gains are reproducible, the work offers a meaningful advance in multi-object tracking for UAV swarms by explicitly incorporating swarm-level motion coupling and tighter motion-appearance integration, addressing key failure modes (trajectory fragmentation, identity switches) in visually ambiguous aerial scenarios. The provision of module ablations and a controlled replacement of EqMotion strengthens the evidential basis for the central claims.

minor comments (3)

[Abstract] The claim of 'superior accuracy and robustness ... across multiple metrics' would be strengthened by naming the specific metrics (e.g., MOTA, IDF1, HOTA) and the magnitude of gains on each dataset rather than relying on the single 1.21% IDF1 figure.
[§3.2] The description of TG-STFF states that it 'aligns predicted positions with historical visual cues and deeply integrates them'; a short schematic or pseudocode in §3.2 would clarify the exact alignment operation and fusion depth.
[Experimental results tables] Tables reporting results on AIRMOT, MOT-FLY, and UAVSwarm should include standard deviations or confidence intervals for the key metrics to allow assessment of whether the observed deltas exceed experimental variability.
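The third comment could be addressed in several ways. One common option, sketched here, is a percentile bootstrap over per-sequence metric differences. The per-sequence deltas below are invented for illustration, and the function name and defaults are assumptions rather than anything taken from the paper.

```python
import random

def bootstrap_ci(deltas, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of paired deltas."""
    rng = random.Random(seed)
    n = len(deltas)
    means = []
    for _ in range(n_resamples):
        # Resample sequences with replacement and record the mean delta.
        sample = [deltas[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Hypothetical per-sequence IDF1 differences (proposed minus baseline), in points.
deltas = [1.4, 0.9, 1.6, 0.7, 1.3, 1.1, 1.8, 0.5]
low, high = bootstrap_ci(deltas)
mean_delta = sum(deltas) / len(deltas)
```

An interval that excludes zero would suggest the delta exceeds sequence-to-sequence variability; with only a handful of sequences the interval should be read cautiously.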

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our work and the recommendation for minor revision. The recognition that SCT-MOT addresses key challenges in air-to-air swarm UAV tracking through swarm-level motion coupling and trajectory-guided feature fusion is appreciated, as is the note on the evidential support from ablations and controlled comparisons.

Circularity Check

0 steps flagged

No significant circularity identified

The paper proposes two new modules (SMTP for swarm-coupled trajectory prediction and TG-STFF for trajectory-guided feature fusion) integrated into an MOT framework. The central claims rest on empirical results from ablations and comparisons against baselines like EqMotion on three datasets, with reported gains such as 1.21% IDF1 improvement. No equations, derivations, or self-referential definitions are present that reduce the claimed performance improvements to fitted parameters or prior self-citations by construction. The method is described as an integration of novel components without load-bearing steps that collapse to inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract alone supplies no explicit free parameters, axioms, or invented entities; the central claim rests on the unstated premise that swarm-level coupling exists and can be exploited via the described modules.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: SMTP module jointly models historical trajectories and posture-aware appearance features from a swarm-level perspective... temporal-posture attention... global-local spatial-posture attention... temporal residual module with dilated causal convolutions

IndisputableMonolith/Foundation/DimensionForcing.lean · alexander_duality_circle_linking
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: TG-STFF... multi-head cross-attention... Gaussian kernel centered at predicted locations... fuses predictive feature maps with current frame features
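The SMTP passage quoted above mentions a temporal residual module with dilated causal convolutions. As a rough illustration of that generic building block (not the authors' layer), the sketch below applies a dilated causal 1D convolution with a residual connection to a hypothetical coordinate history. The kernel weights and dilation are arbitrary, and real blocks would add learned weights and nonlinearities.

```python
def dilated_causal_conv1d(x, kernel, dilation):
    """y[t] = sum_k kernel[k] * x[t - k * dilation], with x[<0] treated as zero."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # causal: only the current and past samples contribute
                acc += w * x[idx]
        out.append(acc)
    return out

def residual_block(x, kernel, dilation):
    """Residual connection around one dilated causal convolution."""
    y = dilated_causal_conv1d(x, kernel, dilation)
    return [xi + yi for xi, yi in zip(x, y)]

# Hypothetical 1D coordinate history of one UAV.
history = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
conv = dilated_causal_conv1d(history, kernel=[0.5, 0.5], dilation=2)
res = residual_block(history, kernel=[0.5, 0.5], dilation=2)
```

Because each output depends only on the current and earlier samples, changing a later observation leaves earlier outputs untouched, which is the property that makes such layers usable for online forecasting.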

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages

[1] Y. Liang, Q. Dong, and Y. Zhao, “Adaptive leader–follower formation control for swarms of unmanned aerial vehicles with motion constraints and unknown disturbances,” Chin. J. Aeronaut., vol. 33, no. 11, pp. 2972–2988, 2020.

[2] S. Javed, A. Hassan, R. Ahmad, W. Ahmed, R. Ahmed, A. Saadat, and M. Guizani, “State-of-the-art and future research challenges in uav swarms,” IEEE Internet of Things Journal, vol. 11, no. 11, pp. 19 023–19 045, 2024.

[3] G. Raja, S. Anbalagan, A. Ganapathisubramaniyan, M. S. Selvakumar, A. K. Bashir, and S. Mumtaz, “Efficient and secured swarm pattern multi-uav communication,” IEEE Transactions on Vehicular Technology, vol. 70, no. 7, pp. 7050–7058, 2021.

[4] L. Wen, Z. Zhen, C. Tao, and J. Ding, “Distributed cooperative strategy of uav swarm without speed measurement under saturation attack mission,” IEEE Transactions on Aerospace and Electronic Systems, vol. 60, no. 4, pp. 4518–4529, 2024.

[5] L. Hong, H. Guo, J. Liu, and Y. Zhang, “Toward swarm coordination: Topology-aware inter-uav routing optimization,” IEEE Transactions on Vehicular Technology, vol. 69, no. 9, pp. 10 177–10 187, 2020.

[6] J. Zhao, J. Zhang, D. Li, and D. Wang, “Vision-based anti-uav detection and tracking,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 12, pp. 25 323–25 334, 2022.

[7] Z. Chu, T. Song, R. Jin, D. Lin, H. Shen, and M. Lyu, “Vision-based swarm tracking of multiple uavs in air-to-air scenarios,” Chinese Journal of Aeronautics, p. 103558, 2025.

[8] W.-H. Luo, J.-L. Xing, A. Milan, X.-Q. Zhang, W. Liu, and T.-K. Kim, “Multiple object tracking: A literature review,” Artif. Intell., vol. 293, p. 103448, 2021.

[9] N. Wojke, A. Bewley, and D. Paulus, “Simple online and realtime tracking with a deep association metric,” in ICIP 2017: Proceedings of the IEEE international conference on image processing. IEEE, 2017, pp. 3645–3649.

[10] Y.-H. Du, J.-F. Wan, Y.-Y. Zhao, B.-Y. Zhang, Z.-H. Tong, and J.-H. Dong, “Giaotracker: A comprehensive framework for mcmot with global information and optimizing strategies in visdrone 2021,” in ICCVW 2021: Proceedings of the IEEE/CVF international conference on computer vision workshops, 2021, pp. 2809–2819.

[11] Y.-H. Du, Z.-C. Zhao, Y. Song, Y.-Y. Zhao, F. Su, T. Gong, and H.-Y. Meng, “Strongsort: Make deepsort great again,” IEEE Trans. Multimedia, vol. 25, pp. 8725–8737, 2023.

[12] J.-L. Peng, C.-A. Wang, F.-B. Wan, Y. Wu, Y.-B. Wang, Y. Tai, C.-J. Wang, J.-L. Li, F.-Y. Huang, and Y.-W. Fu, “Chained-tracker: Chaining paired attentive regression results for end-to-end joint multiple-object detection and tracking,” in ECCV 2020: Proceedings of the European conference on computer vision. Springer, 2020, pp. 145–161.

[13] T. Fischer, T. E. Huang, J.-M. Pang, L.-L. Qiu, H.-F. Chen, T. Darrell, and F. Yu, “Qdtrack: Quasi-dense similarity learning for appearance-only multiple object tracking,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 12, pp. 15 380–15 393, 2023.

[14] C. Zhang, S. Zheng, H. Wu, Z. Gu, W. Sun, and L. Yang, “Attentiontrack: Multiple object tracking in traffic scenarios using features attention,” IEEE Transactions on Intelligent Transportation Systems, vol. 25, no. 2, pp. 1661–1674, 2024.

[15] Z. Kaleem, “Lightweight and computationally efficient yolo for rogue uav detection in complex backgrounds,” IEEE Transactions on Aerospace and Electronic Systems, vol. 61, no. 2, pp. 5362–5366, 2025.

[16] B. Huang, J.-A. Li, J.-J. Chen, G. Wang, J. Zhao, and T.-F. Xu, “Anti-uav410: A thermal infrared benchmark and customized scheme for tracking drones in the wild,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 46, no. 5, pp. 2852–2865, 2024.

[17] C. Wang, Y. Su, J. Wang, T. Wang, and Q. Gao, “Uavswarm dataset: An unmanned aerial vehicle swarm dataset for multiple object tracking,” Remote Sensing, vol. 14, no. 11, 2022.

[18] Z. Shen, K. Cai, P. Zhao, and X. Luo, “An interactively motion-assisted network for multiple object tracking in complex traffic scenes,” IEEE Transactions on Intelligent Transportation Systems, vol. 25, no. 2, pp. 1992–2004, 2024.

[19] J. Lin, G. Liang, and R. Zhang, “Lttrack: Rethinking the tracking framework for long-term multi-object tracking,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 10, pp. 9866–9881, 2024.

[20] W. Lv, N. Zhang, J. Zhang, and D. Zeng, “One-shot multiple object tracking with robust id preservation,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 6, pp. 4473–4488, 2024.

[21] L. Liu, X. Song, H. Song, S. Sun, X.-F. Han, N. Akhtar, and A. Mian, “Yolo-3dmm for simultaneous multiple object detection and tracking in traffic scenarios,” IEEE Transactions on Intelligent Transportation Systems, vol. 25, no. 8, pp. 9467–9481, 2024.

[22] Y.-F. Zhang, P.-Z. Sun, Y. Jiang, D.-D. Yu, F.-C. Weng, Z.-H. Yuan, P. Luo, W.-Y. Liu, and X.-G. Wang, “Bytetrack: Multi-object tracking by associating every detection box,” in ECCV 2022: Proceedings of the European conference on computer vision. Springer, 2022, pp. 1–21.

[23] J.-K. Cao, J.-M. Pang, X.-S. Weng, R. Khirodkar, and K. Kitani, “Observation-centric sort: Rethinking sort for robust multi-object tracking,” in CVPR 2023: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 9686–9696.

[24] M.-Z. Yang, G.-X. Han, B. Yan, W.-H. Zhang, J.-Q. Qi, H.-C. Lu, and D. Wang, “Hybrid-sort: Weak cues matter for online multi-object tracking,” in AAAI 2024: Proceedings of the AAAI conference on artificial intelligence, vol. 38, no. 7, 2024, pp. 6504–6512.

[25] F. Yang, S. Odashima, S. Masui, and S. Jiang, “Hard to track objects with irregular motions and similar appearances? make it easier by buffering the matching space,” in WACV 2023: Proceedings of the IEEE/CVF winter conference on applications of computer vision, 2023, pp. 4799–4808.

[26] A. Alahi, K. Goel, V. Ramanathan, A. Robicquet, L. Fei-Fei, and S. Savarese, “Social lstm: Human trajectory prediction in crowded spaces,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 961–971.

[27] K. Saleh, M. Hossny, and S. Nahavandi, “Contextual recurrent predictive model for long-term intent prediction of vulnerable road users,” IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 8, pp. 3398–3408, 2020.

[28] J. Morton, T. A. Wheeler, and M. J. Kochenderfer, “Analysis of recurrent neural networks for probabilistic modeling of driver behavior,” IEEE Transactions on Intelligent Transportation Systems, vol. 18, no. 5, pp. 1289–1298, 2017.

[29] A. Vemula, K. Muelling, and J. Oh, “Social attention: Modeling attention in human crowds,” in 2018 IEEE International Conference on Robotics and Automation (ICRA), 2018, pp. 4601–4607.

[30] Y. Su, J. Du, Y. Li, X. Li, R. Liang, Z. Hua, and J. Zhou, “Trajectory forecasting based on prior-aware directed graph convolutional neural network,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 9, pp. 16 773–16 785, 2022.

[31] B. Tang, Y. Zhong, U. Neumann, G. Wang, S. Chen, and Y. Zhang, “Collaborative uncertainty in multi-agent trajectory forecasting,” in Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, Eds., vol. 34. Curran Associates, Inc., 2021, pp. 6328–6340.

[32] C. Yang and Z. Pei, “Long-short term spatio-temporal aggregation for trajectory prediction,” IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 4, pp. 4114–4126, 2023.

[33] F. Marchetti, F. Becattini, L. Seidenari, and A. Del Bimbo, “Mantra: Memory augmented networks for multiple trajectory prediction,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 7141–7150.

[34] J. Li, F. Yang, M. Tomizuka, and C. Choi, “Evolvegraph: Multi-agent trajectory prediction with dynamic relational reasoning,” in Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, Eds., vol. 33. Curran Associates, Inc., 2020, pp. 19 783–19 794.

[35] C. Xu, R. T. Tan, Y. Tan, S. Chen, Y. G. Wang, X. Wang, and Y. Wang, “Eqmotion: Equivariant multi-agent motion prediction with invariant interaction reasoning,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 1410–1420.

[36] X. Liu, T. Xu, Y. Wang, Z. Yu, X. Yuan, H. Qin, and J. Li, “Bactrack: Building appearance collection for aerial tracking,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 6, pp. 5002–5017, 2024.

[37] S. Liu, X. Li, H.-C. Lu, and Y. He, “Multi-object tracking meets moving uav,” in CVPR 2022: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 8876–8885.

[38] C.-Y. Yang, H.-W. Huang, Z. Jiang, H.-C. Kuo, J. Mei, C.-I. Huang, and J.-N. Hwang, “Sea you later: Metadata-guided long-term re-identification for uav-based multi-object tracking,” in 2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), 2024, pp. 805–812.

[39] K.-F. Yi, K. Luo, X.-L. Luo, J.-G. Huang, H. Wu, R.-D. Hu, and W. Hao, “Ucmctrack: Multi-object tracking with uniform camera motion compensation,” in AAAI 2024: Proceedings of the AAAI conference on artificial intelligence, vol. 38, no. 7, 2024, pp. 6702–6710.

[40] H.-W. Huang, C.-Y. Yang, J. Sun, P.-K. Kim, K.-J. Kim, K. Lee, C.-I. Huang, and J.-N. Hwang, “Iterative scale-up expansioniou and deep features association for multi-object tracking in sports,” in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024, pp. 163–172.

[41] S. Cheng, M. Yao, and X. Xiao, “Dc-mot: Motion deblurring and compensation for multi-object tracking in uav videos,” in 2023 IEEE International Conference on Robotics and Automation (ICRA), 2023, pp. 789–795.

[42] Z. Chu, T. Song, R. Jin, and T. Jiang, “An experimental evaluation based on new air-to-air multi-uav tracking dataset,” in ICUS 2023: Proceedings of the IEEE international conference on unmanned systems. IEEE, 2023, pp. 671–676.

[43] C. Wang, Y. Su, L. Wang, T. Wang, J. Wang, and Q. Gao, “Multi-object continuous robust tracking algorithm for anti-uav swarm,” Acta Aeronaut. Astronaut. Sin., vol. 45, no. 7, pp. 256–269 [Chinese], 2024.

[44] Y. Zhang, C. Wang, X. Wang, W. Zeng, and W. Liu, “Fairmot: On the fairness of detection and re-identification in multiple object tracking,” Int. J. Comput. Vis., vol. 129, pp. 3069–3087, 2021.

[45] Z. Chu, T. Song, R. Jin, and D. Lin, “Vision-based air-to-air multi-uavs tracking,” Acta Aeronaut. Astronaut. Sin., vol. 45, no. 14, p. 629379 [Chinese], 2024.

[46] E. Yu, T. Wang, Z. Li, Y. Zhang, X. Zhang, and W. Tao, “Motrv3: Release-fetch supervision for end-to-end multi-object tracking,” ArXiv, vol. abs/2305.14298, 2023.

Zhaochen Chu received the B.E. degree in Science in Flight Vehicle Design and Engineering from Beijing Institute of Technology, Beijing, China, in 2021. He is currently pursuing the Ph.D. de...

[1] [1]

Adaptive leader–follower formation control for swarms of unmanned aerial vehicles with motion constraints and unknown disturbances,

Y . Liang, Q. Dong, and Y . Zhao, “Adaptive leader–follower formation control for swarms of unmanned aerial vehicles with motion constraints and unknown disturbances,”Chin. J. Aeronaut., vol. 33, no. 11, pp. 2972–2988, 2020

work page 2020

[2] [2]

State-of-the-art and future research challenges in uav swarms,

S. Javed, A. Hassan, R. Ahmad, W. Ahmed, R. Ahmed, A. Saadat, and M. Guizani, “State-of-the-art and future research challenges in uav swarms,”IEEE Internet of Things Journal, vol. 11, no. 11, pp. 19 023–19 045, 2024

work page 2024

[3] [3]

Efficient and secured swarm pattern multi-uav communication,

G. Raja, S. Anbalagan, A. Ganapathisubramaniyan, M. S. Sel- vakumar, A. K. Bashir, and S. Mumtaz, “Efficient and secured swarm pattern multi-uav communication,”IEEE Transactions on Vehicular Technology, vol. 70, no. 7, pp. 7050–7058, 2021

work page 2021

[4] [4]

Distributed cooperative strategy of uav swarm without speed measurement under saturation attack mission,

L. Wen, Z. Zhen, C. Tao, and J. Ding, “Distributed cooperative strategy of uav swarm without speed measurement under saturation attack mission,”IEEE Transactions on Aerospace and Electronic Systems, vol. 60, no. 4, pp. 4518–4529, 2024

work page 2024

[5] [5]

Toward swarm coor- dination: Topology-aware inter-uav routing optimization,

L. Hong, H. Guo, J. Liu, and Y . Zhang, “Toward swarm coor- dination: Topology-aware inter-uav routing optimization,”IEEE Transactions on Vehicular Technology, vol. 69, no. 9, pp. 10 177– 10 187, 2020

work page 2020

[6] [6]

Vision-based anti- uav detection and tracking,

J. Zhao, J. Zhang, D. Li, and D. Wang, “Vision-based anti- uav detection and tracking,”IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 12, pp. 25 323–25 334, 2022

work page 2022

[7] [7]

Vision- based swarm tracking of multiple uavs in air-to-air scenarios,

Z. Chu, T. Song, R. Jin, D. Lin, H. Shen, and M. Lyu, “Vision- based swarm tracking of multiple uavs in air-to-air scenarios,” Chinese Journal of Aeronautics, p. 103558, 2025

work page 2025

[8] [8]

Multiple object tracking: A literature review,

W.-H. Luo, J.-L. Xing, A. Milan, X.-Q. Zhang, W. Liu, and T.-K. Kim, “Multiple object tracking: A literature review,”Artif. Intell., vol. 293, p. 103448, 2021

work page 2021

[9] N. Wojke, A. Bewley, and D. Paulus, “Simple online and realtime tracking with a deep association metric,” in ICIP 2017: Proceedings of the IEEE international conference on image processing. IEEE, 2017, pp. 3645–3649.

[10] Y.-H. Du, J.-F. Wan, Y.-Y. Zhao, B.-Y. Zhang, Z.-H. Tong, and J.-H. Dong, “Giaotracker: A comprehensive framework for mcmot with global information and optimizing strategies in visdrone 2021,” in ICCVW 2021: Proceedings of the IEEE/CVF international conference on computer vision workshops, 2021, pp. 2809–2819.

[11] Y.-H. Du, Z.-C. Zhao, Y. Song, Y.-Y. Zhao, F. Su, T. Gong, and H.-Y. Meng, “Strongsort: Make deepsort great again,” IEEE Trans. Multimedia, vol. 25, pp. 8725–8737, 2023.

[12] J.-L. Peng, C.-A. Wang, F.-B. Wan, Y. Wu, Y.-B. Wang, Y. Tai, C.-J. Wang, J.-L. Li, F.-Y. Huang, and Y.-W. Fu, “Chained-tracker: Chaining paired attentive regression results for end-to-end joint multiple-object detection and tracking,” in ECCV 2020: Proceedings of the European conference on computer vision. Springer, 2020, pp. 145–161.

[13] T. Fischer, T. E. Huang, J.-M. Pang, L.-L. Qiu, H.-F. Chen, T. Darrell, and F. Yu, “Qdtrack: Quasi-dense similarity learning for appearance-only multiple object tracking,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 12, pp. 15 380–15 393, 2023.

[14] C. Zhang, S. Zheng, H. Wu, Z. Gu, W. Sun, and L. Yang, “Attentiontrack: Multiple object tracking in traffic scenarios using features attention,” IEEE Transactions on Intelligent Transportation Systems, vol. 25, no. 2, pp. 1661–1674, 2024.

[15] Z. Kaleem, “Lightweight and computationally efficient yolo for rogue uav detection in complex backgrounds,” IEEE Transactions on Aerospace and Electronic Systems, vol. 61, no. 2, pp. 5362–5366, 2025.

[16] B. Huang, J.-A. Li, J.-J. Chen, G. Wang, J. Zhao, and T.-F. Xu, “Anti-uav410: A thermal infrared benchmark and customized scheme for tracking drones in the wild,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 46, no. 5, pp. 2852–2865, 2024.

[17] C. Wang, Y. Su, J. Wang, T. Wang, and Q. Gao, “Uavswarm dataset: An unmanned aerial vehicle swarm dataset for multiple object tracking,” Remote Sensing, vol. 14, no. 11, 2022.

[18] Z. Shen, K. Cai, P. Zhao, and X. Luo, “An interactively motion-assisted network for multiple object tracking in complex traffic scenes,” IEEE Transactions on Intelligent Transportation Systems, vol. 25, no. 2, pp. 1992–2004, 2024.

[19] J. Lin, G. Liang, and R. Zhang, “Lttrack: Rethinking the tracking framework for long-term multi-object tracking,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 10, pp. 9866–9881, 2024.

[20] W. Lv, N. Zhang, J. Zhang, and D. Zeng, “One-shot multiple object tracking with robust id preservation,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 6, pp. 4473–4488, 2024.

[21] L. Liu, X. Song, H. Song, S. Sun, X.-F. Han, N. Akhtar, and A. Mian, “Yolo-3dmm for simultaneous multiple object detection and tracking in traffic scenarios,” IEEE Transactions on Intelligent Transportation Systems, vol. 25, no. 8, pp. 9467–9481, 2024.

[22] Y.-F. Zhang, P.-Z. Sun, Y. Jiang, D.-D. Yu, F.-C. Weng, Z.-H. Yuan, P. Luo, W.-Y. Liu, and X.-G. Wang, “Bytetrack: Multi-object tracking by associating every detection box,” in ECCV 2022: Proceedings of the European conference on computer vision. Springer, 2022, pp. 1–21.

[23] J.-K. Cao, J.-M. Pang, X.-S. Weng, R. Khirodkar, and K. Kitani, “Observation-centric sort: Rethinking sort for robust multi-object tracking,” in CVPR 2023: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 9686–9696.

[24] M.-Z. Yang, G.-X. Han, B. Yan, W.-H. Zhang, J.-Q. Qi, H.-C. Lu, and D. Wang, “Hybrid-sort: Weak cues matter for online multi-object tracking,” in AAAI 2024: Proceedings of the AAAI conference on artificial intelligence, vol. 38, no. 7, 2024, pp. 6504–6512.

[25] F. Yang, S. Odashima, S. Masui, and S. Jiang, “Hard to track objects with irregular motions and similar appearances? make it easier by buffering the matching space,” in WACV 2023: Proceedings of the IEEE/CVF winter conference on applications of computer vision, 2023, pp. 4799–4808.

[26] A. Alahi, K. Goel, V. Ramanathan, A. Robicquet, L. Fei-Fei, and S. Savarese, “Social lstm: Human trajectory prediction in crowded spaces,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 961–971.

[27] K. Saleh, M. Hossny, and S. Nahavandi, “Contextual recurrent predictive model for long-term intent prediction of vulnerable road users,” IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 8, pp. 3398–3408, 2020.

[28] J. Morton, T. A. Wheeler, and M. J. Kochenderfer, “Analysis of recurrent neural networks for probabilistic modeling of driver behavior,” IEEE Transactions on Intelligent Transportation Systems, vol. 18, no. 5, pp. 1289–1298, 2017.

[29] A. Vemula, K. Muelling, and J. Oh, “Social attention: Modeling attention in human crowds,” in 2018 IEEE International Conference on Robotics and Automation (ICRA), 2018, pp. 4601–4607.

[30] Y. Su, J. Du, Y. Li, X. Li, R. Liang, Z. Hua, and J. Zhou, “Trajectory forecasting based on prior-aware directed graph convolutional neural network,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 9, pp. 16 773–16 785, 2022.

[31] B. Tang, Y. Zhong, U. Neumann, G. Wang, S. Chen, and Y. Zhang, “Collaborative uncertainty in multi-agent trajectory forecasting,” in Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, Eds., vol. 34. Curran Associates, Inc., 2021, pp. 6328–6340.

[32] C. Yang and Z. Pei, “Long-short term spatio-temporal aggregation for trajectory prediction,” IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 4, pp. 4114–4126, 2023.

[33] F. Marchetti, F. Becattini, L. Seidenari, and A. Del Bimbo, “Mantra: Memory augmented networks for multiple trajectory prediction,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 7141–7150.

[34] J. Li, F. Yang, M. Tomizuka, and C. Choi, “Evolvegraph: Multi-agent trajectory prediction with dynamic relational reasoning,” in Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, Eds., vol. 33. Curran Associates, Inc., 2020, pp. 19 783–19 794.

[35] C. Xu, R. T. Tan, Y. Tan, S. Chen, Y. G. Wang, X. Wang, and Y. Wang, “Eqmotion: Equivariant multi-agent motion prediction with invariant interaction reasoning,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 1410–1420.

[36] X. Liu, T. Xu, Y. Wang, Z. Yu, X. Yuan, H. Qin, and J. Li, “Bactrack: Building appearance collection for aerial tracking,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 6, pp. 5002–5017, 2024.

[37] S. Liu, X. Li, H.-C. Lu, and Y. He, “Multi-object tracking meets moving uav,” in CVPR 2022: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 8876–8885.

[38] C.-Y. Yang, H.-W. Huang, Z. Jiang, H.-C. Kuo, J. Mei, C.-I. Huang, and J.-N. Hwang, “Sea you later: Metadata-guided long-term re-identification for uav-based multi-object tracking,” in 2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), 2024, pp. 805–812.

[39] K.-F. Yi, K. Luo, X.-L. Luo, J.-G. Huang, H. Wu, R.-D. Hu, and W. Hao, “Ucmctrack: Multi-object tracking with uniform camera motion compensation,” in AAAI 2024: Proceedings of the AAAI conference on artificial intelligence, vol. 38, no. 7, 2024, pp. 6702–6710.

[40] H.-W. Huang, C.-Y. Yang, J. Sun, P.-K. Kim, K.-J. Kim, K. Lee, C.-I. Huang, and J.-N. Hwang, “Iterative scale-up expansioniou and deep features association for multi-object tracking in sports,” in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024, pp. 163–172.

[41] S. Cheng, M. Yao, and X. Xiao, “Dc-mot: Motion deblurring and compensation for multi-object tracking in uav videos,” in 2023 IEEE International Conference on Robotics and Automation (ICRA), 2023, pp. 789–795.

[42] Z. Chu, T. Song, R. Jin, and T. Jiang, “An experimental evaluation based on new air-to-air multi-uav tracking dataset,” in ICUS 2023: Proceedings of the IEEE international conference on unmanned systems. IEEE, 2023, pp. 671–676.

[43] C. Wang, Y. Su, L. Wang, T. Wang, J. Wang, and Q. Gao, “Multi-object continuous robust tracking algorithm for anti-uav swarm,” Acta Aeronaut. Astronaut. Sin., vol. 45, no. 7, pp. 256–269 [Chinese], 2024.

[44] Y. Zhang, C. Wang, X. Wang, W. Zeng, and W. Liu, “Fairmot: On the fairness of detection and re-identification in multiple object tracking,” Int. J. Comput. Vis., vol. 129, pp. 3069–3087, 2021.

[45] Z. Chu, T. Song, R. Jin, and D. Lin, “Vision-based air-to-air multi-uavs tracking,” Acta Aeronaut. Astronaut. Sin., vol. 45, no. 14, p. 629379 [Chinese], 2024.

[46] E. Yu, T. Wang, Z. Li, Y. Zhang, X. Zhang, and W. Tao, “Motrv3: Release-fetch supervision for end-to-end multi-object tracking,” ArXiv, vol. abs/2305.14298, 2023.

Zhaochen Chu received the B.E. degree in Science in Flight Vehicle Design and Engineering from Beijing Institute of Technology, Beijing, China, in 2021. He is currently pursuing the Ph.D. de...