NOOUGAT: Towards Unified Online and Offline Multi-Object Tracking
Pith reviewed 2026-05-18 19:54 UTC · model grok-4.3
The pith
NOOUGAT unifies online and offline multi-object tracking by processing non-overlapping subclips with a graph neural network and autoregressive fusion layer.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
NOOUGAT is the first tracker designed to operate with arbitrary temporal horizons. It leverages a unified Graph Neural Network framework that processes non-overlapping subclips and fuses them through a novel Autoregressive Long-term Tracking layer. The subclip size controls the trade-off between latency and temporal context, enabling a wide range of deployment scenarios from frame-by-frame to batch processing. It achieves state-of-the-art performance across both tracking regimes, improving online AssA by +2.3 on DanceTrack, +9.2 on SportsMOT, and +5.0 on MOT20, with even greater gains in offline mode.
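As a rough illustration of the subclip trade-off described above, the sketch below partitions a frame sequence into non-overlapping subclips. The helper name `split_into_subclips` and the index-based representation are assumptions made for illustration. They are not taken from the paper.

```python
from typing import List


def split_into_subclips(num_frames: int, subclip_size: int) -> List[range]:
    """Partition frame indices 0..num_frames-1 into non-overlapping subclips.

    Hypothetical helper for illustration; not the paper's code.
    """
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    if subclip_size < 1:
        raise ValueError("subclip_size must be at least 1")
    return [
        range(start, min(start + subclip_size, num_frames))
        for start in range(0, num_frames, subclip_size)
    ]


# subclip_size=1 corresponds to frame-by-frame (online) operation.
online = split_into_subclips(num_frames=6, subclip_size=1)
# subclip_size=num_frames corresponds to batch (offline) processing.
offline = split_into_subclips(num_frames=6, subclip_size=6)
# Values in between trade latency for temporal context.
mixed = split_into_subclips(num_frames=7, subclip_size=3)
```

In this toy view, a larger `subclip_size` means more frames must be buffered before a subclip can be processed, which is the latency side of the trade-off.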
What carries the argument
The Autoregressive Long-term Tracking (ALT) layer, which fuses object associations across non-overlapping subclips inside a graph neural network to maintain identity consistency over variable time spans.
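To make the control flow of autoregressive fusion concrete, here is a minimal sketch in which identities are carried from one subclip to the next. It is not the ALT layer: the paper describes a learned GNN association, whereas this stand-in uses greedy nearest-position matching on toy 1-D tracklets. The names `fuse_subclips`, `Tracklet`, and `max_gap` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Toy tracklet: (local_id, start_position, end_position) on a 1-D line.
Tracklet = Tuple[int, float, float]


def fuse_subclips(
    subclips: List[List[Tracklet]], max_gap: float = 1.0
) -> List[Dict[int, int]]:
    """Assign global identities to per-subclip tracklets, one subclip at a time.

    Stand-in for a learned fusion step: greedy nearest-position matching
    replaces the GNN-based association described in the paper.
    Returns, for each subclip, a mapping local_id -> global_id.
    """
    last_position: Dict[int, float] = {}  # state carried across subclips
    next_global_id = 0
    assignments: List[Dict[int, int]] = []

    for tracklets in subclips:
        mapping: Dict[int, int] = {}
        used: Set[int] = set()
        updates: Dict[int, float] = {}
        for local_id, start, end in tracklets:
            # Candidates: known identities not yet matched in this subclip.
            candidates = [
                (abs(pos - start), gid)
                for gid, pos in last_position.items()
                if gid not in used and abs(pos - start) <= max_gap
            ]
            if candidates:
                _, gid = min(candidates)
            else:
                gid = next_global_id  # no plausible match: start a new identity
                next_global_id += 1
            used.add(gid)
            mapping[local_id] = gid
            updates[gid] = end
        # Unmatched identities keep their old position, so an object that is
        # absent from one subclip can still be re-identified later.
        last_position.update(updates)
        assignments.append(mapping)
    return assignments


example = [
    [(0, 0.0, 1.0), (1, 10.0, 11.0)],
    [(0, 11.2, 12.0), (1, 1.1, 2.0)],
    [(0, 2.3, 3.0)],
]
result = fuse_subclips(example)
```

The point of the sketch is only the loop structure: each subclip is resolved using state produced by the previous ones, so any mistake made at one boundary is inherited by everything that follows.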
If this is right
- Subclip length becomes a single tunable parameter that trades latency for temporal context without retraining or redesigning the tracker.
- The same trained model supports both low-latency frame-by-frame operation and higher-accuracy batch processing of entire videos.
- Long-term occlusions are handled through autoregressive fusion rather than hand-crafted association rules or post-hoc stitching.
- Performance gains appear in both online and offline regimes on standard benchmarks such as DanceTrack, SportsMOT, and MOT20.
Where Pith is reading between the lines
- The subclip-and-fuse design could extend naturally to streaming settings where data arrives in irregular bursts rather than fixed frames.
- Because the core representation is a graph neural network, additional cues such as appearance embeddings or 3D motion could be added as node or edge features without changing the overall architecture.
- Testing on videos lasting hours rather than minutes would reveal whether drift remains bounded or requires periodic global re-optimization.
Load-bearing premise
The Autoregressive Long-term Tracking layer can reliably fuse information across non-overlapping subclips without accumulating identity switches or drift over arbitrarily long sequences.
What would settle it
Measuring identity-switch rate and association accuracy on sequences many times longer than the chosen subclip size while keeping the same model and subclip size fixed.
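One simple way to run such a measurement is to count how often the predicted identity of each ground-truth object changes over a sequence. The sketch below is a simplified count under an assumed input format (a per-frame mapping from ground-truth ID to predicted ID). It is neither the official identity-switch metric nor AssA, and the function name is hypothetical.

```python
from typing import Dict, List, Optional


def count_identity_switches(frames: List[Dict[int, Optional[int]]]) -> int:
    """Count changes of predicted ID for each ground-truth object.

    `frames[t]` maps ground-truth ID -> predicted ID (None if missed).
    Simplified count for illustration; not the official IDSW or AssA.
    """
    last_pred: Dict[int, int] = {}
    switches = 0
    for frame in frames:
        for gt_id, pred_id in frame.items():
            if pred_id is None:
                continue  # missed detection: keep the last known prediction
            if gt_id in last_pred and last_pred[gt_id] != pred_id:
                switches += 1
            last_pred[gt_id] = pred_id
    return switches


# Made-up toy sequence: object 1 switches once, object 2 switches once.
toy_frames = [
    {1: 10, 2: 20},
    {1: 10, 2: None},
    {1: 11, 2: 20},
    {1: 11, 2: 21},
]
switches = count_identity_switches(toy_frames)
switch_rate = switches / len(toy_frames)  # switches per frame
```

Reporting the rate per frame rather than the raw count makes sequences of different lengths comparable, which is what the proposed test needs.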
Original abstract
The long-standing division between online and offline Multi-Object Tracking (MOT) has led to fragmented solutions that fail to address the flexible temporal requirements of real-world deployment scenarios. Current online trackers rely on frame-by-frame hand-crafted association strategies and struggle with long-term occlusions, whereas offline approaches can cover larger time gaps, but still rely on heuristic stitching for arbitrarily long sequences. In this paper, we introduce NOOUGAT, the first tracker designed to operate with arbitrary temporal horizons. NOOUGAT leverages a unified Graph Neural Network (GNN) framework that processes non-overlapping subclips, and fuses them through a novel Autoregressive Long-term Tracking (ALT) layer. The subclip size controls the trade-off between latency and temporal context, enabling a wide range of deployment scenarios, from frame-by-frame to batch processing. NOOUGAT achieves state-of-the-art performance across both tracking regimes, improving online AssA by +2.3 on DanceTrack, +9.2 on SportsMOT, and +5.0 on MOT20, with even greater gains in offline mode.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces NOOUGAT, the first multi-object tracker designed to operate with arbitrary temporal horizons. It employs a unified Graph Neural Network (GNN) that processes fixed-size non-overlapping subclips and fuses their outputs via a novel Autoregressive Long-term Tracking (ALT) layer; subclip size is the sole control for the latency-context trade-off. The paper reports state-of-the-art results, with online AssA gains of +2.3 on DanceTrack, +9.2 on SportsMOT, and +5.0 on MOT20, and larger gains in offline mode.
Significance. If the unification and arbitrary-horizon claims hold, the work would meaningfully advance MOT by removing the online/offline dichotomy and enabling flexible deployment. The reported numeric gains on standard benchmarks indicate practical value for long-term association under occlusion.
Major comments (2)
- [ALT layer description (methods)] The autoregressive fusion of non-overlapping subclips is asserted to support arbitrary horizons, yet the text supplies no explicit anti-drift mechanism (periodic global re-matching, uncertainty-aware propagation, or re-optimization) that would prevent compounding of independent association errors across subclips. This is load-bearing for the central claim.
- [Results section] While specific AssA deltas are stated, no ablation or error analysis is provided on how performance or identity-switch rate scales with the number of subclips (i.e., sequence length). Without such evidence the arbitrary-horizon guarantee remains unverified.
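The compounding concern in the major comments can be made concrete with a back-of-the-envelope model. Assume, purely for illustration, that each subclip boundary mislinks a given identity independently and with the same probability. Under that assumption the chance that an identity survives every boundary decays geometrically with the number of subclips. The function name and the error value used below are hypothetical, and nothing here is measured from the paper.

```python
def identity_survival_probability(boundary_error: float, num_subclips: int) -> float:
    """Probability that one identity crosses every subclip boundary correctly.

    Assumes (for illustration only) that each boundary fails independently
    with the same probability `boundary_error`.
    """
    if not 0.0 <= boundary_error <= 1.0:
        raise ValueError("boundary_error must be in [0, 1]")
    if num_subclips < 1:
        raise ValueError("num_subclips must be at least 1")
    num_boundaries = num_subclips - 1
    return (1.0 - boundary_error) ** num_boundaries


# Hypothetical 1% error per boundary, for short and long sequences.
short_run = identity_survival_probability(0.01, num_subclips=11)
long_run = identity_survival_probability(0.01, num_subclips=101)
```

With the hypothetical 1% error, about 37% of identities would survive 100 boundaries in this model. A learned fusion layer may behave very differently, which is exactly why the requested scaling analysis matters.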
Minor comments (2)
- [Abstract] The phrase 'even greater gains in offline mode' is left unquantified; supplying the corresponding numeric improvements would aid immediate assessment.
- [Notation] Confirm that 'AssA' and related metrics are defined consistently on first use and that all dataset names receive standard citations.
Simulated Author's Rebuttal
We thank the referee for their detailed and thoughtful review of our manuscript. The comments raise important points about the robustness of the ALT layer and the verification of arbitrary-horizon performance. We address each comment below and outline the revisions we plan to make.
Point-by-point responses
- Referee: [ALT layer description (methods)] The autoregressive fusion of non-overlapping subclips is asserted to support arbitrary horizons, yet the text supplies no explicit anti-drift mechanism (periodic global re-matching, uncertainty-aware propagation, or re-optimization) that would prevent compounding of independent association errors across subclips. This is load-bearing for the central claim.
  Authors: We appreciate the referee highlighting this aspect of the ALT layer. The manuscript describes the ALT layer as an autoregressive mechanism that fuses outputs from consecutive subclips by using the association graph from the previous subclip to initialize the next. This design allows the model to carry forward identity information across subclips without resetting. While we do not introduce an additional explicit anti-drift module such as periodic re-matching, the end-to-end training of the GNN on sequences with varying lengths enables the network to learn robust propagation that minimizes error accumulation. The SOTA results on long sequences in MOT20 and other datasets provide empirical support for this. To strengthen the paper, we will revise the methods section to explicitly discuss potential error propagation and how the architecture addresses it through learned representations.
  Revision: yes
- Referee: [Results section] While specific AssA deltas are stated, no ablation or error analysis is provided on how performance or identity-switch rate scales with the number of subclips (i.e., sequence length). Without such evidence the arbitrary-horizon guarantee remains unverified.
  Authors: We agree that demonstrating how performance scales with the number of subclips would further validate the arbitrary-horizon claim. In the current manuscript, we focus on reporting results for the full sequences in the benchmarks, which inherently involve multiple subclips for longer videos. However, we did not provide a dedicated ablation varying the subclip count or analyzing identity switches as a function of sequence length. We will add such an analysis in the revised version, either through additional experiments or by breaking down the results on existing data by sequence length where possible. This will help verify the scaling behavior.
  Revision: yes
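A breakdown by sequence length of the kind the authors propose could be organised as in the sketch below, which groups sequences by the number of subclips they span and averages a per-sequence score within each group. The input format, the function name, and the numbers are placeholders chosen for illustration. They are not results from the paper.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple


def mean_score_by_subclip_count(
    sequences: List[Tuple[int, float]], subclip_size: int
) -> Dict[int, float]:
    """Group sequences by how many subclips they span and average a score.

    `sequences` holds (num_frames, score) pairs; the score can be any
    per-sequence metric. Hypothetical helper for illustration.
    """
    if subclip_size < 1:
        raise ValueError("subclip_size must be at least 1")
    buckets: Dict[int, List[float]] = defaultdict(list)
    for num_frames, score in sequences:
        buckets[math.ceil(num_frames / subclip_size)].append(score)
    return {n: sum(s) / len(s) for n, s in sorted(buckets.items())}


# Arbitrary placeholder values, not measurements.
placeholder = [(100, 0.5), (90, 0.75), (250, 1.0)]
breakdown = mean_score_by_subclip_count(placeholder, subclip_size=100)
```

If the averaged score stays flat as the subclip count grows, that would support the arbitrary-horizon claim; a steady decline would point to accumulating drift.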
Circularity Check
No circularity: method is architectural design with empirical validation
Full rationale
The paper introduces NOOUGAT as a unified GNN-based tracker with a novel Autoregressive Long-term Tracking (ALT) layer that processes non-overlapping subclips for arbitrary horizons. The subclip size is presented as an explicit design knob trading latency for context, and performance gains are reported as experimental results on DanceTrack, SportsMOT, and MOT20. No equations, fitted parameters renamed as predictions, self-definitional loops, or load-bearing self-citations appear in the derivation chain. The central construction (GNN + ALT fusion) is a proposed architecture whose correctness is left to empirical evaluation rather than reducing to its own inputs by definition.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: NOOUGAT leverages a unified Graph Neural Network (GNN) framework that processes non-overlapping subclips, and fuses them through a novel Autoregressive Long-term Tracking (ALT) layer. The subclip size controls the trade-off between latency and temporal context
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: We introduce the ALT layer, a fully learnable and data-driven GNN association module that dynamically uses the most relevant cues across various temporal contexts
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.