Integrated Object Detection and Tracking with Tracklet-Conditioned Detection

2018 · cs.CV · arXiv 1811.11167

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Accurate detection and tracking of objects is vital for effective video understanding. In previous work, the two tasks have been combined in a way that tracking is based heavily on detection, but the detection benefits marginally from the tracking. To increase synergy, we propose to more tightly integrate the tasks by conditioning the object detection in the current frame on tracklets computed in prior frames. With this approach, the object detection results not only have high detection responses, but also improved coherence with the existing tracklets. This greater coherence leads to estimated object trajectories that are smoother and more stable than the jittered paths obtained without tracklet-conditioned detection. Over extensive experiments, this approach is shown to achieve state-of-the-art performance in terms of both detection and tracking accuracy, as well as noticeable improvements in tracking stability.
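
The idea of conditioning detection on existing tracklets can be illustrated with a small sketch. This is not the paper's method, which the abstract does not specify. It assumes a simple stand-in: each candidate box's detector score is blended with its best IoU overlap against the most recent tracklet boxes. The function names `iou` and `tracklet_conditioned_scores` and the `weight` parameter are hypothetical, introduced only for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def tracklet_conditioned_scores(
    boxes: Sequence[Box],
    scores: Sequence[float],
    tracklet_boxes: Sequence[Box],
    weight: float = 0.5,  # hypothetical blending factor, not from the paper
) -> List[float]:
    """Blend each detector score with its coherence to existing tracklets.

    Coherence is taken here to be the best IoU against any tracklet's last
    box, which is an illustrative assumption. With no tracklets (for example
    in the first frame) the detector scores are returned unchanged.
    """
    if not tracklet_boxes:
        return list(scores)
    rescored = []
    for box, score in zip(boxes, scores):
        coherence = max(iou(box, t) for t in tracklet_boxes)
        rescored.append((1.0 - weight) * score + weight * coherence)
    return rescored
```

For example, with candidates `(0, 0, 10, 10)` scored 0.6 and `(50, 50, 60, 60)` scored 0.7, and a single tracklet whose last box is `(0, 0, 10, 10)`, the first candidate overtakes the second after rescoring. That reflects the abstract's point that results should have high detection responses and also cohere with existing tracklets.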

representative citing papers

MR2-ByteTrack: CNN and Transformer-based Video Object Detection for AI-augmented Embedded Vision Sensor Nodes

cs.CV · 2026-05-14 · conditional · novelty 5.0

MR2-ByteTrack maintains high accuracy in video object detection on MCUs by combining multi-resolution processing, ByteTrack for frame linking, and Rescore for confidence aggregation, achieving up to 55% energy savings and real-time performance for both CNN and Transformer models.

citing papers explorer

Showing 1 of 1 citing paper.

MR2-ByteTrack: CNN and Transformer-based Video Object Detection for AI-augmented Embedded Vision Sensor Nodes

cs.CV · 2026-05-14 · conditional · none · ref 47 · internal anchor
