ReferGPT: Towards Zero-Shot Referring Multi-Object Tracking
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Multi-view Crowd Tracking Transformer with View-Ground Interactions Under Large Real-World Scenes
  MVTrackTrans uses view-ground interactions in a Transformer to improve multi-view crowd tracking on large scenes and outperforms prior methods on two newly collected large datasets.
- STORM: End-to-End Referring Multi-Object Tracking in Videos
  STORM is an end-to-end MLLM for referring multi-object tracking that uses task-composition learning to leverage sub-task data and introduces the STORM-Bench dataset, achieving SOTA results.