TACO proposes Relative Structure Distillation and a lightweight specialization projection to mitigate inconsistency between fine-tuning and evaluation objectives in open-vocabulary video recognition, claiming state-of-the-art results on cross-dataset and base-to-novel benchmarks.
Learning to generalize without bias for open-vocabulary action recognition.arXiv preprint arXiv:2502.20158, 2025
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
TACO: Towards Task-Consistent Open-Vocabulary Adaptation in Video Recognition
TACO proposes Relative Structure Distillation and a lightweight specialization projection to mitigate inconsistency between fine-tuning and evaluation objectives in open-vocabulary video recognition, claiming state-of-the-art results on cross-dataset and base-to-novel benchmarks.