Video-STAR proposes sub-motion decomposition combined with tool-augmented reinforcement learning and a hierarchical reward to achieve state-of-the-art open-vocabulary action recognition on standard video datasets.
One Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CV (1)
years: 2025 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Video-STAR: Reinforcing Open-Vocabulary Action Recognition with Tools