DemaFormer pairs energy-based modeling with a damped-EMA Transformer to localize video moments matching language queries and reports gains over baselines on four datasets.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2023 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- DemaFormer: Damped Exponential Moving Average Transformer with Energy-Based Modeling for Temporal Language Grounding