Fact: Frame-action cross-attention temporal modeling for efficient action segmentation

Zijia Lu, Ehsan Elhamifar

1 Pith paper cites this work. Polarity classification is still being indexed.


representative citing papers

TemporalVLM: Video LLMs for Temporal Reasoning in Long Videos

cs.CV · 2024-12-04 · unverdicted · novelty 5.0

TemporalVLM adds timestamp-aware clip encoding and BiLSTM global aggregation to video LLMs, introduces the IndustryASM factory dataset, and reports outperformance on dense captioning, temporal grounding, highlight detection, and action segmentation.

citing papers explorer


TemporalVLM: Video LLMs for Temporal Reasoning in Long Videos

cs.CV · 2024-12-04 · unverdicted · none · ref 26
