Something Something

Raghav Goyal, Samira Ebrahimi Kahou, Vincent Michalski, Joanna Materzynska, Susanne Westphal, Heuna Kim, Valentin Haenel, Ingo Fründ, Peter Yianilos, Moritz Mueller-Freitag, Florian Hoppe, Christian Thurau, Ingo Bax, Roland Memisevic · 2017 · DOI 10.1109/iccv.2017.622

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

open at publisher browse 5 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Learning to Deny: Action Denial in Multimodal Large Language Models

cs.CV · 2026-06-30 · unverdicted · novelty 7.0

MLLMs drop from over 85% accuracy on action presence to under 50% on matched action-denial videos, exposing a causal verification gap that causal graph prompts partially close.

TimeProVe: Propose, then Verify for Efficient Long Video Temporal Reasoning in Activities of Daily Living

cs.CV · 2026-06-18 · unverdicted · novelty 6.0

TimeProVe proposes a propose-then-verify framework using lightweight action-based candidate evidence generation followed by targeted VLM verification for efficient long video temporal reasoning, achieving 7.3% improvement on OTB with 75% fewer VLM calls.

Causal Scaffolding for Physical Reasoning: A Benchmark for Causally-Informed Physical World Understanding in VLMs

cs.DB · 2026-06-04 · unverdicted · novelty 6.0

Introduces CausalPhys benchmark with causal graphs and CRFT fine-tuning to improve VLMs' causal physical reasoning accuracy and interpretability.

From Pixels to Temporal Correlations: Learning Informative Representations for Reinforcement Learning Pre-training

cs.LG · 2026-07-01 · unverdicted · novelty 5.0

MTCL learns multi-scale temporal correlations in videos via contrastive learning to produce more informative representations that improve sample efficiency and performance in downstream RL tasks.

A Survey on Vision-Language-Action Models: An Action Tokenization Perspective

cs.RO · 2025-07-02 · unverdicted · novelty 5.0

The survey frames VLA models as pipelines that generate progressively grounded action tokens and classifies those tokens into eight types to guide future development.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Learning to Deny: Action Denial in Multimodal Large Language Models cs.CV · 2026-06-30 · unverdicted · none · ref 23
MLLMs drop from over 85% accuracy on action presence to under 50% on matched action-denial videos, exposing a causal verification gap that causal graph prompts partially close.
TimeProVe: Propose, then Verify for Efficient Long Video Temporal Reasoning in Activities of Daily Living cs.CV · 2026-06-18 · unverdicted · none · ref 222
TimeProVe proposes a propose-then-verify framework using lightweight action-based candidate evidence generation followed by targeted VLM verification for efficient long video temporal reasoning, achieving 7.3% improvement on OTB with 75% fewer VLM calls.

Something Something

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer