Leadqa: LLM-driven context-aware temporal grounding for video question answering

2025 · arXiv 2507.14784

3 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Counterfactual Reasoning for Fine-Grained Evidence Disentanglement in VideoQA

cs.CV · 2026-06-08 · unverdicted · novelty 6.0

CREDiT applies counterfactual reasoning via structural causal models to decompose video representations into causal and non-causal parts for more reliable VideoQA on datasets like NExT-GQA and SportsQA.

VTI-CoT: Visual-Textual Interleaved Chain of Thought for Video Reasoning

cs.CV · 2026-06-04 · unverdicted · novelty 6.0

VTI-CoT proposes a visual-textual interleaved chain-of-thought method for video reasoning, built via automated annotation and OCR compression, claiming SOTA performance and better training efficiency on same-scale models.

UpstreamQA: A Modular Framework for Explicit Reasoning on Video Question Answering Tasks

cs.CV · 2026-04-25 · unverdicted · novelty 5.0

UpstreamQA disentangles video reasoning by using LRMs for explicit upstream object identification and scene context before downstream LMM VideoQA, improving performance and interpretability on OpenEQA and NExTQA in some cases.

citing papers explorer


Counterfactual Reasoning for Fine-Grained Evidence Disentanglement in VideoQA
cs.CV · 2026-06-08 · unverdicted · none · ref 8
VTI-CoT: Visual-Textual Interleaved Chain of Thought for Video Reasoning
cs.CV · 2026-06-04 · unverdicted · none · ref 7
UpstreamQA: A Modular Framework for Explicit Reasoning on Video Question Answering Tasks
cs.CV · 2026-04-25 · unverdicted · none · ref 13
