Viqagent: Zero-shot video question answering via agent with open-vocabulary grounding validation. arXiv preprint arXiv:2505.15928, 2025

Tony Montes, Fernando Lozano · 2025 · arXiv 2505.15928

2 Pith papers cite this work. Polarity classification is still indexing.



citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

A Cost-Aware, Paired Protocol for Auditing Dynamic Tool Synthesis in Agentic Video Question Answering

cs.CV · 2026-07-01 · unverdicted · novelty 7.0

Introduces a cost-aware paired protocol with six outcome groups and applies it to Dynamic-SAGE versus SAGE, reporting a 7.5-point accuracy gain and 28% fewer tool calls, but 34% higher token use.

Watch, Remember, Reason: Human-View Video Understanding with MLLMs

cs.CV · 2026-06-05 · unverdicted · novelty 4.0

This is a survey that frames video MLLM research via a human-view formulation of perceptual representations, memory states, reasoning traces, and predictions, then reviews methods, datasets, benchmarks, and open problems.

citing papers explorer


Watch, Remember, Reason: Human-View Video Understanding with MLLMs

cs.CV · 2026-06-05 · unverdicted · none · ref 214

This is a survey that frames video MLLM research via a human-view formulation of perceptual representations, memory states, reasoning traces, and predictions, then reviews methods, datasets, benchmarks, and open problems.
