Title resolution pending

Bin Lin, Yang Ye, Bin Zhu, Jiaxi Cui, Munan Ning, Peng Jin, Li Yuan · 2024

3 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Where to Focus: Query-Modulated Multimodal Keyframe Selection for Long Video Understanding

cs.CV · 2026-04-19 · unverdicted · novelty 6.0

Q-Gate dynamically routes keyframe selection in long videos via query-modulated gating across visual grounding, global matching, and contextual alignment experts to improve MLLM performance.

STEAR: Layer-Aware Spatiotemporal Evidence Intervention for Hallucination Mitigation in Video Large Language Models

cs.CV · 2026-04-03 · unverdicted · novelty 6.0

STEAR reduces spatial and temporal hallucinations in Video-LLMs via layer-aware evidence intervention from middle decoder layers in a single-encode pass.

Zoom In, Reason Out: Efficient Far-field Anomaly Detection in Expressway Surveillance Videos via Focused VLM Reasoning Guided by Bayesian Inference

cs.CV · 2026-04-26 · unverdicted · novelty 4.0

VIBES uses Bayesian inference to trigger focused VLM reasoning on localized far-field regions in expressway videos, improving anomaly detection accuracy and efficiency.

citing papers explorer

Showing 3 of 3 citing papers.

Where to Focus: Query-Modulated Multimodal Keyframe Selection for Long Video Understanding

cs.CV · 2026-04-19 · unverdicted · none · ref 19

STEAR: Layer-Aware Spatiotemporal Evidence Intervention for Hallucination Mitigation in Video Large Language Models

cs.CV · 2026-04-03 · unverdicted · none · ref 28

Zoom In, Reason Out: Efficient Far-field Anomaly Detection in Expressway Surveillance Videos via Focused VLM Reasoning Guided by Bayesian Inference

cs.CV · 2026-04-26 · unverdicted · none · ref 21
