Vad-R1: Towards Video Anomaly Reasoning via Perception-to-Cognition Chain-of-Thought

Chao Huang, Benfeng Wang, Jie Wen, Chengliang Liu, Wei Wang, Li Shen, Xiaochun Cao · 2025 · arXiv 2505.19877

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

MAVEN: A Multi-stage Agentic Annotation Pipeline for Video Reasoning Tasks

cs.CV · 2026-05-21 · unverdicted · novelty 7.0

MAVEN pipeline generates multi-scale spatio-temporal event descriptions from videos using agentic adaptation and refinement, then produces training data that lets a fine-tuned 8B model outperform Gemini baselines on private CCTV and AccidentBench tasks.

ESOM: Efficiently Understanding Streaming Video Anomalies with Open-world Dynamic Definitions

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

ESOM is a training-free streaming model for open-world video anomaly detection with dynamic definitions that achieves real-time single-GPU efficiency and state-of-the-art results on a new benchmark.

citing papers explorer


MAVEN: A Multi-stage Agentic Annotation Pipeline for Video Reasoning Tasks
cs.CV · 2026-05-21 · unverdicted · none · ref 13
ESOM: Efficiently Understanding Streaming Video Anomalies with Open-world Dynamic Definitions
cs.CV · 2026-04-09 · unverdicted · none · ref 15
