SceneGraphVLM generates dynamic scene graphs from video using compact VLMs, TOON serialization, and hallucination-aware RL to improve precision and achieve one-second latency.
Compile scene graphs with reinforcement learning
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.CV 2years
2026 2representative citing papers
citing papers explorer
-
SceneGraphVLM: Dynamic Scene Graph Generation from Video with Vision-Language Models
SceneGraphVLM generates dynamic scene graphs from video using compact VLMs, TOON serialization, and hallucination-aware RL to improve precision and achieve one-second latency.
- SenBen: Sensitive Scene Graphs for Explainable Content Moderation