MM-VID: Advancing Video Understanding with GPT-4V(ision)
3 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- Soap2Soap: Long Cinematic Video Remaking via Multi-Agent Collaboration
  Soap2Soap uses a multi-agent system with dual-bridge consistency via JSON screenplays and visual anchors plus batch keyframe generation to achieve better long-term consistency in cinematic video remaking than commercial APIs.
- VideoStir: Understanding Long Videos via Spatio-Temporally Structured and Intent-Aware RAG
  VideoStir introduces a spatio-temporal graph-based structure and intent-aware retrieval for long-video RAG, achieving competitive performance with SOTA methods via a new IR-600K dataset.
- Making AI Drafts Count: A Quality Threshold in Audio Description Workflows
  AI drafts for audio description reduce editing time and cognitive load only when they exceed a content-dependent quality threshold, unlike unguided baseline drafts.