LangSplat: 3D Language Gaussian Splatting
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CV (1)
years: 2025 (1)
verdicts: UNVERDICTED (1)
representative citing papers
LangDriveCTRL: Natural Language Controllable Driving Scene Editing with Multi-modal Agents
LangDriveCTRL decomposes driving videos into 3D scene graphs and uses an agentic pipeline with specialized multi-modal agents to perform language-controlled object and behavior edits, achieving nearly 2x higher instruction alignment than prior state-of-the-art methods.