Coma: Compositional human motion generation with multi-modal agents

2024 · arXiv 2412.07320

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

LangDriveCTRL: Natural Language Controllable Driving Scene Editing with Multi-modal Agents

cs.CV · 2025-12-19 · unverdicted · novelty 7.0

LangDriveCTRL decomposes driving videos into 3D scene graphs and uses an agentic pipeline with specialized multi-modal agents to perform language-controlled object and behavior edits, achieving nearly 2x higher instruction alignment than prior state-of-the-art methods.

Multi-Modal Manipulation via Multi-Modal Policy Consensus

cs.RO · 2025-09-27 · unverdicted · novelty 7.0

A policy that factorizes into modality-specific diffusion models combined by a learned router network for adaptive multi-modal robotic manipulation.

citing papers explorer

LangDriveCTRL: Natural Language Controllable Driving Scene Editing with Multi-modal Agents · cs.CV · 2025-12-19 · unverdicted · none · ref 43
Multi-Modal Manipulation via Multi-Modal Policy Consensus · cs.RO · 2025-09-27 · unverdicted · none · ref 34