An initial introduction to cooperative multi-agent reinforcement learning
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Distilling Collaborative Dynamics into Latent Space for Implicit Coordination in Decentralized Multi-Agent Manipulation
CLS-DP distills privileged multi-agent dynamics into a collaborative latent space that each agent infers from local RGB observations to condition diffusion-based actions, achieving 38% mean success on six RoboFactory tasks versus 20% for the best centralized baseline.
- Robust Instruction Compliance in Cooperative Multi-Agent Reinforcement Learning
MAVIC corrects Bellman backups at instruction boundaries by adjusting the incoming objective and restoring continuation value, enabling consistent estimation under stochastic instruction switching in cooperative MARL.