Monotonic value function factorisation for deep multi-agent reinforcement learning
3 Pith papers cite this work.
Citing papers by year: 2026 (3)
citing papers explorer
- Equivariant Multi-agent Reinforcement Learning for Multimodal Vehicle-to-Infrastructure Systems
  A self-supervised multimodal alignment step plus equivariant GNN-based MARL yields over twofold sensing accuracy and 50% performance gains in decentralized V2I rate maximization.
- Bridging MARL to SARL: An Order-Independent Multi-Agent Transformer via Latent Consensus
  CMAT uses a transformer decoder to produce a high-level consensus vector in latent space, enabling simultaneous order-independent actions by all agents and optimization via single-agent PPO, with superior results on StarCraft II, Multi-Agent MuJoCo, and Google Research Football.
- Learning Incentive Structures for Cooperative Resilience in Multi-Agent Systems under Social Dilemmas
  A method infers resilience-promoting reward functions via trajectory scoring and integrates them into MARL, with hybrid incentives shown to reduce collapse in disrupted resource environments.