A corrective double deep Q-network framework uses encoded message-passing to refine delayed and noisy global states for improved multi-agent control policies.
4 Pith papers cite this work. Polarity classification is still indexing.
Representative citing papers
AmelPredSto, a stochastic self-predictive representation model, outperforms other state representation learning approaches when combined with actor-critic RL for object-goal navigation in UAVs.
A new robust Q-CBF framework synthesized via adversarial RL enables safety enforcement on the maximal robust safe set for black-box nonlinear systems.