Journal of Machine Learning Research , volume=

Monotonic value function factorisation for deep multi-agent reinforcement learning , author=

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

browse 5 citing papers

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

Submodular Multi-Agent Policy Learning for Online Distributed Task Allocation in Open Multi-Agent Systems

eess.SY · 2026-05-13 · unverdicted · novelty 7.0

SubMAPG uses a new Partition Multilinear Extension to derive unbiased policy gradients from submodular difference rewards, delivering 1/2-approximation and sublinear dynamic regret for online distributed task allocation in open multi-agent systems.

Randomness is sometimes necessary for coordination

cs.AI · 2026-05-07 · conditional · novelty 7.0

Structured per-agent randomness via ranked masking in attention allows symmetric agents to break ties and coordinate, achieving perfect success on symmetric tasks where deterministic policies fail and enabling zero-shot transfer across team sizes.

NonZero: Interaction-Guided Exploration for Multi-Agent Monte Carlo Tree Search

cs.LG · 2026-05-01 · unverdicted · novelty 7.0

NonZero introduces an interaction score and bandit-formalized proposal rule for local agent deviations in multi-agent MCTS, delivering a sublinear local-regret guarantee and improved sample efficiency on game benchmarks without full joint-action enumeration.

TeamTR: Trust-Region Fine-Tuning for Multi-Agent LLM Coordination

cs.LG · 2026-05-01 · unverdicted · novelty 6.0

TeamTR is a trust-region framework for multi-agent LLM fine-tuning that resamples trajectories after each update to convert quadratic compounding occupancy shift into linear scaling and yields per-update improvement lower bounds.

ERPPO: Entropy Regularization-based Proximal Policy Optimization

cs.LG · 2026-05-13 · unverdicted · novelty 5.0

ERPPO adds a DSA-based ambiguity estimator to MAPPO and switches between L1 and L2 entropy regularization to improve exploration and stability in non-stationary multi-dimensional observations.

citing papers explorer

Showing 5 of 5 citing papers.

Submodular Multi-Agent Policy Learning for Online Distributed Task Allocation in Open Multi-Agent Systems eess.SY · 2026-05-13 · unverdicted · none · ref 108
SubMAPG uses a new Partition Multilinear Extension to derive unbiased policy gradients from submodular difference rewards, delivering 1/2-approximation and sublinear dynamic regret for online distributed task allocation in open multi-agent systems.
Randomness is sometimes necessary for coordination cs.AI · 2026-05-07 · conditional · none · ref 14
Structured per-agent randomness via ranked masking in attention allows symmetric agents to break ties and coordinate, achieving perfect success on symmetric tasks where deterministic policies fail and enabling zero-shot transfer across team sizes.
NonZero: Interaction-Guided Exploration for Multi-Agent Monte Carlo Tree Search cs.LG · 2026-05-01 · unverdicted · none · ref 38
NonZero introduces an interaction score and bandit-formalized proposal rule for local agent deviations in multi-agent MCTS, delivering a sublinear local-regret guarantee and improved sample efficiency on game benchmarks without full joint-action enumeration.
TeamTR: Trust-Region Fine-Tuning for Multi-Agent LLM Coordination cs.LG · 2026-05-01 · unverdicted · none · ref 30
TeamTR is a trust-region framework for multi-agent LLM fine-tuning that resamples trajectories after each update to convert quadratic compounding occupancy shift into linear scaling and yields per-update improvement lower bounds.
ERPPO: Entropy Regularization-based Proximal Policy Optimization cs.LG · 2026-05-13 · unverdicted · none · ref 8
ERPPO adds a DSA-based ambiguity estimator to MAPPO and switches between L1 and L2 entropy regularization to improve exploration and stability in non-stationary multi-dimensional observations.

Journal of Machine Learning Research , volume=

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer