Multi-agent actor-critic for mixed cooperative-competitive environments. Advances in neural information processing systems, 30

Ryan Lowe, Yi I Wu, Aviv Tamar, Jean Harb, Pieter Abbeel, Igor Mordatch · 2017

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 3 · dataset: 1

citation-polarity summary

background: 3 · use dataset: 1

representative citing papers

Beyond Partner Diversity: An Influence-Based Team Steering Framework for Zero-Shot Human-Machine Teaming

cs.AI · 2026-05-14 · unverdicted · novelty 6.0

IBTS framework uses influence shaping to improve zero-shot human-machine teaming beyond partner diversity alone, with gains shown in Overcooked-AI simulations and a 30-subject human study.

ADKO: Agentic Decentralized Knowledge Optimization

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

ADKO is a decentralized framework where agents share compact GP-derived tokens and LM insights to achieve collaborative Bayesian optimization with a decomposed regret bound that includes compression and approximation losses.

Interactive Inverse Reinforcement Learning of Interaction Scenarios via Bi-level Optimization

cs.LG · 2026-05-01 · unverdicted · novelty 6.0

Interactive IRL is cast as bi-level optimization with an inner loop learning expert rewards and an outer loop learning interaction policies, solved by the convergent BISIRL algorithm.

Do Agent Societies Develop Intellectual Elites? The Hidden Power Laws of Collective Cognition in LLM Multi-Agent Systems

cs.MA · 2026-04-03 · unverdicted · novelty 6.0

LLM agent societies develop power-law coordination cascades and intellectual elites through an integration bottleneck that grows with system size.

Decoupling Communication from Policy: Robust MARL under Bandwidth Constraints

cs.MA · 2026-05-20 · unverdicted · novelty 5.0

SLIM decouples inter-agent communication from policy execution in MARL via a dedicated pathway and a normalized bandwidth budget β, yielding robust performance under tight communication limits on standard benchmarks.

Quantifying the Utility of User Simulators for Building Collaborative LLM Assistants

cs.CL · 2026-05-10 · unverdicted · novelty 5.0

Fine-tuned simulators grounded in real human data produce LLM assistants that win more often against real users than those trained against role-playing simulators.

Cross-Modal Navigation with Multi-Agent Reinforcement Learning

cs.RO · 2026-05-07 · unverdicted · novelty 5.0

CRONA is a MARL framework that uses modality-specialized agents with auxiliary beliefs and a centralized multi-modal critic to achieve better performance and efficiency than single-agent baselines on visual-acoustic navigation tasks.

citing papers explorer

Beyond Partner Diversity: An Influence-Based Team Steering Framework for Zero-Shot Human-Machine Teaming · cs.AI · 2026-05-14 · unverdicted · none · ref 24

ADKO: Agentic Decentralized Knowledge Optimization · cs.LG · 2026-05-08 · unverdicted · none · ref 34

Interactive Inverse Reinforcement Learning of Interaction Scenarios via Bi-level Optimization · cs.LG · 2026-05-01 · unverdicted · none · ref 22

Do Agent Societies Develop Intellectual Elites? The Hidden Power Laws of Collective Cognition in LLM Multi-Agent Systems · cs.MA · 2026-04-03 · unverdicted · none · ref 38

Decoupling Communication from Policy: Robust MARL under Bandwidth Constraints · cs.MA · 2026-05-20 · unverdicted · none · ref 23

Quantifying the Utility of User Simulators for Building Collaborative LLM Assistants · cs.CL · 2026-05-10 · unverdicted · none · ref 49

Cross-Modal Navigation with Multi-Agent Reinforcement Learning · cs.RO · 2026-05-07 · unverdicted · none · ref 74
