The surprising effectiveness of PPO in cooperative multi-agent games

· 2022

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

One Policy, Infinite NPCs: Persona-Traceable Shared RL Policies for Scalable Game Agents

cs.AI · 2026-05-22 · unverdicted · novelty 6.0

pcsp is a shared RL policy using LLM persona embeddings, low-rank projection, and PPO+InfoNCE+KL training that delivers 17x above-chance zero-shot persona identification and 22x faster inference on a 300-persona benchmark.

Safe and Policy-Compliant Multi-Agent Orchestration for Enterprise AI

cs.AI · 2026-04-19 · unverdicted · novelty 5.0

CAMCO enforces policy constraints on multi-agent AI at deployment time via convex projection, risk-weighted Lagrangian shaping, and bounded-convergence negotiation, yielding zero violations and 92-97% utility in tested enterprise scenarios.

citing papers explorer

One Policy, Infinite NPCs: Persona-Traceable Shared RL Policies for Scalable Game Agents

cs.AI · 2026-05-22 · unverdicted · none · ref 26

Safe and Policy-Compliant Multi-Agent Orchestration for Enterprise AI

cs.AI · 2026-04-19 · unverdicted · none · ref 11
