Pith Number

pith:YI6ODTLD

pith:2026:YI6ODTLDHTCQ2NWXXRNRXKPDI2

not attested · not anchored · not stored · refs resolved

DelAC: A Multi-agent Reinforcement Learning of Team-Symmetric Stochastic Games

Duan-Shin Lee, Yu-Hsiu Hung

Team-symmetric stochastic games always have a team-symmetric Nash equilibrium that a new actor-critic algorithm can locate efficiently.

arxiv:2605.12555 v1 · 2026-05-11 · cs.MA · cs.GT


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{YI6ODTLDHTCQ2NWXXRNRXKPDI2}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
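As a rough illustration of why a deterministic merge lets every mirror reach the same state, the sketch below folds a set of events in a canonical order so the result does not depend on the order in which they arrive. The event fields (`id`, `ts`, `field`, `value`), the ordering key, and the last-writer-wins rule are all hypothetical; the actual Pith event schema and merge rules are not given on this page.

```python
from typing import Any


def merge_events(events: list[dict[str, Any]]) -> dict[str, Any]:
    """Fold events into a state, independent of the input order."""
    # Hypothetical canonical order: timestamp first, event id as tie-breaker.
    ordered = sorted(events, key=lambda e: (e["ts"], e["id"]))
    state: dict[str, Any] = {}
    seen: set[str] = set()
    for event in ordered:
        if event["id"] in seen:
            # The same event fetched from two mirrors is applied only once.
            continue
        seen.add(event["id"])
        # Last writer in canonical order wins (an assumed rule).
        state[event["field"]] = event["value"]
    return state


# Toy events, not taken from any real bundle.
events = [
    {"id": "e2", "ts": "2000-01-02T00:00:00Z", "field": "citations", "value": "open"},
    {"id": "e1", "ts": "2000-01-01T00:00:00Z", "field": "bundle", "value": "live"},
    {"id": "e1", "ts": "2000-01-01T00:00:00Z", "field": "bundle", "value": "live"},
]

state_a = merge_events(events)
state_b = merge_events(list(reversed(events)))
print(state_a == state_b)
```

Sorting before folding is what makes the result reproducible: two mirrors holding the same set of events in different orders compute identical states.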

Claims

C1 strongest claim

We show that team-symmetric games always have a team-symmetric Nash equilibrium. We develop and solve a linear complementarity problem of team-symmetric Nash equilibria. ... this multi-agent reinforcement learning algorithm performs much better than many existing algorithms.

C2 weakest assumption

The paper assumes that players within a team have perfectly symmetric identities and identical payoff functions in the target applications, and that the simulation results generalize beyond the tested environments.

C3 one-line summary

Team-symmetric games always have team-symmetric Nash equilibria solvable via linear complementarity problems, and the DelAC actor-critic MARL algorithm outperforms existing methods in simulations.
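For orientation, the claims refer to a linear complementarity problem (LCP). The standard textbook form is given below. The symbols $M$, $q$, $z$, and $w$ are generic placeholders; the specific matrices the paper derives for team-symmetric Nash equilibria are not part of this record and may be arranged differently.

```latex
% Standard LCP(q, M): given M \in \mathbb{R}^{n \times n} and q \in \mathbb{R}^{n},
% find z, w \in \mathbb{R}^{n} such that
\begin{align}
  w &= M z + q, \\
  z &\ge 0, \quad w \ge 0, \\
  z^{\top} w &= 0 .
\end{align}
```

The last condition forces $z_i w_i = 0$ for every index $i$, so in each coordinate at least one of the two variables is zero.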

References

35 extracted · 35 resolved · 3 Pith anchors

[1] S. V. Albrecht, F. Christianos, and L. Schäfer, Multi-Agent Reinforcement Learning: Foundations and Modern Approaches. Cambridge, Massachusetts: The MIT Press, 2024

[2] If multi-agent learning is the answer, what is the question? 2007

[3] Nash Q-learning for general-sum stochastic games, 2003

[4] The complexity of computing a Nash equilibrium, 2006

[5] Settling the complexity of two-player Nash equilibrium, 2006

Receipt and verification

First computed	2026-05-18T03:10:02.069195Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

c23ce1cd633cc50d36d7bc5b1ba9e34696a3888045a5da3051dd49eb88143e08

Aliases

arxiv: 2605.12555 · arxiv_version: 2605.12555v1 · doi: 10.48550/arxiv.2605.12555 · pith_short_12: YI6ODTLDHTCQ · pith_short_16: YI6ODTLDHTCQ2NWX · pith_short_8: YI6ODTLD
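For this record, the identifier and its short aliases can be reproduced from the canonical hash: Base32-encoding the 32 hash bytes (RFC 4648 alphabet) and keeping the first 26 characters yields the Pith Number, and the short forms are its 8-, 12-, and 16-character prefixes. This rule is inferred from the values shown on this page and is not stated by the source, so treat it as an observation rather than a specification.

```python
import base64

# Canonical hash as shown on this record.
canonical_hash = "c23ce1cd633cc50d36d7bc5b1ba9e34696a3888045a5da3051dd49eb88143e08"

# Base32-encode the raw hash bytes (standard RFC 4648 alphabet, upper case).
encoded = base64.b32encode(bytes.fromhex(canonical_hash)).decode("ascii")

# Assumption inferred from this record: the identifier is the first 26 characters.
pith_id = encoded[:26]

# The short aliases are plain prefixes of the identifier.
aliases = {f"pith_short_{n}": pith_id[:n] for n in (8, 12, 16)}

print(pith_id)  # YI6ODTLDHTCQ2NWXXRNRXKPDI2
for name, value in aliases.items():
    print(name, value)
```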

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/YI6ODTLDHTCQ2NWXXRNRXKPDI2 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: c23ce1cd633cc50d36d7bc5b1ba9e34696a3888045a5da3051dd49eb88143e08

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "02c6b6054327cdbfc335de78e18242affb2f67baa6fed6c151a1b154a8a1c57b",
    "cross_cats_sorted": [
      "cs.GT"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.MA",
    "submitted_at": "2026-05-11T12:00:27Z",
    "title_canon_sha256": "c4586bc3129a7d398f8d3cc4794cd96fd5395be9c969d76d4e91a4e4aafabf41"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.12555",
    "kind": "arxiv",
    "version": 1
  }
}
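The same check can be run offline against the canonical record shown above. The sketch below applies the canonicalisation used in the shell one-liner (sorted keys, compact separators, UTF-8 bytes) and prints the SHA-256 digest. The helper name `canonical_sha256` is illustrative. The digest should equal the canonical hash only if the JSON printed on this page is the complete canonical record, which is assumed here.

```python
import hashlib
import json

# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "02c6b6054327cdbfc335de78e18242affb2f67baa6fed6c151a1b154a8a1c57b",
        "cross_cats_sorted": ["cs.GT"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.MA",
        "submitted_at": "2026-05-11T12:00:27Z",
        "title_canon_sha256": "c4586bc3129a7d398f8d3cc4794cd96fd5395be9c969d76d4e91a4e4aafabf41",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.12555", "kind": "arxiv", "version": 1},
}


def canonical_sha256(obj: dict) -> str:
    """Serialise with sorted keys and no whitespace, then hash the UTF-8 bytes."""
    payload = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


digest = canonical_sha256(record)
print(digest)  # compare with the canonical hash listed on this page
```

Because keys are sorted before hashing, the digest does not depend on the order in which the fields appear in the source JSON.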