Pith Number

pith:ZWDREX3O

pith:2026:ZWDREX3OKTCODS7LHJHCN6SCJ3
not attested · not anchored · not stored · refs resolved

Reliability and Effectiveness of Autonomous AI Agents in Supply Chain Management

Andre P. Calmon, Carol Xuan Long, David Simchi-Levi, Feng Zhu, Flavio P. Calmon, Huangyuan Su

Autonomous AI agents with optimized reasoning models outperform human teams in supply chain management, reducing costs by up to 67 percent, but require post-training to control decision unreliability.

arxiv:2605.17036 v1 · 2026-05-16 · cs.AI · cs.LG · cs.MA · cs.SY · eess.SY

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{ZWDREX3OKTCODS7LHJHCN6SCJ3}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
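
The record does not describe the merge algorithm itself, so the following is only a minimal sketch of what "deterministic merge" generally means: any mirror that holds the same set of events computes the same state, regardless of the order in which the events arrived. The event fields (`id`, `ts`, `field`, `value`), the last-writer-wins rule, and the sample events are all hypothetical and are not taken from the Pith bundle format.

```python
from typing import Any


def merge_events(events: list[dict[str, Any]]) -> dict[str, Any]:
    """Fold events into a state dict, independent of input order.

    Assumption: two events with the same id have identical content.
    """
    # Deduplicate by event id, so a mirror that received an event twice
    # agrees with a mirror that received it once.
    unique = {event["id"]: event for event in events}
    # Impose a total order: timestamp first, event id as the tie-breaker.
    ordered = sorted(unique.values(), key=lambda e: (e["ts"], e["id"]))
    state: dict[str, Any] = {}
    for event in ordered:
        # Hypothetical rule: the latest event for a field wins.
        state[event["field"]] = event["value"]
    return state


# Hypothetical events, for illustration only.
events = [
    {"id": "e1", "ts": "2030-01-01T00:00:00Z", "field": "author_claim", "value": "open"},
    {"id": "e2", "ts": "2030-01-03T00:00:00Z", "field": "author_claim", "value": "claimed"},
    {"id": "e3", "ts": "2030-01-02T00:00:00Z", "field": "citations", "value": 1},
]

merged = merge_events(events)
merged_reversed = merge_events(list(reversed(events)))
print(merged == merged_reversed)
```

Sorting on a total order before folding is what removes the dependence on arrival order; the actual Pith algorithm may use a different ordering key or conflict rule.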

Claims

C1 · strongest claim

Model capability is the dominant factor: an out-of-the-box reasoning model exceeds human-level performance, and optimized reasoning models reduce costs by up to 67% relative to human teams. GRPO post-training substantially reduces tail events, curtails agent bullwhip, and improves the reliability of autonomous supply-chain agents.
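
For readers unfamiliar with GRPO (Group Relative Policy Optimization), the sketch below shows only its group-normalized advantage: several rollouts are sampled for the same prompt, and each reward is scored relative to the group mean and standard deviation. This record does not give the paper's reward definition or training setup, so the use of negative episode cost as the reward, the sample costs, and the choice of population standard deviation are assumptions made for illustration.

```python
import statistics


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against its own sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # assumption: population std
    # eps keeps the division defined when every rollout has the same reward.
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical total costs from four rollouts of the same scenario.
episode_costs = [120.0, 95.0, 300.0, 110.0]
# Assumption: reward is negative cost, so cheaper episodes score higher.
rewards = [-cost for cost in episode_costs]
advantages = group_relative_advantages(rewards)

for cost, adv in zip(episode_costs, advantages):
    print(f"cost={cost:6.1f}  advantage={adv:+.3f}")
```

In this toy group the high-cost rollout receives the most negative advantage, which is how tail events get penalized relative to the other samples in the group.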

C2 · weakest assumption

The MIT Beer Game simulation with its ordering rules and information delays sufficiently captures the coordination dynamics of real multi-echelon supply chains (the entire experimental and theoretical framework is constructed on this testbed).

C3 · one line summary

Autonomous AI agents outperform humans in supply chain simulations but exhibit an inherent agent bullwhip effect of amplified decision unreliability, mitigated by GRPO reinforcement learning post-training.
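
As background for the term "bullwhip", the sketch below computes the classical bullwhip ratio: the variance of the orders a stage places divided by the variance of the demand it faces, where a ratio above 1 indicates amplification. This is the standard supply-chain measure, not necessarily the metric the paper uses for its "agent bullwhip", and both series are made-up numbers.

```python
import statistics


def bullwhip_ratio(orders: list[float], demand: list[float]) -> float:
    """Return Var(orders) / Var(demand) using population variances."""
    demand_var = statistics.pvariance(demand)
    if demand_var == 0:
        raise ValueError("demand is constant, so the ratio is undefined")
    return statistics.pvariance(orders) / demand_var


# Hypothetical weekly series: demand steps up once, orders overshoot and settle.
demand = [4.0, 4.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0]
orders = [4.0, 4.0, 12.0, 10.0, 6.0, 8.0, 8.0, 8.0]

ratio = bullwhip_ratio(orders, demand)
print(f"bullwhip ratio = {ratio:.2f}")
```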


Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-20T00:03:37.101429Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

cd87125f6e54c4e1cbeb3a4e26fa424ec408c40096bf3bd2486d3afcba3cf7fd

Aliases

arxiv: 2605.17036 · arxiv_version: 2605.17036v1 · doi: 10.48550/arxiv.2605.17036 · pith_short_12: ZWDREX3OKTCO · pith_short_16: ZWDREX3OKTCODS7L · pith_short_8: ZWDREX3O
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/ZWDREX3OKTCODS7LHJHCN6SCJ3 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: cd87125f6e54c4e1cbeb3a4e26fa424ec408c40096bf3bd2486d3afcba3cf7fd
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "6f28bfdd88d2fc0f3a3dee7e4b6715b21c75c6554a7758d40bb370656070c6be",
    "cross_cats_sorted": [
      "cs.LG",
      "cs.MA",
      "cs.SY",
      "eess.SY"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-05-16T15:11:35Z",
    "title_canon_sha256": "1eb7a4f4d7f8c595979ef1accca31a12fea55af2dce6d104b8957465a2b09f9a"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.17036",
    "kind": "arxiv",
    "version": 1
  }
}
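
The same check can be run offline against the record shown above. This sketch mirrors the serialization in the shell command (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes) and compares the digest with the published canonical hash. The function name `canonical_sha256` is a placeholder, and whether the digest matches depends on the record being copied exactly as served.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """SHA-256 of the record serialized with sorted keys and no whitespace."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode()  # UTF-8 by default
    return hashlib.sha256(payload).hexdigest()


# Canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "6f28bfdd88d2fc0f3a3dee7e4b6715b21c75c6554a7758d40bb370656070c6be",
        "cross_cats_sorted": ["cs.LG", "cs.MA", "cs.SY", "eess.SY"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2026-05-16T15:11:35Z",
        "title_canon_sha256": "1eb7a4f4d7f8c595979ef1accca31a12fea55af2dce6d104b8957465a2b09f9a",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.17036", "kind": "arxiv", "version": 1},
}

EXPECTED = "cd87125f6e54c4e1cbeb3a4e26fa424ec408c40096bf3bd2486d3afcba3cf7fd"
digest = canonical_sha256(record)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because keys are sorted before hashing, the order in which the JSON fields are written does not affect the digest; list order (for example in `cross_cats_sorted`) still does.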