Pith Number

pith:J2QAIDUV

pith:2026:J2QAIDUV7JSXVG22VIX4DVT76T
not attested · not anchored · not stored · refs pending

SAGA: Workflow-Atomic Scheduling for AI Agent Inference on GPU Clusters

Dongxin Guo, Jikun Wu, Siu Ming Yiu

Scheduling AI agent workflows as single units reduces task completion time by 1.64x on GPU clusters

arxiv:2605.00528 v2 · 2026-05-01 · cs.DC · cs.AI · cs.LG · cs.OS

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{J2QAIDUV7JSXVG22VIX4DVT76T}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle: live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

On a 64-GPU cluster serving SWE-bench coding agents and WebArena browser tasks, SAGA reduces task completion time by 1.64x (geometric mean, p < 0.001) over vLLM v0.15.1 with prefix caching and affinity routing, while improving GPU memory utilization by 1.22x and achieving 99.2% SLO attainment under multi-tenant interference.
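
For readers unfamiliar with how a geometric-mean speedup such as the 1.64x above is typically computed, the following is a minimal sketch. The timings are made-up illustrative numbers, not data from the paper, and the variable names are hypothetical.

```python
from statistics import geometric_mean

# Hypothetical per-task completion times in seconds (NOT from the paper).
baseline_seconds = [120.0, 80.0, 200.0]
candidate_seconds = [60.0, 80.0, 50.0]

# Per-task speedup is baseline time divided by candidate time.
speedups = [b / c for b, c in zip(baseline_seconds, candidate_seconds)]

# The geometric mean is the n-th root of the product of the ratios, so one
# very large speedup cannot dominate the summary the way it would in an
# arithmetic mean.
summary = geometric_mean(speedups)
print(f"per-task speedups: {speedups}")
print(f"geometric-mean speedup: {summary:.2f}x")
```

With these illustrative ratios (2x, 1x, 4x) the arithmetic mean would be about 2.33x, while the geometric mean is about 2x.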

C2 · weakest assumption

That real-world agent workflows possess enough predictable structure for Agent Execution Graphs to accurately forecast KV cache reuse across tool-call boundaries, and that the dominant use case is latency-sensitive interactive work where a 30% throughput reduction is acceptable.

C3 · one line summary

SAGA reduces AI agent task completion time by 1.64x on 64-GPU clusters by scheduling at the full workflow level with execution graphs, affinity batching, and completion-time fairness.

Receipt and verification
First computed: 2026-06-23T01:12:07.813331Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

4ea0040e95fa657a9b5aaa2fc1d67ff4c6b0255c65fad3a8c048cc30ad808bbf

Aliases

arxiv: 2605.00528 · arxiv_version: 2605.00528v2 · doi: 10.48550/arxiv.2605.00528 · pith_short_12: J2QAIDUV7JSX · pith_short_16: J2QAIDUV7JSXVG22 · pith_short_8: J2QAIDUV
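
The short aliases above are prefixes of the full Pith Number, and the values on this page are also consistent with the full number being the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. The page does not state this rule, so the sketch below treats it as an observation about this one record, not a documented specification; the helper name `base32_prefix` is hypothetical.

```python
import base64

# Values copied from this page.
CANONICAL_HASH = "4ea0040e95fa657a9b5aaa2fc1d67ff4c6b0255c65fad3a8c048cc30ad808bbf"
PITH_NUMBER = "J2QAIDUV7JSXVG22VIX4DVT76T"


def base32_prefix(hex_digest: str, length: int) -> str:
    """Base32-encode a hex digest and keep the first `length` characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


# Assumption being checked: the number is a 26-character base32 prefix.
derived = base32_prefix(CANONICAL_HASH, len(PITH_NUMBER))
print("derived from hash:", derived)
print("matches page:", derived == PITH_NUMBER)

# The pith_short_N aliases are the first N characters of the full number.
aliases = {n: PITH_NUMBER[:n] for n in (8, 12, 16)}
print(aliases)
```
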
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/J2QAIDUV7JSXVG22VIX4DVT76T \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 4ea0040e95fa657a9b5aaa2fc1d67ff4c6b0255c65fad3a8c048cc30ad808bbf
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "4ee36fde351ce84edcf098394946a2a402b13395be3b520a29cca5a99fa7ae6b",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.LG",
      "cs.OS"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.DC",
    "submitted_at": "2026-05-01T09:05:28Z",
    "title_canon_sha256": "29c1169a251f1f9b1d2f4ce3ac03649d8a0ad6b41c66274100ccb236ea15aa1b"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.00528",
    "kind": "arxiv",
    "version": 2
  }
}
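
The same check can be run offline against the record shown above, without calling the API. This is a minimal sketch that mirrors the canonicalization in the shell one-liner (sorted keys, compact separators, UTF-8, SHA-256); the function name `canonical_hash` is hypothetical, and the sketch assumes the JSON displayed here is exactly the record the service hashes.

```python
import hashlib
import json

# Canonical record copied from this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "4ee36fde351ce84edcf098394946a2a402b13395be3b520a29cca5a99fa7ae6b",
        "cross_cats_sorted": ["cs.AI", "cs.LG", "cs.OS"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.DC",
        "submitted_at": "2026-05-01T09:05:28Z",
        "title_canon_sha256": "29c1169a251f1f9b1d2f4ce3ac03649d8a0ad6b41c66274100ccb236ea15aa1b",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.00528", "kind": "arxiv", "version": 2},
}

# Digest the page says to expect.
EXPECTED = "4ea0040e95fa657a9b5aaa2fc1d67ff4c6b0255c65fad3a8c048cc30ad808bbf"


def canonical_hash(record: dict) -> str:
    """SHA-256 of the record serialized as compact, key-sorted JSON."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


digest = canonical_hash(RECORD)
print(digest)
print("matches expected:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror happens to store the fields.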