Pith Number

pith:7Z62GUQF

pith:2026:7Z62GUQFPE4EE6LGC3POT2YUMR
not attested · not anchored · not stored · refs resolved

HexAGenT: Efficient Agentic LLM Serving via Workflow- and Heterogeneity-Aware Scheduling

Binhang Yuan, Chen Wang, Jiawei Jiang, Ke Zhou, Wenshuang Li, Xu Xu, Youhe Jiang, You Peng

HexAGenT schedules agentic LLM workflows on heterogeneous GPU clusters to cut the SLO scale needed for timely end-to-end completion.

arxiv:2605.16637 v1 · 2026-05-15 · cs.DC

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{7Z62GUQFPE4EE6LGC3POT2YUMR}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
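
The merge algorithm itself is not specified on this page. As a minimal sketch of what "deterministic" requires, assume hypothetical events that each carry an `id`, a `timestamp`, and a `patch` dict (none of these field names come from Pith). A mirror can then sort the events into a fixed total order before folding them, so that any arrival order yields the same state:

```python
import hashlib
import json


def merge_state(canonical_record, events):
    """Fold events into a state dict in a fixed total order.

    Hypothetical event shape: {"id": str, "timestamp": str, "patch": dict}.
    Events sharing an id are assumed identical, so each id is applied once.
    """
    unique = {event["id"]: event for event in events}
    # Sort by (timestamp, id) so ties on timestamp are still broken consistently.
    ordered = sorted(unique.values(), key=lambda e: (e["timestamp"], e["id"]))
    state = {"record": canonical_record, "fields": {}}
    for event in ordered:
        state["fields"].update(event["patch"])
    return state


def state_digest(state):
    """Hash the state with sorted keys so equal states give equal digests."""
    blob = json.dumps(state, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()
```

Two mirrors that receive the same events in different orders, or receive some of them twice, can compare `state_digest` values to confirm they reached the same state.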

Claims

C1 · strongest claim

Across representative agentic workloads and heterogeneous A100/H100/H200 clusters, HexAGenT reduces the SLO scale required for timely workflow completion by an average of 20.1% at 95% attainment and 33.0% at 99% attainment, with maximum reductions of 45.0% and 80.5%, respectively.

C2 · weakest assumption

The representative agentic workloads and the specific heterogeneous A100/H100/H200 cluster configurations used for evaluation are sufficiently similar to real production deployments that the reported reductions in required SLO scale will generalize.

C3 · one line summary

HexAGenT reduces the SLO scale required for timely agentic LLM workflow completion by an average of 20.1% at 95% attainment and 33.0% at 99% attainment on heterogeneous A100/H100/H200 clusters.

References

57 extracted · 57 resolved · 3 Pith anchors


Formal links

2 machine-checked theorem links

Receipt and verification
First computed: 2026-05-20T00:02:33.661896Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

fe7da35205793842796616dee9eb14647a7b849d3aeeba3fd5ac2dbca31611c1

Aliases

arxiv: 2605.16637 · arxiv_version: 2605.16637v1 · doi: 10.48550/arxiv.2605.16637 · pith_short_12: 7Z62GUQFPE4E · pith_short_16: 7Z62GUQFPE4EE6LG · pith_short_8: 7Z62GUQF
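
The identifiers on this page are consistent with a simple derivation: the 26-character Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash, and each `pith_short_N` alias matches its first N characters. This is an observation from the values shown here, not a documented specification, and the function name below is illustrative. A minimal sketch:

```python
import base64


def pith_number_from_hash(hex_digest, length=26):
    """Base32-encode a SHA-256 hex digest and keep the first `length` characters.

    Assumption: the truncation length of 26 is inferred from this page only.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


canonical_hash = "fe7da35205793842796616dee9eb14647a7b849d3aeeba3fd5ac2dbca31611c1"
full = pith_number_from_hash(canonical_hash)
# Short aliases are plain prefixes of the full identifier.
aliases = {f"pith_short_{n}": full[:n] for n in (8, 12, 16)}

print(full)  # 7Z62GUQFPE4EE6LGC3POT2YUMR
print(aliases)
```

If the relationship holds for other records, a client could cross-check a Pith Number against a recomputed canonical hash without an extra lookup.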
Agent API
Verify this Pith Number yourself
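# Fetch the record, extract canonical_record, re-serialize it with sorted keys
# and compact separators, then SHA-256 the UTF-8 bytes.
# -f makes curl exit non-zero on an HTTP error; -sS hides progress but keeps error messages.
curl -fsS -H 'Accept: application/ld+json' https://pith.science/pith/7Z62GUQFPE4EE6LGC3POT2YUMR \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: fe7da35205793842796616dee9eb14647a7b849d3aeeba3fd5ac2dbca31611c1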
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "77d375aa47b489130bc7f7a1d563f69c88050fe4ec8cbdc80b55a67c2779489a",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.DC",
    "submitted_at": "2026-05-15T21:09:34Z",
    "title_canon_sha256": "98083d893fbed3e51d3fd44140d85c745654be7d1d10774a8ce1d3b9a6baf02c"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.16637",
    "kind": "arxiv",
    "version": 1
  }
}
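
The same canonicalization used in the verification command above can be run offline against a saved copy of the record. The sketch below hard-codes the JSON shown on this page and the function name is illustrative. Whether the digest equals the published canonical hash depends on the API returning exactly this record, which has not been checked here, so no expected value is given:

```python
import hashlib
import json


def canonical_sha256(record):
    """Serialize with sorted keys and compact separators, then hash the UTF-8 bytes."""
    blob = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


record = {
    "metadata": {
        "abstract_canon_sha256": "77d375aa47b489130bc7f7a1d563f69c88050fe4ec8cbdc80b55a67c2779489a",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.DC",
        "submitted_at": "2026-05-15T21:09:34Z",
        "title_canon_sha256": "98083d893fbed3e51d3fd44140d85c745654be7d1d10774a8ce1d3b9a6baf02c",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.16637", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)
```

Because keys are sorted before hashing, the digest does not depend on the order in which fields appear in the saved file.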