pith:CYBVPCPJ
EnergyAgentBench: Benchmarking LLM Agents on Live Energy Infrastructure Data
EnergyAgentBench tests LLM agents on live electricity market data to select optimal sites for AI datacenters.
arxiv:2605.15230 v1 · 2026-05-13 · econ.EM
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{CYBVPCPJRDCPAVFG3ZM7GGZOVH}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
We introduce EnergyAgentBench, the first agentic benchmark grounded in live electricity market data for this problem class. Claude Sonnet 4.6 achieves the highest overall score (0.900) at one-quarter the cost of Claude Opus 4.7 (0.889).
Ground truth derived from trained XGBoost cost-surface models (R^2 = 0.967 to 0.995) and the NREL Annual Technology Baseline 2024 accurately captures real-world cost-carbon dynamics and causal grid relationships for the 70 task variants.
EnergyAgentBench is a new benchmark with 70 task variants that evaluates LLM agents on live energy data for datacenter siting, long-horizon optimization, and causal grid diagnosis.
Receipt and verification
| First computed | 2026-05-20T00:00:47.440632Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
16035789e988c4f054a6de59f31b2ea9f4b7fa3168ba6e158d6f063c50e3eb71
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/CYBVPCPJRDCPAVFG3ZM7GGZOVH \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 16035789e988c4f054a6de59f31b2ea9f4b7fa3168ba6e158d6f063c50e3eb71
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "13e7432fe5cd21f95273672411efd36aa0f31af80c278fac221e3ad739eb5a31",
"cross_cats_sorted": [],
"license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
"primary_cat": "econ.EM",
"submitted_at": "2026-05-13T18:03:51Z",
"title_canon_sha256": "79d6ebde1d678f7be08f43296983cb94b01f83093fbcec8f514f84f03acc81d7"
},
"schema_version": "1.0",
"source": {
"id": "2605.15230",
"kind": "arxiv",
"version": 1
}
}
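The verification one-liner above can also be run offline against the canonical record shown here. The following is a minimal sketch, assuming the page's recipe (keys sorted, compact separators, UTF-8, SHA-256) is the complete canonicalization; the function names `canonical_bytes` and `canonical_hash` are illustrative and not part of any Pith tooling.

```python
import hashlib
import json

# Canonical record copied from the "Canonical record JSON" section.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "13e7432fe5cd21f95273672411efd36aa0f31af80c278fac221e3ad739eb5a31",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
        "primary_cat": "econ.EM",
        "submitted_at": "2026-05-13T18:03:51Z",
        "title_canon_sha256": "79d6ebde1d678f7be08f43296983cb94b01f83093fbcec8f514f84f03acc81d7",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15230", "kind": "arxiv", "version": 1},
}

# Hash published under "Canonical hash" on this page.
EXPECTED = "16035789e988c4f054a6de59f31b2ea9f4b7fa3168ba6e158d6f063c50e3eb71"


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no extra whitespace, as in the shell recipe."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """Return the hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


if __name__ == "__main__":
    digest = canonical_hash(RECORD)
    print(digest)
    # Compare against the published hash rather than assuming a match.
    print("match" if digest == EXPECTED else "mismatch")
```

Because the keys are sorted before hashing, the order in which the record is typed or stored does not affect the digest; only the values and the serialization settings do.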