Pith Number

pith:CYBVPCPJ

pith:2026:CYBVPCPJRDCPAVFG3ZM7GGZOVH
not attested · not anchored · not stored · refs resolved

EnergyAgentBench: Benchmarking LLM Agents on Live Energy Infrastructure Data

Eliseo Curcio

EnergyAgentBench tests LLM agents on live electricity market data to select optimal sites for AI datacenters.

arxiv:2605.15230 v1 · 2026-05-13 · econ.EM

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{CYBVPCPJRDCPAVFG3ZM7GGZOVH}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

We introduce EnergyAgentBench, the first agentic benchmark grounded in live electricity market data for this problem class. Claude Sonnet 4.6 achieves the highest overall score (0.900) at one-quarter the cost of Claude Opus 4.7 (0.889).

C2 · weakest assumption

The ground truth, derived from trained XGBoost cost-surface models (R^2 between 0.967 and 0.995) and the NREL Annual Technology Baseline 2024, accurately captures real-world cost-carbon dynamics and causal grid relationships for the 70 task variants.

C3 · one-line summary

EnergyAgentBench is a new benchmark with 70 task variants that evaluates LLM agents on live energy data for datacenter siting, long-horizon optimization, and causal grid diagnosis.

References

37 extracted · 37 resolved · 13 Pith anchors

[1] International Energy Agency. Energy and AI. IEA, Paris, 2025. Available: https://www.iea.org/reports/energy-and-ai
[2] Energy and AI: Executive Summary. 2025.
[3] Data Centre Electricity Use Surged in 2025. 2025.
[4] How Much Electricity Does a Data Center Use? Complete 2025 Analysis. 2025.
[5] Curcio, Eliseo. Risk-Aware AI-Driven Design Optimization of Grid-Connected Hydrogen Systems Under Stochastic Operating Conditions. March 23, 2026.

Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-20T00:00:47.440632Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

16035789e988c4f054a6de59f31b2ea9f4b7fa3168ba6e158d6f063c50e3eb71

Aliases

arxiv: 2605.15230 · arxiv_version: 2605.15230v1 · doi: 10.48550/arxiv.2605.15230 · pith_short_12: CYBVPCPJRDCP · pith_short_16: CYBVPCPJRDCPAVFG · pith_short_8: CYBVPCPJ
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/CYBVPCPJRDCPAVFG3ZM7GGZOVH \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 16035789e988c4f054a6de59f31b2ea9f4b7fa3168ba6e158d6f063c50e3eb71
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "13e7432fe5cd21f95273672411efd36aa0f31af80c278fac221e3ad739eb5a31",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
    "primary_cat": "econ.EM",
    "submitted_at": "2026-05-13T18:03:51Z",
    "title_canon_sha256": "79d6ebde1d678f7be08f43296983cb94b01f83093fbcec8f514f84f03acc81d7"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15230",
    "kind": "arxiv",
    "version": 1
  }
}
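
The verification one-liner above can also be reproduced offline. The following is a minimal Python sketch, assuming the canonical hash is the SHA-256 of the record serialized with sorted keys and compact separators, exactly as the shell command does. The function names (canonical_bytes, canonical_hash, verify) are illustrative and not part of any Pith API; the record and expected hash are copied from this page.

```python
import hashlib
import json

# Canonical record copied verbatim from the "Canonical record JSON" section.
CANONICAL_RECORD = {
    "metadata": {
        "abstract_canon_sha256": "13e7432fe5cd21f95273672411efd36aa0f31af80c278fac221e3ad739eb5a31",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
        "primary_cat": "econ.EM",
        "submitted_at": "2026-05-13T18:03:51Z",
        "title_canon_sha256": "79d6ebde1d678f7be08f43296983cb94b01f83093fbcec8f514f84f03acc81d7",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15230", "kind": "arxiv", "version": 1},
}

# Hash listed under "Canonical hash" on this page.
EXPECTED_HASH = "16035789e988c4f054a6de59f31b2ea9f4b7fa3168ba6e158d6f063c50e3eb71"


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no whitespace, mirroring the shell one-liner."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode()  # UTF-8 by default


def canonical_hash(record: dict) -> str:
    """Return the hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


def verify(record: dict, expected: str = EXPECTED_HASH) -> bool:
    """Hypothetical helper: compare the recomputed digest with the published one."""
    return canonical_hash(record) == expected


if __name__ == "__main__":
    print(canonical_hash(CANONICAL_RECORD))
    print("match" if verify(CANONICAL_RECORD) else "mismatch")
```

Because the keys are sorted before hashing, the digest does not depend on the order in which a mirror stores the fields; any change to a value produces a different digest.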