arXiv: 2606.25098 · v1 · pith:QTDBPT5Jnew · submitted 2026-06-23 · cs.DC · cs.AI · cs.PF · cs.SY · eess.SY

Power-Flexible AI Data Centers: A New Paradigm for Grid-Responsive Compute

Pith reviewed 2026-06-25 21:51 UTC · model grok-4.3

classification: cs.DC · cs.AI · cs.PF · cs.SY · eess.SY
keywords: power flexibility · AI data centers · grid-responsive compute · workload orchestration · demand response · GPU cluster · carbon-aware operation

The pith

AI data centers can act as flexible grid resources through software-controlled power adjustments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that GPU-based AI data centers can respond dynamically to grid conditions by orchestrating workloads to reduce or shift power consumption. This approach addresses the challenge of rising electricity demand from AI infrastructure, which traditional planning treats as inflexible. By combining grid signals with scheduling and telemetry, the system achieves rapid load reductions, sustained curtailment, and carbon-aware operation. Real-world tests on a 130 kW cluster confirm these capabilities while maintaining service for priority jobs and enabling load shifting across sites.

Core claim

Modern GPU-based AI data centers can operate as grid-interactive assets that respond dynamically to power system conditions. An architecture integrating grid signals, workload scheduling, and power telemetry enables fine-grained cluster power control. Experimental results from a real-world deployment on a 130 kW GPU cluster demonstrate rapid load reduction, sustained curtailment, and carbon-aware operation while preserving service levels for priority jobs, along with performance-aware load shifting across geographically distributed clusters.

What carries the argument

The architecture that integrates grid signals, workload scheduling, and power telemetry for fine-grained cluster power control.
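
One way to picture this integration is as a single control step: a grid signal sets a cluster power budget, and the scheduler divides that budget among jobs by priority. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation. The `Job` fields, the greedy two-pass policy, the job names, and the 25% curtailment figure are all hypothetical; only the 130 kW cluster size comes from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int   # higher means more important (assumed convention)
    min_kw: float   # hypothetical floor below which the job is paused instead
    max_kw: float   # hypothetical draw when the job runs unthrottled


def allocate_power(jobs, budget_kw):
    """Greedy, priority-ordered split of a cluster power budget (illustrative)."""
    caps = {job.name: 0.0 for job in jobs}
    remaining = budget_kw
    ordered = sorted(jobs, key=lambda j: -j.priority)

    # Pass 1: grant each job its floor in priority order; jobs that do not fit stay paused.
    admitted = []
    for job in ordered:
        if job.min_kw <= remaining:
            caps[job.name] = job.min_kw
            remaining -= job.min_kw
            admitted.append(job)

    # Pass 2: hand leftover headroom to admitted jobs, highest priority first.
    for job in admitted:
        extra = min(job.max_kw - job.min_kw, remaining)
        caps[job.name] += extra
        remaining -= extra
    return caps


# Hypothetical workload mix on a 130 kW cluster.
jobs = [
    Job("inference-prod", priority=3, min_kw=40.0, max_kw=50.0),
    Job("fine-tune", priority=2, min_kw=20.0, max_kw=45.0),
    Job("batch-eval", priority=1, min_kw=15.0, max_kw=35.0),
]

# Assumed grid request: curtail to 75% of 130 kW.
caps = allocate_power(jobs, budget_kw=97.5)
print(caps)
```

In this toy policy the highest-priority job keeps its full power while lower-priority jobs absorb the curtailment, which mirrors the service-level preservation the paper claims. A real system would also need telemetry feedback to confirm that measured power actually tracks the caps.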

Load-bearing premise

That software-based workload orchestration on GPU clusters can deliver fine-grained and rapid power adjustments while reliably preserving service levels for priority jobs under real grid conditions.

What would settle it

Observing a grid event where the cluster either cannot reduce load within the required time or where priority job performance degrades below acceptable levels.
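
The falsification test above can be written as a simple pass/fail check on a logged event. This is a hedged sketch: the function names, the sample format of `(seconds, kW)` pairs, the deadline, and the 95% service threshold are assumptions chosen for illustration, and none of the numbers are results from the paper.

```python
def first_time_at_or_below(samples, target_kw, event_start_s):
    """Seconds from event start until power first reaches the target, or None."""
    for t, kw in samples:
        if t >= event_start_s and kw <= target_kw:
            return t - event_start_s
    return None


def event_passed(samples, target_kw, event_start_s, deadline_s,
                 priority_throughput, baseline_throughput, min_ratio):
    """True only if the load dropped in time AND priority jobs kept their service level."""
    response = first_time_at_or_below(samples, target_kw, event_start_s)
    met_deadline = response is not None and response <= deadline_s
    kept_service = priority_throughput >= min_ratio * baseline_throughput
    return met_deadline and kept_service


# Made-up telemetry: (time in seconds, cluster power in kW).
samples = [(0, 130.0), (10, 128.0), (20, 110.0), (30, 96.0), (40, 95.0)]

ok = event_passed(
    samples,
    target_kw=97.5,            # assumed curtailment target
    event_start_s=10,          # assumed time the grid signal arrived
    deadline_s=30,             # assumed required response time
    priority_throughput=980,   # made-up priority-job throughput during the event
    baseline_throughput=1000,  # made-up throughput before the event
    min_ratio=0.95,            # assumed acceptable service level
)
print(ok)
```

Either failure mode named in the text (missing the deadline or degrading priority jobs) makes the check return `False`.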

Figures

Figures reproduced from arXiv: 2606.25098 by Andy Neale, Ayse Coskun, Brandon Records, Chris Williams, Ciaran Roberts, Daniel Wilson, Ethan Levine, Ethan Tiao, Frank Sharp, Harry Petty, Luke Wainwright, Nikhil Shirolkar, Philip Colangelo, Sarah Soares, Scott Underwood, Scott Wallace, Shayan Sengupta, Varun Sivaram.

Figure 1: Architecture of the grid-aware workload orchestration platform used in this study (Emerald ...
Figure 2: AI cluster power response timed to offset a TV pickup (“tea kettle”) demand spike, overlaid ...
Figure 3: AI cluster responds to a historical replay of a 2019 UK grid contingency associated with ...
Figure 4: Workload performance for 10-hour experiment, showing preservation of performance
Figure 5: AI cluster power response to live grid events submitted by National Grid and EPRI, ...
Figure 6: Carbon-aware load following and power tracking across grid conditions.
Figure 7: Rack power consumption and tokens/second throughput in the Chicago and Ashburn ...

Original abstract

The rapid expansion of artificial intelligence (AI) infrastructure is driving unprecedented growth in electricity demand from data centers. Traditional power-system planning treats large computing facilities as inflexible peak loads, leading to costly infrastructure upgrades and long delays in grid interconnection. Recent work has shown that AI clusters can reduce electricity consumption during peak demand through software-based workload orchestration. This article explores how modern GPU-based AI data centers can operate as grid-interactive assets that respond dynamically to power system conditions. We describe an architecture integrating grid signals, workload scheduling, and power telemetry for fine-grained cluster power control. Experimental results from a real-world deployment on a 130 kW GPU cluster demonstrate multiple forms of flexibility, including rapid load reduction, sustained curtailment, and carbon-aware operation while preserving service levels for priority jobs. We further demonstrate performance-aware load shifting across geographically distributed clusters, enabling workloads to migrate toward regions with lower grid stress. Together, these capabilities transform AI infrastructure from static electricity consumers into flexible resources that support grid reliability, accelerate interconnection, and improve computing sustainability.
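
The load-shifting idea in the abstract can be illustrated with a toy placement rule that fills the least-stressed site first. This is a sketch under assumptions, not the authors' method: the `stress` score, the per-site `capacity_tps` values, and the demand figure are hypothetical. The site names follow the Chicago and Ashburn clusters mentioned in the Figure 7 caption, and tokens/second is the throughput unit used there.

```python
def shift_load(sites, demand_tps):
    """Assign demand to sites in order of increasing grid stress (illustrative)."""
    plan = {}
    remaining = demand_tps
    for name, info in sorted(sites.items(), key=lambda kv: kv[1]["stress"]):
        assigned = min(info["capacity_tps"], remaining)
        plan[name] = assigned
        remaining -= assigned
    if remaining > 0:
        raise ValueError("demand exceeds total capacity across sites")
    return plan


# Hypothetical grid stress scores (0 = relaxed, 1 = stressed) and capacities.
sites = {
    "chicago": {"stress": 0.8, "capacity_tps": 6000},
    "ashburn": {"stress": 0.3, "capacity_tps": 5000},
}

plan = shift_load(sites, demand_tps=8000)
print(plan)
```

Capping each site at its throughput capacity is what makes the shift performance-aware in this sketch: work only spills over to the more stressed region once the less stressed one is full.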

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes an architecture for grid-interactive AI data centers that integrates grid signals with workload scheduling and power telemetry to enable dynamic power control. It claims that a real-world deployment on a 130 kW GPU cluster demonstrates multiple flexibility modes—rapid load reduction, sustained curtailment, carbon-aware operation, and performance-aware load shifting across distributed clusters—while preserving service levels for priority jobs.

Significance. If the experimental claims hold under scrutiny, the work could meaningfully advance the integration of AI infrastructure with power grids by converting data centers from inflexible loads into responsive resources, potentially easing interconnection delays and supporting grid reliability. The emphasis on a real deployment rather than simulation is a constructive element.

major comments (1)
  1. [Experimental Results] As described in the abstract and deployment claims, the positive outcomes for rapid load reduction, sustained curtailment, and service-level preservation are presented without methodological details, baselines, error bars, data exclusion rules, or quantitative results. This absence is load-bearing because the central claim of demonstrated flexibility rests entirely on these unreported experiments.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for their thoughtful review and positive assessment of the work's potential impact. We address the major comment below and will revise the manuscript to strengthen the experimental reporting.

point-by-point responses
  1. Referee: [Experimental Results] As described in the abstract and deployment claims, the positive outcomes for rapid load reduction, sustained curtailment, and service-level preservation are presented without methodological details, baselines, error bars, data exclusion rules, or quantitative results. This absence is load-bearing because the central claim of demonstrated flexibility rests entirely on these unreported experiments.

    Authors: We agree that the current presentation of the experimental results lacks sufficient methodological detail, baselines, error bars, data exclusion criteria, and quantitative metrics to fully support the claims. The manuscript will be revised to include an expanded experimental section with these elements, including descriptions of the 130 kW GPU cluster setup, workload orchestration methods, measurement protocols, comparison baselines where applicable, statistical reporting, and any data filtering rules applied during the real-world deployment.

    Revision: yes
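
To make the requested reporting concrete, the sketch below shows one way a curtailment result could be stated against a baseline with an error estimate. It is an illustration only: the function name and the power readings are made up, and the paper's actual measurement protocol is not described in the abstract.

```python
import math
import statistics


def curtailment_summary(baseline_kw, event_kw):
    """Summarize a curtailment event relative to a pre-event baseline window."""
    base_mean = statistics.mean(baseline_kw)
    event_mean = statistics.mean(event_kw)
    # Standard error of the event-window mean: sample standard deviation / sqrt(n).
    event_sem = statistics.stdev(event_kw) / math.sqrt(len(event_kw))
    return {
        "baseline_mean_kw": base_mean,
        "event_mean_kw": event_mean,
        "event_sem_kw": event_sem,
        "reduction_fraction": 1.0 - event_mean / base_mean,
    }


# Made-up readings in kW, not data from the paper.
baseline = [129.0, 130.5, 130.0, 130.5]
event = [97.0, 98.0, 97.5, 97.5]

summary = curtailment_summary(baseline, event)
print(summary)
```

Reporting the baseline window, the event-window mean, and its standard error together would let a reader judge whether a claimed reduction is distinguishable from normal power variation.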

Circularity Check

0 steps flagged

No significant circularity

The paper's central claims rest on experimental results from a described real-world deployment on a 130 kW GPU cluster, showing load reduction, curtailment, carbon-aware operation, and cross-cluster shifting while preserving priority jobs. No mathematical derivations, equations, fitted parameters presented as predictions, or self-citation chains appear in the abstract or structure; the architecture is presented as an integration of grid signals, scheduling, and telemetry, validated externally by the deployment rather than reducing to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review provides no explicit free parameters, mathematical derivations, or invented entities; the central premise is a domain assumption about software control feasibility.

axioms (1)
  • Domain assumption: Workload orchestration on GPU clusters can achieve dynamic, fine-grained power control in response to external grid signals.
    This premise underpins all flexibility claims and experimental interpretations in the abstract.

pith-pipeline@v0.9.1-grok · 5780 in / 1101 out tokens · 25142 ms · 2026-06-25T21:51:52.588432+00:00 · methodology
