Power-Flexible AI Data Centers: A New Paradigm for Grid-Responsive Compute

Andy Neale; Ayse Coskun; Brandon Records; Chris Williams; Ciaran Roberts; Daniel Wilson; Ethan Levine; Ethan Tiao; Frank Sharp; Harry Petty

arXiv: 2606.25098 · v1 · pith:QTDBPT5Jnew · submitted 2026-06-23 · 💻 cs.DC · cs.AI · cs.PF · cs.SY · eess.SY

Chris Williams, Philip Colangelo, Ayse Coskun, Ethan Levine, Andy Neale, Ciaran Roberts, Shayan Sengupta, Nikhil Shirolkar, Varun Sivaram, Sarah Soares, Ethan Tiao, Scott Underwood, Daniel Wilson, Frank Sharp, Luke Wainwright, Harry Petty, Scott Wallace, Brandon Records

Pith reviewed 2026-06-25 21:51 UTC · model grok-4.3

classification 💻 cs.DC cs.AI cs.PF cs.SY eess.SY

keywords: power flexibility, AI data centers, grid-responsive compute, workload orchestration, demand response, GPU cluster, carbon-aware operation

The pith

AI data centers can act as flexible grid resources through software-controlled power adjustments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that GPU-based AI data centers can respond dynamically to grid conditions by orchestrating workloads to reduce or shift power consumption. This approach addresses the challenge of rising electricity demand from AI infrastructure, which traditional planning treats as inflexible. By combining grid signals with scheduling and telemetry, the system achieves rapid load reductions, sustained curtailment, and carbon-aware operation. Real-world tests on a 130 kW cluster confirm these capabilities while maintaining service for priority jobs and enabling load shifting across sites.
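One way to picture carbon-aware operation is as a mapping from a grid carbon-intensity signal to a cluster power target. The sketch below is illustrative only: the linear mapping, the function name `carbon_aware_target`, the intensity thresholds, and the 65 kW floor are assumptions made for the example and are not taken from the paper. Only the 130 kW ceiling echoes the cluster size reported above.

```python
def carbon_aware_target(intensity, low, high, p_min_kw, p_max_kw):
    """Map a carbon-intensity reading to a cluster power target in kW.

    Hypothetical policy: run at p_max_kw when intensity <= low,
    at p_min_kw when intensity >= high, and interpolate linearly between.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    if p_max_kw < p_min_kw:
        raise ValueError("p_max_kw must be at least p_min_kw")
    # Fraction of the way from the clean threshold to the dirty threshold.
    frac = (intensity - low) / (high - low)
    frac = min(1.0, max(0.0, frac))  # clamp to [0, 1]
    return p_max_kw - frac * (p_max_kw - p_min_kw)


if __name__ == "__main__":
    # Placeholder intensity readings (gCO2/kWh), not measurements.
    for reading in (80, 200, 400):
        target = carbon_aware_target(reading, 100, 300, 65.0, 130.0)
        print(reading, target)
```

A real controller would likely smooth the signal and respect ramp limits; the clamp here only guarantees the target stays within the configured band.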

Core claim

Modern GPU-based AI data centers can operate as grid-interactive assets that respond dynamically to power system conditions. An architecture integrating grid signals, workload scheduling, and power telemetry enables fine-grained cluster power control. Experimental results from a real-world deployment on a 130 kW GPU cluster demonstrate rapid load reduction, sustained curtailment, and carbon-aware operation while preserving service levels for priority jobs, along with performance-aware load shifting across geographically distributed clusters.

What carries the argument

The architecture that integrates grid signals, workload scheduling, and power telemetry for fine-grained cluster power control.
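As a rough illustration of how grid signals, scheduling, and telemetry might meet in one decision, the sketch below plans a curtailment by throttling the lowest-priority jobs first until a power target is met. It is a minimal sketch under stated assumptions, not the paper's controller: the `Job` fields, the priority scale, the per-job power floor, and the greedy policy are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int        # higher means more important (hypothetical scale)
    power_kw: float      # draw currently attributed to the job (from telemetry)
    min_power_kw: float  # lowest budget the job tolerates; 0.0 means pausable


def plan_curtailment(jobs, target_kw):
    """Greedy plan: cut lowest-priority jobs first until total <= target_kw.

    Returns (budgets, shortfall_kw). A positive shortfall means the target
    cannot be met without violating some job's floor.
    """
    budgets = {job.name: job.power_kw for job in jobs}
    excess = sum(budgets.values()) - target_kw
    for job in sorted(jobs, key=lambda j: j.priority):
        if excess <= 0:
            break
        cut = min(job.power_kw - job.min_power_kw, excess)
        budgets[job.name] -= cut
        excess -= cut
    return budgets, max(excess, 0.0)


if __name__ == "__main__":
    # Placeholder workload mix; the numbers are illustrative, not measured.
    jobs = [
        Job("train-a", priority=1, power_kw=60.0, min_power_kw=0.0),
        Job("infer", priority=3, power_kw=40.0, min_power_kw=40.0),
        Job("train-b", priority=2, power_kw=30.0, min_power_kw=10.0),
    ]
    print(plan_curtailment(jobs, target_kw=90.0))
```

Setting a priority job's floor equal to its current draw is one simple way to express "preserve service levels": the planner then reports a shortfall rather than touching that job.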

Load-bearing premise

That software-based workload orchestration on GPU clusters can deliver fine-grained and rapid power adjustments while reliably preserving service levels for priority jobs under real grid conditions.

What would settle it

Observing a grid event where the cluster either cannot reduce load within the required time or where priority job performance degrades below acceptable levels.
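The falsification test above can be phrased as a check over a power trace. The sketch below assumes a trace of `(timestamp_s, power_kw)` samples, a target power level, a response deadline, and a boolean saying whether priority jobs stayed within their service level. All names and the pass/fail rule are assumptions for illustration; the paper's own acceptance criteria are not given in the abstract.

```python
def event_outcome(trace, event_start_s, deadline_s, target_kw, priority_slo_ok):
    """Evaluate one grid event against the two failure modes.

    trace: list of (timestamp_s, power_kw) samples sorted by time.
    Returns the observed response time and whether the claim survives.
    """
    # First sample at or after the event start where load is at or below target.
    reached = next(
        (t for t, p in trace if t >= event_start_s and p <= target_kw),
        None,
    )
    response_s = None if reached is None else reached - event_start_s
    load_met = response_s is not None and response_s <= deadline_s
    return {
        "response_s": response_s,
        "load_met": load_met,
        "claim_survives": load_met and priority_slo_ok,
    }


if __name__ == "__main__":
    # Placeholder samples, not data from the deployment.
    trace = [(0, 130.0), (10, 128.0), (20, 100.0), (30, 95.0), (40, 96.0)]
    print(event_outcome(trace, event_start_s=10, deadline_s=20,
                        target_kw=100.0, priority_slo_ok=True))
```

This only checks the moment the target is first reached; a sustained-curtailment test would also need to verify that the load stays below the target for the event's duration.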

Figures

Figures reproduced from arXiv: 2606.25098 by Andy Neale, Ayse Coskun, Brandon Records, Chris Williams, Ciaran Roberts, Daniel Wilson, Ethan Levine, Ethan Tiao, Frank Sharp, Harry Petty, Luke Wainwright, Nikhil Shirolkar, Philip Colangelo, Sarah Soares, Scott Underwood, Scott Wallace, Shayan Sengupta, Varun Sivaram.

**Figure 2.** AI cluster power response timed to offset a TV pickup (“tea kettle”) demand spike, overlaid ...

**Figure 3.** AI cluster responds to a historical replay of a 2019 UK grid contingency associated with ...

**Figure 4.** Workload performance for 10-hour experiment, showing preservation of performance ...

**Figure 5.** AI cluster power response to live grid events submitted by National Grid and EPRI, ...

**Figure 6.** Carbon-aware load following and power tracking across grid conditions.

**Figure 7.** Rack power consumption and tokens/second throughput in the Chicago and Ashburn ...

Original abstract

The rapid expansion of artificial intelligence (AI) infrastructure is driving unprecedented growth in electricity demand from data centers. Traditional power-system planning treats large computing facilities as inflexible peak loads, leading to costly infrastructure upgrades and long delays in grid interconnection. Recent work has shown that AI clusters can reduce electricity consumption during peak demand through software-based workload orchestration. This article explores how modern GPU-based AI data centers can operate as grid-interactive assets that respond dynamically to power system conditions. We describe an architecture integrating grid signals, workload scheduling, and power telemetry for fine-grained cluster power control. Experimental results from a real-world deployment on a 130 kW GPU cluster demonstrate multiple forms of flexibility, including rapid load reduction, sustained curtailment, and carbon-aware operation while preserving service levels for priority jobs. We further demonstrate performance-aware load shifting across geographically distributed clusters, enabling workloads to migrate toward regions with lower grid stress. Together, these capabilities transform AI infrastructure from static electricity consumers into flexible resources that support grid reliability, accelerate interconnection, and improve computing sustainability.
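Performance-aware load shifting across sites can be sketched as a constrained choice: among sites with enough power headroom and enough expected throughput, prefer the one with the lowest grid stress. The example below is a hypothetical illustration; the `Site` fields, the 0-to-1 stress score, and the tie-breaking rule are assumptions and do not describe the authors' scheduler.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    grid_stress: float       # hypothetical score: 0.0 (relaxed) to 1.0 (stressed)
    headroom_kw: float       # spare power capacity available at the site
    est_tokens_per_s: float  # expected throughput if the workload runs there


def choose_site(sites, needed_kw, min_tokens_per_s):
    """Pick the least-stressed site that meets power and performance needs.

    Returns None when no site qualifies, so the caller can keep the
    workload where it is instead of migrating.
    """
    eligible = [
        s for s in sites
        if s.headroom_kw >= needed_kw and s.est_tokens_per_s >= min_tokens_per_s
    ]
    if not eligible:
        return None
    # Lowest stress wins; ties go to the site with higher expected throughput.
    return min(eligible, key=lambda s: (s.grid_stress, -s.est_tokens_per_s))


if __name__ == "__main__":
    # Placeholder sites and numbers, for illustration only.
    sites = [
        Site("site-a", grid_stress=0.8, headroom_kw=50.0, est_tokens_per_s=900.0),
        Site("site-b", grid_stress=0.3, headroom_kw=40.0, est_tokens_per_s=850.0),
        Site("site-c", grid_stress=0.1, headroom_kw=10.0, est_tokens_per_s=950.0),
    ]
    print(choose_site(sites, needed_kw=30.0, min_tokens_per_s=800.0))
```

Filtering on throughput before ranking by stress is what makes the choice performance-aware: a lightly stressed site that would slow the workload below its floor is never selected.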

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper reports a real 130 kW deployment showing grid-responsive AI cluster behavior including cross-site shifting, but the abstract supplies no numbers or methods to judge the results.

The main point for you is that this work moves past simulation or prior orchestration papers by describing an architecture that ties grid signals directly to workload scheduling and power telemetry, then claims results from an actual 130 kW GPU cluster deployment. They show rapid load reduction, sustained curtailment, carbon-aware operation, and performance-aware shifting across distributed clusters while keeping priority jobs running.

What the paper does well is lay out a concrete integration path and report that the system worked on real hardware for multiple flexibility modes. The distributed shifting result is a step beyond the single-site peak reduction work they cite.

The soft spot is the lack of any quantitative detail in the abstract: no measured reduction percentages, no time scales for the rapid response, no baselines, no error bars, and no description of how service levels were verified. Without those, it is difficult to tell whether the orchestration delivered fine-grained control reliably under actual grid conditions or whether the priority-job preservation held up as stated. The logic itself is consistent and not circular.

This paper is aimed at people working on data center power management, grid interconnection, and renewable alignment. A reader who wants to see a practical architecture and deployment claims would find it useful even if the numbers need filling in.

It deserves peer review because it brings real deployment evidence to a timely problem. The evidence looks thin from the abstract alone, but the topic and the reported outcomes are worth a referee's time to check the full methods and data.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes an architecture for grid-interactive AI data centers that integrates grid signals with workload scheduling and power telemetry to enable dynamic power control. It claims that a real-world deployment on a 130 kW GPU cluster demonstrates multiple flexibility modes—rapid load reduction, sustained curtailment, carbon-aware operation, and performance-aware load shifting across distributed clusters—while preserving service levels for priority jobs.

Significance. If the experimental claims hold under scrutiny, the work could meaningfully advance the integration of AI infrastructure with power grids by converting data centers from inflexible loads into responsive resources, potentially easing interconnection delays and supporting grid reliability. The emphasis on a real deployment rather than simulation is a constructive element.

major comments (1)

[Experimental Results] Experimental Results (as described in the abstract and deployment claims): the positive outcomes for rapid load reduction, sustained curtailment, and service-level preservation are presented without methods details, baselines, error bars, data exclusion rules, or quantitative results. This absence is load-bearing because the central claim of demonstrated flexibility rests entirely on these unreported experiments.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for their thoughtful review and positive assessment of the work's potential impact. We address the major comment below and will revise the manuscript to strengthen the experimental reporting.

Referee: [Experimental Results] Experimental Results (as described in the abstract and deployment claims): the positive outcomes for rapid load reduction, sustained curtailment, and service-level preservation are presented without methods details, baselines, error bars, data exclusion rules, or quantitative results. This absence is load-bearing because the central claim of demonstrated flexibility rests entirely on these unreported experiments.

Authors: We agree that the current presentation of the experimental results lacks sufficient methodological detail, baselines, error bars, data exclusion criteria, and quantitative metrics to fully support the claims. The manuscript will be revised to include an expanded experimental section with these elements, including descriptions of the 130 kW GPU cluster setup, workload orchestration methods, measurement protocols, comparison baselines where applicable, statistical reporting, and any data filtering rules applied during the real-world deployment.

Revision: yes
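To make the request for baselines and error bars concrete, the sketch below shows one simple form such reporting could take: a mean fractional load reduction over repeated events, with a normal-approximation 95% interval. It is an assumption-laden illustration, not the authors' analysis. The function name, the pairing of baseline and curtailed readings, and the choice of interval are all hypothetical, and the normal approximation is crude for small samples.

```python
import math
import statistics


def summarize_reduction(baseline_kw, curtailed_kw, z=1.96):
    """Mean fractional load reduction with an approximate confidence interval.

    baseline_kw[i] and curtailed_kw[i] are paired readings for event i.
    """
    if len(baseline_kw) != len(curtailed_kw) or len(baseline_kw) < 2:
        raise ValueError("need at least two paired readings of equal length")
    reductions = [(b - c) / b for b, c in zip(baseline_kw, curtailed_kw)]
    mean = statistics.mean(reductions)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(reductions) / math.sqrt(len(reductions))
    return {
        "n": len(reductions),
        "mean": mean,
        "ci_low": mean - z * sem,
        "ci_high": mean + z * sem,
    }


if __name__ == "__main__":
    # Placeholder readings in kW, not results from the deployment.
    print(summarize_reduction([100.0, 100.0, 100.0], [80.0, 70.0, 90.0]))
```

With only a handful of events, a t-based or bootstrap interval would be a more defensible choice than the fixed `z` used here.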

Circularity Check

0 steps flagged

No significant circularity

The paper's central claims rest on experimental results from a described real-world deployment on a 130 kW GPU cluster, showing load reduction, curtailment, carbon-aware operation, and cross-cluster shifting while preserving priority jobs. No mathematical derivations, equations, fitted parameters presented as predictions, or self-citation chains appear in the abstract or structure; the architecture is presented as an integration of grid signals, scheduling, and telemetry, validated externally by the deployment rather than reducing to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review provides no explicit free parameters, mathematical derivations, or invented entities; the central premise is a domain assumption about software control feasibility.

axioms (1)

Domain assumption: Workload orchestration on GPU clusters can achieve dynamic, fine-grained power control in response to external grid signals.
This premise underpins all flexibility claims and experimental interpretations in the abstract.

pith-pipeline@v0.9.1-grok · 5780 in / 1101 out tokens · 25142 ms · 2026-06-25T21:51:52.588432+00:00 · methodology

Reference graph

Works this paper leans on

11 extracted references · 2 canonical work pages

[1] IEA, “Energy and AI,” IEA, Paris, 2025. https://www.iea.org/reports/energy-and-ai

[2] X. Chen, X. Wang, A. Colacelli, M. Lee, and L. Xie, “Electricity Demand and Grid Impacts of AI Data Centers: Challenges and Prospects,” arXiv:2509.07218, Sep. 2025.

[3] Y. Wang, Q. Guo, and M. Chen, “Providing load flexibility by reshaping power profiles of large language model workloads,” Advances in Applied Energy, vol. 18, 2025.

[4] P. Colangelo, A. K. Coskun, J. Megrue, C. Roberts, S. Sengupta, V. Sivaram, E. Tiao, A. Vijaykar, C. Williams, D. C. Wilson, B. Records, Z. MacFarland, D. Dreiling, N. Morey, A. Ratnayake, and B. Vairamohan, “AI data centres as grid-interactive assets,” Nature Energy, vol. 11, pp. 254–261, 2026.

[5] Z. Liu, A. Wierman, Y. Chen, B. Razon, and N. Chen, “Data Center Demand Response: Avoiding the Coincident Peak via Workload Shifting and Local Generation,” ACM SIGMETRICS Performance Evaluation Review, vol. 41, no. 1, 2013.

[6] Z. Liu, M. Lin, A. Wierman, S. H. Low, and L. L. H. Andrew, “Greening Geographical Load Balancing,” Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS), 2011.

[7] A. Radovanović, R. Koningstein, I. Schneider, B. Chen, A. Duarte, B. Roy, D. Xiao, M. Haridasan, P. Hung, N. Care, S. Talukdar, E. Mullen, K. Smith, M. Cottman, and W. Cirne, “Carbon-Aware Computing for Datacenters,” IEEE Transactions on Power Systems, vol. 38, no. 2, pp. 1270–1280, 2023.

[8] Y. Zhang, D. C. Wilson, I. Ch. Paschalidis, and A. K. Coskun, “HPC Data Center Participation in Demand Response: An Adaptive Policy With QoS Assurance,” IEEE Transactions on Sustainable Computing, vol. 7, no. 1, pp. 157–171, Jan.–Mar. 2022.

[9] D. Shukla, M. Sivathanu, S. Viswanatha, B. Gulavani, R. Nehme, A. Agrawal, C. Chen, N. Kwatra, R. Ramjee, P. Sharma, A. Katiyar, V. Modi, V. Sharma, A. Singh, S. Singhal, K. Welankar, L. Xun, R. Anupindi, K. Elangovan, H. Rahman, Z. Lin, R. Seetharaman, C. Xu, E. Ailijiang, S. Krishnappa, and M. Russinovich, “Singularity: Planet-Scale, Preemptive and Elastic Scheduling of AI Workloads,” ... arXiv, 2022.

[10] L. Yu, J. Xing, A. Vadlamani, and B. C. Lee, “Curtail to Compute: Siting Datacenters to Leverage California’s Stranded Renewable Energy,” Next 10, 2026. https://www.next10.org/publications/curtail-to-compute

[11] J. You, J.-W. Chung, and M. Chowdhury, “Zeus: Understanding and optimizing GPU energy consumption of DNN training,” in Proc. 20th USENIX Symp. Networked Syst. Design Implementation (NSDI), 2023, pp. 119–139.
