Quantifying the Carbon Emissions of Machine Learning
Recognition: 2 theorem links · Lean Theorem
Pith reviewed 2026-05-15 05:47 UTC · model grok-4.3
The pith
A tool called the Machine Learning Emissions Calculator approximates the carbon emissions of training neural networks based on server location, energy grid, training duration, and hardware.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present our Machine Learning Emissions Calculator, a tool for our community to better understand the environmental impact of training ML models. The calculator approximates emissions using factors including the location of the server and its energy grid, the length of the training procedure, and the make and model of hardware. We accompany this tool with an explanation of these factors as well as concrete actions that individual practitioners and organizations can take to mitigate their carbon emissions.
What carries the argument
The Machine Learning Emissions Calculator, which estimates carbon output by combining inputs on server location, energy grid carbon intensity, training length, and hardware specifications.
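Under the standard approximation such calculators use, the estimate is a simple product of these inputs. The sketch below is illustrative, not the tool's exact internals; the function name, parameter names, and example values are assumptions for exposition.

```python
def estimate_co2e_kg(tdp_watts, utilization, hours, grid_kg_per_kwh, pue=1.0):
    """Approximate training emissions in kg CO2e.

    tdp_watts:        rated power of the accelerator (e.g. 250 W)
    utilization:      assumed fraction of TDP actually drawn (0..1)
    hours:            wall-clock training time
    grid_kg_per_kwh:  carbon intensity of the regional grid (kg CO2e per kWh)
    pue:              data-center power usage effectiveness (cooling overhead)
    """
    energy_kwh = tdp_watts / 1000 * utilization * hours * pue
    return energy_kwh * grid_kg_per_kwh

# 100 h on one 250 W GPU at full utilization, on a 0.5 kg/kWh grid:
print(estimate_co2e_kg(250, 1.0, 100, 0.5))  # 12.5 kg CO2e
```

Each factor enters multiplicatively, which is why moving a run to a cleaner grid or halving training time scales the estimate linearly.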
If this is right
- Individual practitioners can use the calculator to estimate emissions for their training runs and identify high-impact factors to adjust.
- Organizations can incorporate the tool into their decision-making to select lower-emission servers or optimize training procedures.
- Greater awareness of energy grid differences may encourage training in regions with cleaner electricity sources.
- Concrete mitigation steps include shortening training times through better algorithms and using more efficient hardware.
- Reporting emissions alongside model performance could become a standard practice in machine learning research.
Where Pith is reading between the lines
- Integrating such calculators into popular ML frameworks could make emission tracking automatic during training.
- This work could support the development of benchmarks that include environmental cost alongside accuracy metrics.
- Future extensions might account for the full lifecycle of models, including inference and data collection phases.
- Policy makers could use aggregated data from such tools to regulate data center energy use.
Load-bearing premise
The listed factors of server location, energy grid, training length, and hardware are sufficient to accurately approximate emissions in a way that reliably guides mitigation decisions.
What would settle it
A side-by-side comparison in which actual metered carbon emissions from a real training run deviate substantially from the calculator's prediction for the same inputs.
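Such a validation reduces to a relative-error check against metered ground truth. The helper and the numbers below are hypothetical, shown only to make the test concrete.

```python
def relative_deviation(measured_kg, predicted_kg):
    """Relative error of a calculator prediction vs. a metered ground truth."""
    return abs(predicted_kg - measured_kg) / measured_kg

# Hypothetical run: 30 kg CO2e metered vs. 25 kg predicted.
dev = relative_deviation(30.0, 25.0)
print(f"{dev:.1%}")  # roughly 17% deviation
```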
Original abstract
From an environmental standpoint, there are a few crucial aspects of training a neural network that have a major impact on the quantity of carbon that it emits. These factors include: the location of the server used for training and the energy grid that it uses, the length of the training procedure, and even the make and model of hardware on which the training takes place. In order to approximate these emissions, we present our Machine Learning Emissions Calculator, a tool for our community to better understand the environmental impact of training ML models. We accompany this tool with an explanation of the factors cited above, as well as concrete actions that individual practitioners and organizations can take to mitigate their carbon emissions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the Machine Learning Emissions Calculator, a practical tool to approximate the carbon emissions of training ML models. It identifies key influencing factors (server location and regional energy grid, training duration, and hardware type), explains how these affect emissions, and outlines mitigation actions for individual practitioners and organizations.
Significance. If the calculator's approximations can be shown to be reliable, the work would provide a timely, community-facing resource for quantifying and reducing the environmental impact of machine learning. The emphasis on actionable mitigation steps is a constructive contribution to an emerging area of concern.
Major comments (2)
- [Machine Learning Emissions Calculator] The section presenting the Machine Learning Emissions Calculator derives estimates from hardware TDP ratings, assumed utilization, regional average carbon intensities, and user-supplied runtime, yet reports no side-by-side comparison of these estimates against metered power draw or time-resolved grid emissions for any actual training run. This absence directly affects the central claim that the listed factors suffice for reliable approximations that can guide mitigation decisions.
- [Explanation of factors] No sensitivity analysis or error bounds are provided for the approximation method (e.g., impact of dynamic power draw, cooling overhead, or intra-day grid variation), leaving open whether systematic over- or under-estimation by tens of percent occurs in practice.
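The sensitivity analysis the referee asks for is straightforward for a multiplicative model: perturb one factor at a time around a baseline. The baseline and the ranges below are illustrative assumptions, not values from the paper.

```python
# Hypothetical baseline: one 250 W GPU, 100 h, 0.5 kg CO2e/kWh grid.
baseline = dict(tdp_w=250, util=1.0, hours=100, pue=1.0, grid=0.5)

def co2e(tdp_w, util, hours, pue, grid):
    """Multiplicative emissions model: kW x utilization x hours x PUE x intensity."""
    return tdp_w / 1000 * util * hours * pue * grid

base = co2e(**baseline)
# One-at-a-time sweep over assumed plausible ranges for each factor.
for name, lo, hi in [("util", 0.4, 1.0), ("pue", 1.1, 1.6), ("grid", 0.02, 0.9)]:
    low = co2e(**{**baseline, name: lo})
    high = co2e(**{**baseline, name: hi})
    print(f"{name}: {low / base:.2f}x .. {high / base:.2f}x of baseline")
```

Because every factor enters linearly, the ratio columns read off directly how much systematic over- or under-estimation each unmodeled effect could introduce.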
Minor comments (1)
- [Abstract] The abstract would benefit from an explicit statement of the tool's intended scope and known limitations.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which help clarify the scope and limitations of the presented calculator. We address each major comment below and outline planned revisions to strengthen the manuscript.
Point-by-point responses
- Referee: [Machine Learning Emissions Calculator] The section presenting the Machine Learning Emissions Calculator derives estimates from hardware TDP ratings, assumed utilization, regional average carbon intensities, and user-supplied runtime, yet reports no side-by-side comparison of these estimates against metered power draw or time-resolved grid emissions for any actual training run. This absence directly affects the central claim that the listed factors suffice for reliable approximations that can guide mitigation decisions.
  Authors: We agree that direct empirical validation against metered power draw would provide stronger evidence of reliability. The manuscript presents the calculator as a practical approximation tool based on standard methods (TDP values, average grid intensities, and assumed utilization) drawn from existing literature, rather than a precision measurement instrument. The central claim is that these factors are the dominant drivers and can be used for actionable estimates to guide mitigation, not that the tool matches real-time metering exactly. In revision we will add an explicit limitations subsection that discusses approximation error sources, cites prior studies performing such comparisons, and clarifies that the tool is intended for order-of-magnitude guidance and awareness rather than precise auditing.
  Revision: partial
- Referee: [Explanation of factors] No sensitivity analysis or error bounds are provided for the approximation method (e.g., impact of dynamic power draw, cooling overhead, or intra-day grid variation), leaving open whether systematic over- or under-estimation by tens of percent occurs in practice.
  Authors: We accept this point and will incorporate a new sensitivity analysis subsection. The revision will quantify the effect of varying utilization rates, PUE values for cooling overhead, and temporal fluctuations in regional carbon intensity, providing explicit error bounds and showing how these propagate to the final emission estimate. This will allow readers to assess the robustness of the approximations for different use cases.
  Revision: yes
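The promised error-bound propagation is simple for this class of model: since the estimate is a pure product of positive factors, worst-case interval endpoints just multiply. The helper and the factor ranges below are illustrative assumptions, not values from the paper or the planned revision.

```python
def product_bounds(factor_ranges):
    """Worst-case interval for a product of positive factor ranges.

    factor_ranges: list of (low, high) multipliers for each term in the
    emissions product. Because every factor is positive and enters
    multiplicatively, the interval endpoints simply multiply.
    """
    lo = hi = 1.0
    for a, b in factor_ranges:
        lo *= a
        hi *= b
    return lo, hi

# Assumed ranges: utilization 0.4-1.0 of TDP, PUE 1.1-1.6,
# intra-day grid intensity 0.8x-1.2x of the regional average.
lo, hi = product_bounds([(0.4, 1.0), (1.1, 1.6), (0.8, 1.2)])
print(lo, hi)  # multipliers to apply to the point estimate
```

With these (assumed) ranges the point estimate could be off by a factor of roughly 0.35x to 1.9x, which is exactly the tens-of-percent regime the referee flags.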
Circularity Check
No circularity in emissions calculator derivation
Full rationale
The paper presents a practical estimation tool whose inputs are server location and grid carbon intensity, training duration, and hardware TDP ratings drawn from external public data sources. No equations, fitted parameters, or predictions are defined that reduce to the tool's own outputs by construction, and no self-citations are invoked to establish uniqueness or to smuggle in ansatzes. The central claim is therefore an independent aggregation of standard factors rather than a self-referential loop, making the derivation self-contained.
Axiom & Free-Parameter Ledger
Forward citations
Cited by 20 Pith papers
- An Amortized Efficiency Threshold for Comparing Neural and Heuristic Solvers in Combinatorial Optimization
  The paper introduces the Amortized Efficiency Threshold (AET) to identify the deployment volume at which neural combinatorial optimization solvers become more energy-efficient overall than heuristic baselines after am...
- Hidden Secrets in the arXiv: Discovering, Analyzing, and Preventing Unintentional Information Disclosure in Source Files of Scientific Preprints
  Nearly every arXiv submission leaks hidden sensitive information through its source files, existing cleaners fail, and ALC-NG provides a more reliable fix.
- Segment Anything
  A promptable model trained on 1B masks achieves competitive zero-shot segmentation performance across tasks and is released publicly with its dataset.
- OPT: Open Pre-trained Transformer Language Models
  OPT releases open decoder-only transformers up to 175B parameters that match GPT-3 performance at one-seventh the carbon cost, along with code and training logs.
- Multitask Prompted Training Enables Zero-Shot Task Generalization
  Multitask fine-tuning of an encoder-decoder model on prompted datasets produces zero-shot generalization that often beats models up to 16 times larger on standard benchmarks.
- EnergyLens: Predictive Energy-Aware Exploration for Multi-GPU LLM Inference Optimization
  EnergyLens predicts multi-GPU LLM inference energy consumption with 9-13% MAPE and identifies configurations with up to 52x energy efficiency differences.
- PEML: Parameter-efficient Multi-Task Learning with Optimized Continuous Prompts
  PEML co-optimizes continuous prompts and low-rank adaptations to deliver up to 6.67% average accuracy gains over existing multi-task PEFT methods on GLUE, SuperGLUE, and other benchmarks.
- Decomposing the Generalization Gap in PROTAC Activity Prediction: Variance Attribution and the Inter-Laboratory Ceiling
  Inter-laboratory measurement variance dominates the generalization gap in PROTAC activity prediction, capping LOTO AUROC near 0.67 across models and architectures.
- SAM 2: Segment Anything in Images and Videos
  SAM 2 delivers more accurate video segmentation with 3x fewer user interactions and 6x faster image segmentation than the original SAM by training a streaming-memory transformer on the largest video segmentation datas...
- StarCoder 2 and The Stack v2: The Next Generation
  StarCoder2-15B matches or beats CodeLlama-34B on code tasks despite being smaller, and StarCoder2-3B outperforms prior 15B models, with open weights and exact training data identifiers released.
- DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
  DeepSpeed-Ulysses keeps communication volume constant for sequence-parallel attention when sequence length and device count scale together, delivering 2.5x faster training on 4x longer sequences than prior SOTA.
- BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
  BLOOM is a 176B-parameter open-access multilingual language model trained on the ROOTS corpus that achieves competitive performance on benchmarks, with improved results after multitask prompted finetuning.
- Multi-Dimensional Model Integrity and Responsibility Assessment Index and Scoring Framework
  MIRAI is a unified index that combines five responsibility dimensions into one score for tabular models, demonstrating that predictive performance does not ensure high overall integrity.
- Position: LLM Inference Should Be Evaluated as Energy-to-Token Production
  LLM inference should be reframed and evaluated as energy-to-token production with a Token Production Function that accounts for power, cooling, and efficiency ceilings.
- UniSD: Towards a Unified Self-Distillation Framework for Large Language Models
  UniSD unifies complementary self-distillation mechanisms for autoregressive LLMs and achieves up to +5.4 point gains over base models and +2.8 over baselines across six benchmarks and six models.
- Agentic Insight Generation in VSM Simulations
  A two-step agentic system for extracting insights from VSM simulations achieves up to 86% accuracy with top LLMs by using progressive data discovery and slim context.
- Frugal Knowledge Graph Construction with Local LLMs: A Zero-Shot Pipeline, Self-Consistency and Wisdom of Artificial Crowds
  A frugal zero-shot local-LLM pipeline extracts relations at F1 0.70 and reaches 0.55 EM on multi-hop QA through self-consistency, cross-model oracles, and confidence routing, while identifying an agreement paradox whe...
- ChatGPT, is this real? The influence of generative AI on writing style in top-tier cybersecurity papers
  Top-tier cybersecurity papers exhibit a post-2022 increase in AI marker words and higher lexical complexity, suggesting generative AI is influencing academic writing style.
- StarCoder: may the source be with you!
  StarCoderBase matches or beats OpenAI's code-cushman-001 on multi-language code benchmarks; the Python-fine-tuned StarCoder reaches 40% pass@1 on HumanEval while retaining other-language performance.
- From Cradle to Cloud: A Life Cycle Review of AI's Environmental Footprint
  A review of AI sustainability studies finds inconsistent life cycle definitions and predominant reliance on coarse CO2e proxies, with limited coverage of water, materials, and multi-impact assessments.
Reference graph
Works this paper leans on
- [1] Emma Strubell, Ananya Ganesh, and Andrew McCallum. Energy and policy considerations for deep learning in NLP. arXiv preprint arXiv:1906.02243, 2019.
- [2]
- [3] Simon Eggleston, Leandro Buendia, Kyoko Miwa, Todd Ngara, and Kiyoto Tanabe. 2006 IPCC guidelines for national greenhouse gas inventories, volume 5. Institute for Global Environmental Strategies, Hayama, Japan, 2006.
- [4] WM To and Peter KC Lee. GHG emissions from electricity consumption: A case study of Hong Kong from 2002 to 2015 and trends to 2030. Journal of Cleaner Production, 165:589–598, 2017.
- [5] Matthew Brander, Aman Sood, Charlotte Wylie, Amy Haughton, and Jessica Lovell. Electricity-specific emission factors for grid electricity. Ecometrica, Emissionfactors.com, 2011.
- [6] Ari Harju, Topi Siro, Filippo Federici Canova, Samuli Hakala, and Teemu Rantalaiho. Computational physics on graphics processing units. In Proceedings of the 11th International Conference on Applied Parallel and Scientific Computing, pages 3–26. Springer-Verlag, 2012.
- [7] Nima Tajbakhsh, Jae Y Shin, Suryakanth R Gurudu, R Todd Hurst, Christopher B Kendall, Michael B Gotway, and Jianming Liang. Convolutional neural networks for medical image analysis: Full training or fine tuning? IEEE Transactions on Medical Imaging, 35(5):1299–1312, 2016.
- [8] Keiji Yanai and Yoshiyuki Kawano. Food image recognition using deep convolutional network with pre-training and fine-tuning. In 2015 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), pages 1–6. IEEE, 2015.
- [9] Jeremy Howard and Sebastian Ruder. Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146, 2018.
- [10] James Bergstra and Yoshua Bengio. Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13(Feb):281–305, 2012.
- [12] Matthias Feurer and Frank Hutter. Hyperparameter optimization. In Automated Machine Learning, pages 3–33. Springer, 2019.
- [13] Google. Google environmental report 2018, 2018.
- [14]
- [15]
- [16] Jim Gao. Machine learning applications for data center optimization, 2014.
- [17] Google Data Centers efficiency: How we do it. https://www.google.com/about/datacenters/efficiency/internal/, 2019. Accessed: 2019-08-23.
- [18] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
- [19] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
- [20] Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, and Ameet Talwalkar. Hyperband: A novel bandit-based approach to hyperparameter optimization. arXiv preprint arXiv:1603.06560, 2016.
- [21] Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Moritz Hardt, Benjamin Recht, and Ameet Talwalkar. Massively parallel hyperparameter tuning. arXiv preprint arXiv:1810.05934, 2018.
- [22] Stefan Falkner, Aaron Klein, and Frank Hutter. BOHB: Robust and efficient hyperparameter optimization at scale. arXiv preprint arXiv:1807.01774, 2018.
- [23] Paul Teich. Tearing apart Google's TPU 3.0 AI coprocessor. https://www.nextplatform.com/2018/05/10/tearing-apart-googles-tpu-3-0-ai-coprocessor/, 2018.
- [24] Ermao Cai, Da-Cheng Juan, Dimitrios Stamoulis, and Diana Marculescu. NeuralPower: Predict and deploy energy-efficient convolutional neural networks. arXiv preprint arXiv:1710.05420, 2017.
- [25] David Rolnick, Priya L Donti, Lynn H Kaack, Kelly Kochanski, Alexandre Lacoste, Kris Sankaran, Andrew Slavin Ross, Nikola Milojevic-Dupont, Natasha Jaques, Anna Waldman-Brown, et al. Tackling climate change with machine learning. arXiv preprint arXiv:1906.05433, 2019.
- [26] Victor Schmidt, Alexandra Luccioni, S. Karthik Mukkavilli, Narmada Balasooriya, Kris Sankaran, Jennifer Chayes, and Yoshua Bengio. Visualizing the consequences of climate change using cycle-consistent adversarial networks. CoRR, abs/1905.03709, 2019.