Energy Consumption of Dataframe Libraries for End-to-End Deep Learning Pipelines: A Comparative Analysis
Pith reviewed 2026-05-17 23:09 UTC · model grok-4.3
The pith
Dataframe libraries show distinct energy consumption when embedded in GPU deep learning pipelines.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Through direct measurement the authors establish that Pandas, Polars, and Dask interact differently with GPU workloads during data loading, preprocessing, and batch feeding, producing quantifiable differences in runtime, memory, disk usage, and CPU plus GPU energy consumption across multiple machine learning models and datasets.
What carries the argument
An end-to-end deep learning pipeline that embeds a dataframe library for data loading, preprocessing, and batch feeding, with CPU and GPU energy recorded simultaneously alongside runtime and memory metrics.
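The instrumentation described above can be pictured as a per-phase energy meter wrapped around each pipeline stage. The sketch below is illustrative only and is not the authors' harness: `PhaseEnergyMeter`, `read_rapl_joules`, and `read_nvml_joules` are hypothetical names, the RAPL path assumes a Linux host that exposes the `intel-rapl:0` powercap domain, and the NVML call assumes the `nvidia-ml-py` bindings and a GPU that supports cumulative energy counters.

```python
import time
from contextlib import contextmanager
from typing import Callable, Dict


def read_rapl_joules(path: str = "/sys/class/powercap/intel-rapl:0/energy_uj") -> float:
    """Cumulative CPU package energy in joules (assumed Linux powercap path)."""
    with open(path) as f:
        return int(f.read().strip()) / 1e6  # the counter is in microjoules


def read_nvml_joules(gpu_index: int = 0) -> float:
    """Cumulative GPU energy in joules via NVML (requires nvidia-ml-py)."""
    import pynvml  # imported lazily so the sketch loads on machines without a GPU

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
    return pynvml.nvmlDeviceGetTotalEnergyConsumption(handle) / 1e3  # millijoules


class PhaseEnergyMeter:
    """Accumulates energy and wall time per named pipeline phase."""

    def __init__(self, readers: Dict[str, Callable[[], float]]):
        # e.g. {"cpu": read_rapl_joules, "gpu": read_nvml_joules}
        self.readers = readers
        self.results: Dict[str, Dict[str, float]] = {}

    @contextmanager
    def phase(self, name: str):
        start = {key: reader() for key, reader in self.readers.items()}
        t0 = time.perf_counter()
        try:
            yield
        finally:
            entry = self.results.setdefault(name, {"seconds": 0.0})
            entry["seconds"] += time.perf_counter() - t0
            for key, reader in self.readers.items():
                # Counter wraparound is ignored here for brevity.
                entry[key] = entry.get(key, 0.0) + (reader() - start[key])
```

A run would wrap each stage in its own `with meter.phase("load"):`, `with meter.phase("preprocess"):`, or `with meter.phase("train"):` block, so that CPU and GPU joules can be reported per phase rather than only per run.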
If this is right
- Library choice during data preparation directly influences total CPU and GPU energy consumed by a deep learning pipeline.
- Developers gain concrete data to select among Pandas, Polars, and Dask based on energy cost for their specific workloads.
- Pipeline designs can incorporate library-specific energy profiles when targeting lower overall power draw.
Where Pith is reading between the lines
- The same measurement approach could be applied to other data-processing libraries or to distributed training setups.
- Energy metrics might be added to standard benchmarking tools so that sustainability becomes a routine selection criterion.
- Cloud operators could use library rankings to guide default choices that lower electricity costs for customer workloads.
Load-bearing premise
That the chosen machine-learning models, datasets, and hardware configurations are representative of typical real-world deep-learning pipelines so that the measured energy differences generalize.
What would settle it
Re-running the identical pipelines on a different GPU model, a larger dataset, or an alternative set of models and observing whether the relative energy rankings among Pandas, Polars, and Dask stay the same.
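The replication test above amounts to checking whether the library ordering by energy is stable across configurations. A minimal sketch of that check follows; the function names and the numbers are placeholders chosen for illustration and are not results from the paper.

```python
from typing import Dict, List


def energy_ranking(joules_by_library: Dict[str, float]) -> List[str]:
    """Libraries ordered from lowest to highest measured energy."""
    return sorted(joules_by_library, key=joules_by_library.get)


def rankings_agree(runs: Dict[str, Dict[str, float]]) -> bool:
    """True if every configuration yields the same library ordering."""
    orderings = {tuple(energy_ranking(measured)) for measured in runs.values()}
    return len(orderings) == 1


# Placeholder numbers for illustration only, not measurements from the paper.
example = {
    "gpu_a": {"pandas": 3.0, "polars": 2.0, "dask": 4.0},
    "gpu_b": {"pandas": 5.0, "polars": 4.5, "dask": 7.0},
}
print(rankings_agree(example))
```

A stricter version would also compare the gaps between libraries against run-to-run variance, since an ordering that flips within the noise says little either way.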
Original abstract
This paper presents a detailed comparative analysis of the performance of three major Python data manipulation libraries - Pandas, Polars, and Dask - specifically when embedded within complete deep learning (DL) training and inference pipelines. The research bridges a gap in existing literature by studying how these libraries interact with substantial GPU workloads during critical phases like data loading, preprocessing, and batch feeding. The authors measured key performance indicators including runtime, memory usage, disk usage, and energy consumption (both CPU and GPU) across various machine learning models and datasets.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to perform a detailed comparative analysis of Pandas, Polars, and Dask libraries embedded in complete deep learning training and inference pipelines. It measures runtime, memory usage, disk usage, and energy consumption for both CPU and GPU across various machine learning models and datasets, aiming to bridge a gap by examining interactions with substantial GPU workloads in phases like data loading, preprocessing, and batch feeding.
Significance. Should the measurements prove robust upon detailed scrutiny, this study could offer practical insights for optimizing energy efficiency in DL pipelines by guiding the choice of dataframe libraries. It contributes by shifting focus from standalone library benchmarks to their performance within integrated GPU-accelerated workflows, potentially aiding developers in resource-constrained environments.
major comments (3)
- [Abstract and Experimental Methodology] The abstract states that runtime, memory, disk, and energy metrics were collected across models and datasets, but provides no information on experimental controls, statistical methods, error bars, or exclusion criteria. Without these details the measurements cannot be verified to support the comparative claims.
- [Results (energy attribution)] The central claim requires that Pandas/Polars/Dask choices measurably alter energy draw during GPU-heavy phases. If the experimental design records only total CPU+GPU joules per run and does not report per-phase breakdowns or control for data-transfer overhead to GPU, any observed differences could be driven entirely by CPU-side preprocessing rather than the claimed GPU interaction.
- [Hardware, Models, and Datasets] The assumption that the chosen machine-learning models, datasets, and hardware configurations are representative of typical real-world deep-learning pipelines is not sufficiently justified, limiting the generalizability of any measured energy differences.
minor comments (2)
- [Tables and Figures] Clarify notation for distinguishing CPU versus GPU energy metrics in tables and figures to improve readability.
- [Experimental Setup] Ensure all library versions, exact hardware specifications (e.g., GPU model, power measurement tools), and dataset sizes are explicitly listed for reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. We address each major comment below and have revised the manuscript to improve transparency and address concerns about methodology, attribution, and generalizability.
- Referee: [Abstract and Experimental Methodology] The abstract states that runtime, memory, disk, and energy metrics were collected across models and datasets, but provides no information on experimental controls, statistical methods, error bars, or exclusion criteria. Without these details the measurements cannot be verified to support the comparative claims.
  Authors: We acknowledge that the abstract prioritizes conciseness and omits methodological details. The full manuscript describes these elements in Section 3 (Experimental Setup), including five repeated runs per configuration, reporting of means with standard deviations for error bars, and exclusion criteria for runs exhibiting hardware anomalies or timeouts. To improve verifiability without lengthening the abstract excessively, we have added a brief clause summarizing the use of repeated measurements and statistical controls.
  Revision: yes
- Referee: [Results (energy attribution)] The central claim requires that Pandas/Polars/Dask choices measurably alter energy draw during GPU-heavy phases. If the experimental design records only total CPU+GPU joules per run and does not report per-phase breakdowns or control for data-transfer overhead to GPU, any observed differences could be driven entirely by CPU-side preprocessing rather than the claimed GPU interaction.
  Authors: This concern is valid and highlights a limitation in attribution. Our setup records separate CPU and GPU energy via RAPL and NVML while standardizing batch sizes, transfer mechanisms, and pipeline structure across libraries to minimize confounding from data movement. Observed differences appear in both preprocessing and overall pipeline energy. We have added an explicit discussion paragraph acknowledging that finer per-phase instrumentation would strengthen claims of direct GPU-phase interaction and noting this as a direction for future work; no new measurements were collected.
  Revision: partial
- Referee: [Hardware, Models, and Datasets] The assumption that the chosen machine-learning models, datasets, and hardware configurations are representative of typical real-world deep-learning pipelines is not sufficiently justified, limiting the generalizability of any measured energy differences.
  Authors: We agree that stronger justification is needed. The selected models (ResNet-50, VGG-16, and a small transformer) and datasets (CIFAR-10, MNIST, ImageNet subset) are standard benchmarks frequently cited in DL literature, and the hardware (Intel Xeon CPU with NVIDIA RTX 3090) represents a common single-GPU workstation. We have expanded the Hardware, Models, and Datasets subsection with supporting references to prior studies and added a short limitations paragraph discussing scope and potential differences in multi-GPU or cloud-scale environments.
  Revision: yes
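The repeated-run reporting mentioned in the first response (five runs per configuration, means with standard deviations, exclusion of anomalous or timed-out runs) could be computed along the following lines. This is a sketch under stated assumptions, not the authors' procedure: the `Run` and `Summary` types, the `anomaly` flag, the `timeout_s` threshold, and the choice of the sample (n-1) standard deviation are all hypothetical.

```python
import statistics
from typing import Iterable, List, NamedTuple, Optional


class Run(NamedTuple):
    joules: float
    seconds: float
    anomaly: bool = False  # hypothetical flag set when a hardware fault was logged


class Summary(NamedTuple):
    n: int
    mean_joules: float
    std_joules: float


def summarize(runs: Iterable[Run], timeout_s: Optional[float] = None) -> Summary:
    """Mean and sample standard deviation of energy over the runs that are kept."""
    kept: List[float] = [
        run.joules
        for run in runs
        if not run.anomaly and (timeout_s is None or run.seconds <= timeout_s)
    ]
    if len(kept) < 2:
        raise ValueError("need at least two valid runs for a standard deviation")
    return Summary(len(kept), statistics.mean(kept), statistics.stdev(kept))
```

Reporting `n` alongside the mean and standard deviation makes it visible how many runs were excluded from each configuration.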
Circularity Check
Empirical measurement study with no derivation chain or self-referential reductions
The paper performs direct experimental measurements of runtime, memory, disk, CPU and GPU energy across Pandas/Polars/Dask in end-to-end DL pipelines. No equations, fitted parameters, or predictions are derived; all claims rest on observed values from controlled runs. No self-citation is used to justify uniqueness or load-bearing premises, and the work is self-contained against external benchmarks (the hardware runs themselves). This matches the default expectation of no circularity for measurement studies.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: We integrate Pandas, Polars, and Dask into representative deep learning training and inference pipelines and conduct experiments across a wide range of various machine learning models and datasets, measuring key performance indicators such as runtime, memory usage, disk usage, and energy consumption (CPU and GPU).
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Polars consistently minimizes CPU energy consumption on larger workloads, while Pandas remains competitive for moderate sizes.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.