arxiv: 2605.24908 · v1 · pith:YVBKMWIRnew · submitted 2026-05-24 · 💻 cs.LG · cs.AI

On the Impact of Class Imbalance on the Learning Dynamics of Deep Neural Networks: An Intuitive Insight

Pith reviewed 2026-06-30 12:15 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords class imbalance · deep neural networks · learning dynamics · minority class underfitting · majority class bias · generalization failure · training loss minimization

The pith

Class imbalance drives deep neural networks to underfit minority classes early in training while producing non-generalizable representations later.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates how class imbalance affects the training behavior of deep neural networks by tracking learning patterns on majority and minority classes across datasets with different imbalance ratios. It establishes that balanced data leads to similar learning across classes, but imbalance causes the model to underfit minority samples initially while focusing only on the majority class. Even after the model eventually learns the minority samples, the resulting representations fail to generalize at test time because they arise from overfitting aimed solely at lowering overall training loss. This dynamic explains poor performance on imbalanced data and highlights why specialized handling techniques are necessary.

Core claim

Experimental monitoring of DNN learning patterns shows that class imbalance severely degrades learning: it drives the model to underfit the minority class samples in the early training epochs while it learns only the majority class. Although the DNN ultimately learns the minority samples, the minority representations learned this way do not generalize at test time, because they are merely overfitted to keep the overall training loss as low as possible.
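
One way to read this claim is through how the overall training loss splits by class. The sketch below assumes a binary problem and a loss that is the plain mean of per-sample losses; the summary does not state the loss used, and the symbols (N, n_maj, n_min, rho, and the per-class mean losses) are introduced here only for illustration.

```latex
\mathcal{L}
  = \frac{1}{N}\sum_{i=1}^{N}\ell_i
  = \frac{n_{\mathrm{maj}}}{N}\,\bar{\ell}_{\mathrm{maj}}
  + \frac{n_{\mathrm{min}}}{N}\,\bar{\ell}_{\mathrm{min}},
\qquad
\frac{n_{\mathrm{min}}}{N} = \frac{1}{1+\rho},
\quad
\rho = \frac{n_{\mathrm{maj}}}{n_{\mathrm{min}}}
```

Under these assumptions, an imbalance ratio of rho = 99 gives the minority term a weight of 0.01, so the overall loss can fall substantially while the minority mean loss stays high. This is consistent with the early underfitting described above, though it does not by itself establish it.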

What carries the argument

Systematic monitoring of learning patterns on majority versus minority classes in datasets with controlled varying imbalance ratios.
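
A minimal sketch of what per-class monitoring could look like, assuming per-sample losses are available at the end of each epoch. The function name `per_class_mean` and the toy numbers are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def per_class_mean(values, labels):
    """Average a per-sample quantity (loss, or 0/1 correctness) within each class."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, label in zip(values, labels):
        sums[label] += value
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in counts}


# Hypothetical per-sample losses from one epoch (illustrative values only).
# Label 0 is the majority class, label 1 the minority class.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
losses = [0.1, 0.2, 0.1, 0.2, 0.1, 0.2, 0.1, 0.2, 2.0, 3.0]

by_class = per_class_mean(losses, labels)
overall = sum(losses) / len(losses)

# The overall mean hides the gap between the two per-class means.
print("per-class mean loss:", by_class)
print("overall mean loss:", overall)
```

Logging `by_class` once per epoch, rather than only `overall`, is what would expose a minority class that is still poorly fit while the aggregate loss already looks low.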

If this is right

  • DNNs on imbalanced data will exhibit delayed and ineffective learning of minority classes compared to balanced cases.
  • Minority class representations learned under imbalance will show poor test-phase generalization due to loss-driven overfitting rather than pattern capture.
  • Standard training without imbalance correction will preferentially acquire majority class knowledge first.
  • Imbalance-handling methods must target both the initial underfitting and the subsequent non-generalizable fitting of minority samples.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Imbalance-correction techniques may need to act in the earliest epochs to prevent the initial underfitting phase.
  • The pattern could be tested on additional model architectures or domains to check if the dynamics are architecture-specific.
  • Controlling for dataset size and feature properties in follow-up experiments would strengthen attribution to imbalance alone.

Load-bearing premise

Differences in learning patterns across datasets can be attributed primarily to class imbalance rather than to other factors such as dataset size, feature distributions, or hyperparameter choices.

What would settle it

A controlled experiment showing that minority-class test accuracy remains high and matches training accuracy on imbalanced data without any rebalancing technique would contradict the claim of non-generalizable overfitting.
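
The check described above can be sketched as a per-class gap between training and test accuracy. The helper names and the toy predictions below are hypothetical; they only illustrate the comparison, not any result from the paper.

```python
def class_accuracy(y_true, y_pred, cls):
    """Accuracy restricted to the samples whose true label is `cls`."""
    hits = [pred == true for true, pred in zip(y_true, y_pred) if true == cls]
    if not hits:
        raise ValueError(f"no samples with label {cls!r}")
    return sum(hits) / len(hits)


def generalization_gap(train_true, train_pred, test_true, test_pred, cls):
    """Train accuracy minus test accuracy for one class (larger = worse generalization)."""
    return (class_accuracy(train_true, train_pred, cls)
            - class_accuracy(test_true, test_pred, cls))


# Illustrative labels and predictions: 0 = majority, 1 = minority.
train_true = [0] * 8 + [1] * 4
train_pred = [0] * 8 + [1] * 4        # every training sample is fit
test_true = [0] * 8 + [1] * 4
test_pred = [0] * 8 + [0, 0, 0, 1]    # only one minority test sample is recovered

gap_majority = generalization_gap(train_true, train_pred, test_true, test_pred, 0)
gap_minority = generalization_gap(train_true, train_pred, test_true, test_pred, 1)
print("majority gap:", gap_majority)
print("minority gap:", gap_minority)
```

A minority gap near zero on imbalanced data, without any rebalancing, would be the kind of evidence that contradicts the non-generalizable overfitting claim; a large gap would be consistent with it.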

Original abstract

Class imbalance in deep neural networks (DNNs) has witnessed a rapid increase in research attention in recent years. However, the varying accounts of the reasons behind the poor performance of DNN on imbalance data in pertinent literature shows that little is known about how this agelong phenomenon impacts the performance of DNNs. A better understanding of this problem is crucial to developing effective DNN-based imbalance methods. Thus, this study systematically investigates the impact of class imbalance on the learning dynamics of DNN by monitoring the learning pattern of DNN models on both the majority and minority classes of datasets of varying imbalance ratios. Experimental findings shows that as against learning from balanced datasets where DNN learns the classes similarly, class imbalance has severe deteriorating impact on the performance of DNN, driving the model to underfit the minority class samples in the early training epochs while simultaneously learning only the majority class. Although DNN ultimately learns the minority samples, learning in this manner only results in learnt minority representations that are non-generalizable at test phase because they are merely overfitted to keep the overall training loss as low as possible.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript claims that class imbalance severely deteriorates DNN performance by driving the model to underfit minority-class samples in early training epochs while learning only the majority class. Although the network eventually learns the minority samples, the resulting representations are non-generalizable at test time because they are merely overfitted to minimize overall training loss. This contrasts with balanced datasets, where classes are learned similarly. The investigation relies on monitoring per-class learning patterns across datasets with varying imbalance ratios.

Significance. If the empirical observations are substantiated with proper isolation of the imbalance ratio and full experimental details, the work could supply useful intuition about why imbalance harms generalization and thereby guide the design of imbalance-handling techniques. The manuscript contains no equations, derivations, machine-checked proofs, or reproducible code, so its contribution rests solely on the quality and controls of the reported experiments.

major comments (2)
  1. [Abstract] The abstract states experimental findings but supplies no datasets, architectures, training protocols, quantitative metrics, or controls, so it is impossible to verify whether the data actually support the stated claim about underfitting and non-generalizable representations.
  2. The central claim requires that varying imbalance ratios (while monitoring per-class learning) isolates the effect of the ratio itself. The manuscript does not indicate controls such as subsampling the majority class to hold total sample size N fixed or matching class-conditional distributions across ratios. Without these, early underfitting of the minority class and late overfitting could arise from fewer total examples or shifted data statistics rather than the imbalance ratio.
minor comments (1)
  1. Grammatical issues: 'Experimental findings shows' should be 'show'. 'agelong phenomenon' is nonstandard; consider 'longstanding phenomenon'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments highlight important aspects of experimental clarity and controls that we will address to strengthen the manuscript. We respond to each major comment below.

Point-by-point responses
  1. Referee: [Abstract] The abstract states experimental findings but supplies no datasets, architectures, training protocols, quantitative metrics, or controls, so it is impossible to verify whether the data actually support the stated claim about underfitting and non-generalizable representations.

    Authors: We agree that the abstract, as a high-level summary, omits these specifics and could be improved for standalone readability. In the revision we will add a concise statement noting the primary datasets (CIFAR-10 variants with controlled imbalance ratios), architectures (ResNet family), and metrics (per-class loss and accuracy trajectories). Full protocols remain detailed in Section 3; the change is limited to the abstract. revision: yes

  2. Referee: The central claim requires that varying imbalance ratios (while monitoring per-class learning) isolates the effect of the ratio itself. The manuscript does not indicate controls such as subsampling the majority class to hold total sample size N fixed or matching class-conditional distributions across ratios. Without these, early underfitting of the minority class and late overfitting could arise from fewer total examples or shifted data statistics rather than the imbalance ratio.

    Authors: The concern about isolating the imbalance ratio is well-founded. Our reported experiments construct imbalance by subsampling minority classes while holding the majority class size fixed, which necessarily alters total N. To directly address this, the revised manuscript will include additional controlled experiments that keep total sample size N constant across imbalance ratios (by also subsampling the majority class) and will explicitly confirm that all variants are drawn from the same underlying class-conditional distributions. These new results will be reported alongside the original findings. revision: yes
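
The fixed-N control discussed in the second response can be sketched as follows, assuming a binary problem with index pools for each class. The function names, pool sizes, and ratios are hypothetical illustrations and do not describe the authors' actual protocol.

```python
import random


def fixed_n_counts(total_n, imbalance_ratio):
    """Split total_n into (n_majority, n_minority) with n_majority / n_minority close to the ratio."""
    n_min = max(1, round(total_n / (1 + imbalance_ratio)))
    return total_n - n_min, n_min


def subsample_fixed_n(majority_idx, minority_idx, total_n, imbalance_ratio, seed=0):
    """Draw both classes from the same pools so only the ratio changes, not N."""
    n_maj, n_min = fixed_n_counts(total_n, imbalance_ratio)
    if n_maj > len(majority_idx) or n_min > len(minority_idx):
        raise ValueError("pool too small for the requested total_n and ratio")
    rng = random.Random(seed)  # seeded for repeatable draws
    return rng.sample(majority_idx, n_maj), rng.sample(minority_idx, n_min)


# Hypothetical index pools standing in for two classes of one dataset.
majority_pool = list(range(0, 5000))
minority_pool = list(range(5000, 10000))

for ratio in (1, 9, 99):
    maj, mino = subsample_fixed_n(majority_pool, minority_pool,
                                  total_n=1000, imbalance_ratio=ratio)
    print(ratio, len(maj), len(mino), len(maj) + len(mino))
```

Because every variant is drawn from the same two pools and sums to the same N, differences in per-class learning curves across ratios are less likely to come from dataset size or shifted class-conditional statistics. The minority count still shrinks as the ratio grows, so this design does not separate the ratio from the absolute number of minority samples.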

Circularity Check

0 steps flagged

No derivation chain; purely observational with no equations or predictions.

full rationale

The paper reports experimental observations on DNN training dynamics under varying imbalance ratios but contains no equations, derivations, fitted parameters presented as predictions, or self-citation chains supporting a mathematical claim. All load-bearing statements are empirical findings from monitoring per-class learning patterns. No step reduces a claimed result to its inputs by construction. The absence of any formal derivation makes circularity analysis inapplicable; the reader's score of 1.0 is consistent with this.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review contains no explicit free parameters, axioms, or invented entities; the central claim rests on the unstated premise that the monitored learning patterns are diagnostic of imbalance effects.


