Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning

Christos K. Aridas; Fernando Nogueira; Guillaume Lemaitre

arxiv: 1609.06570 · v1 · pith:SQHOWWDLnew · submitted 2016-09-21 · 💻 cs.LG


Guillaume Lemaitre, Fernando Nogueira, Christos K. Aridas

classification 💻 cs.LG

keywords: toolbox, imbalanced-learn, learning, methods, github, imbalanced, machine, python


Imbalanced-learn is an open-source Python toolbox that provides a wide range of methods for coping with imbalanced datasets, a problem frequently encountered in machine learning and pattern recognition. The implemented state-of-the-art methods fall into four groups: (i) under-sampling, (ii) over-sampling, (iii) combinations of over- and under-sampling, and (iv) ensemble learning methods. The toolbox depends only on numpy, scipy, and scikit-learn and is distributed under the MIT license. Furthermore, it is fully compatible with scikit-learn and is part of the scikit-learn-contrib supported projects. Documentation, unit tests, and integration tests are provided to ease usage and contribution. The toolbox is publicly available on GitHub: https://github.com/scikit-learn-contrib/imbalanced-learn.
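
To make the first two method groups concrete, here is a minimal standard-library sketch of random over-sampling and random under-sampling. It is an illustration only and does not use or reproduce the toolbox's API: the function names `random_over_sample` and `random_under_sample`, the `seed` parameter, and the toy data are all assumptions made for this example. The toolbox itself implements these and more sophisticated variants.

```python
import random
from collections import Counter


def random_over_sample(X, y, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest class.

    Hypothetical helper for illustration, not part of imbalanced-learn.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        indices = [i for i, lab in enumerate(y) if lab == label]
        # Draw with replacement from this class until it reaches the target size.
        for _ in range(target - count):
            i = rng.choice(indices)
            X_out.append(X[i])
            y_out.append(label)
    return X_out, y_out


def random_under_sample(X, y, seed=0):
    """Keep a random subset of each class, sized to the smallest class.

    Hypothetical helper for illustration, not part of imbalanced-learn.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    keep = []
    for label in counts:
        indices = [i for i, lab in enumerate(y) if lab == label]
        # Sample without replacement so no retained sample is duplicated.
        keep.extend(rng.sample(indices, target))
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data (assumed): 8 majority samples (label 0) and 2 minority samples (label 1).
X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2

X_over, y_over = random_over_sample(X, y)
X_under, y_under = random_under_sample(X, y)
print(Counter(y_over))   # each class now has 8 samples
print(Counter(y_under))  # each class now has 2 samples
```

Over-sampling keeps every original sample at the cost of duplicates, while under-sampling discards majority samples. Combining the two, group (iii) in the abstract, trades off between those costs.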


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Stellar flare detection in XMM-Newton with gradient boosted trees
astro-ph.HE 2025-09 conditional novelty 5.0

A gradient boosted classifier on X-ray light curve features detects stellar flares at 97.1% test accuracy and generates the largest public catalog of such events.

AdaDec: A Uncertainty-Guided Lookahead Decoding Framework for LLM-Based Code Generation
cs.SE 2025-06 unverdicted novelty 5.0

AdaDec improves Pass@1 accuracy of LLM code generation by up to 20.9% over greedy decoding by triggering lookahead reranking only at high-uncertainty steps on HumanEval+, MBPP+, and DevEval.