Ensemble Learning for Healthcare: A Comparative Analysis of Hybrid Voting and Ensemble Stacking in Obesity Risk Prediction

Md Sumon Ali; Towhidul Islam

arXiv: 2509.02826 · v2 · submitted 2025-09-02 · cs.LG · cs.AI · stat.AP · stat.CO

Pith reviewed 2026-05-18 19:03 UTC · model grok-4.3

classification cs.LG · cs.AI · stat.AP · stat.CO

keywords ensemble learning · obesity risk prediction · stacking · majority voting · machine learning · healthcare prediction · base learners · comparative analysis

The pith

Ensemble stacking outperforms hybrid majority voting for obesity risk prediction, especially on complex datasets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests hybrid majority voting against ensemble stacking for predicting obesity risk on two health datasets. It selects the top three models from nine algorithms after fifty hyperparameter configurations, applies balancing and outlier detection, then builds majority hard voting, weighted hard voting, and stacking with a multi-layer perceptron meta-classifier. Stacking matches or exceeds the voting methods, showing its clearest advantage on the dataset with more intricate patterns. A sympathetic reader would care because improved risk models could support earlier interventions for a condition strongly linked to diabetes, heart disease, and cancer. The work positions stacking as the stronger option when data complexity rises while treating voting as a reliable simpler choice.

Core claim

On Dataset-1 weighted hard voting and stacking both reached accuracy 0.920304 and F1-score near 0.920, outperforming majority hard voting. On Dataset-2 stacking achieved accuracy 0.989837 and F1 0.989825, beating majority hard voting at accuracy 0.981707 while weighted hard voting performed worst. The results establish that stacking supplies stronger predictive capability for complex data distributions, with hybrid majority voting remaining a robust alternative.
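To put the Dataset-2 margin in perspective, the short calculation below works only from the figures reported in the paper's abstract. It is a back-of-the-envelope sketch: the variable names are illustrative, and the relative error reduction is a derived quantity that the paper itself does not report.

```python
# Dataset-2 figures as reported in the paper's abstract.
reported = {
    "stacking": {"accuracy": 0.989837, "f1": 0.989825},
    "majority_hard_voting": {"accuracy": 0.981707, "f1": 0.981675},
}


def gap(metric):
    """Absolute difference (stacking minus majority hard voting) for a metric."""
    return reported["stacking"][metric] - reported["majority_hard_voting"][metric]


accuracy_gap = gap("accuracy")
f1_gap = gap("f1")

# Derived quantity (not in the paper): share of voting's errors removed by stacking.
error_voting = 1.0 - reported["majority_hard_voting"]["accuracy"]
error_stacking = 1.0 - reported["stacking"]["accuracy"]
relative_error_reduction = (error_voting - error_stacking) / error_voting

print(f"accuracy gap: {accuracy_gap:.6f}")  # 0.008130
print(f"F1 gap:       {f1_gap:.6f}")        # 0.008150
print(f"relative error reduction: {relative_error_reduction:.3f}")
```

The absolute gap is under one percentage point, although it corresponds to a sizeable share of the remaining errors because both models are already close to the ceiling.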

What carries the argument

Ensemble construction from the top three base learners chosen from nine machine learning algorithms, assembled either as hybrid hard voting (majority or weighted) or as stacking with a multi-layer perceptron meta-classifier.
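The three ensemble variants named above can be sketched with scikit-learn, which the paper's reference list indicates was used. This is a minimal illustration under stated assumptions, not the authors' pipeline: the synthetic data, the choice of three base learners, the voting weights, the MLP layer size, and the weighted F1 average are all placeholders chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data; the paper's obesity datasets are not reproduced here.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def base_learners():
    """Hypothetical 'top three' base learners (the paper's actual picks may differ)."""
    return [
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ]


ensembles = {
    # Each base learner casts one vote; the most common label wins.
    "majority_hard_voting": VotingClassifier(base_learners(), voting="hard"),
    # Votes are multiplied by per-model weights (placeholder values here).
    "weighted_hard_voting": VotingClassifier(
        base_learners(), voting="hard", weights=[2, 2, 1]
    ),
    # An MLP meta-classifier is trained on out-of-fold base-learner outputs.
    "stacking_mlp": StackingClassifier(
        base_learners(),
        final_estimator=MLPClassifier(
            hidden_layer_sizes=(32,), max_iter=1000, random_state=0
        ),
        cv=5,
    ),
}

scores = {}
for name, model in ensembles.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = (
        accuracy_score(y_test, pred),
        f1_score(y_test, pred, average="weighted"),
    )
    print(f"{name}: accuracy={scores[name][0]:.4f}, F1={scores[name][1]:.4f}")
```

The structural difference is that voting combines predicted labels with a fixed rule, whereas stacking learns the combination from the base learners' cross-validated outputs.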

If this is right

Stacking is preferable when obesity data exhibits complex distributions.
Hybrid majority voting serves as a dependable lower-complexity option.
Tuning and selecting multiple base learners before ensembling improves reliability for healthcare tasks.
The comparative results can inform model choice in other medical prediction settings.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same stacking preference may appear in risk models for related conditions such as diabetes.
Real-world clinical streams with missing values could narrow or widen the observed performance gap.
Varying the meta-classifier beyond the multi-layer perceptron might further optimize stacking results.

Load-bearing premise

Selecting the top three models after evaluating fifty hyperparameter configurations plus dataset balancing and outlier detection produces an unbiased comparison of the two ensemble approaches without selection effects or data artifacts.

What would settle it

On a new obesity dataset processed identically, stacking fails to match or exceed the accuracy and F1-score of the voting ensembles.

Original abstract

Obesity is a critical global health issue driven by dietary, physiological, and environmental factors, and is strongly associated with chronic diseases such as diabetes, cardiovascular disorders, and cancer. Machine learning has emerged as a promising approach for early obesity risk prediction, yet a comparative evaluation of ensemble techniques -- particularly hybrid majority voting and ensemble stacking -- remains limited. This study aims to compare hybrid majority voting and ensemble stacking methods for obesity risk prediction, identifying which approach delivers higher accuracy and efficiency. The analysis seeks to highlight the complementary strengths of these ensemble techniques in guiding better predictive model selection for healthcare applications. Two datasets were utilized to evaluate three ensemble models: Majority Hard Voting, Weighted Hard Voting, and Stacking (with a Multi-Layer Perceptron as meta-classifier). A pool of nine Machine Learning (ML) algorithms, evaluated across a total of 50 hyperparameter configurations, was analyzed to identify the top three models to serve as base learners for the ensemble methods. Preprocessing steps involved dataset balancing, and outlier detection, and model performance was evaluated using Accuracy and F1-Score. On Dataset-1, weighted hard voting and stacking achieved nearly identical performance (Accuracy: 0.920304, F1: 0.920070), outperforming majority hard voting. On Dataset-2, stacking demonstrated superior results (Accuracy: 0.989837, F1: 0.989825) compared to majority hard voting (Accuracy: 0.981707, F1: 0.981675) and weighted hard voting, which showed the lowest performance. The findings confirm that ensemble stacking provides stronger predictive capability, particularly for complex data distributions, while hybrid majority voting remains a robust alternative.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

Stacking shows a modest edge over voting methods on these obesity datasets, but the top-three selection after 50 hyperparameter runs on the same data likely inflates the reported gains.

The main takeaway is that this is a clean applied comparison of three ensemble setups on two obesity datasets, with stacking pulling ahead on the second one by a small margin. The numbers are reported plainly and the preprocessing steps are standard for health data work. Beyond that, the paper does not introduce new algorithms or theory, just runs the usual suspects through the usual pipeline and notes which one wins on these particular tables. That is useful for someone who needs a quick reference on how these methods behave in a tabular medical prediction task, but it will not shift the broader literature on ensembles. The authors credit the base learners and meta-learner choices without overclaiming.

The soft spot sits in the model selection step. Nine algorithms were ranked across fifty hyperparameter configurations and the top three were locked in before the final ensemble runs. Because that ranking appears to use the same data regime later used for the headline accuracy and F1 figures, any leakage or optimistic bias in the selection carries straight into the comparison. The gaps are small, especially the 0.008 difference on Dataset-2, and without nested cross-validation or a separate selection hold-out it is hard to attribute the difference cleanly to stacking rather than to which three models were allowed to compete. Error bars or a statistical test would have helped separate signal from selection artifact.

This paper is for applied researchers who build risk models on similar health datasets and want to see how voting versus stacking plays out in practice. Readers already comfortable with ensembles will not learn much new, but the concrete side-by-side on two datasets gives a usable data point. I would bring it to a reading group to talk through the validation design. I would not cite it in my own work. It still deserves peer review because the empirical setup is transparent enough that referees can usefully push on the selection procedure and ask for tighter controls.

Referee Report

1 major / 2 minor

Summary. The manuscript compares three ensemble methods—Majority Hard Voting, Weighted Hard Voting, and Stacking (MLP meta-learner)—for obesity risk prediction on two datasets. Nine base ML algorithms are evaluated across 50 hyperparameter configurations to select the top three as base learners. After preprocessing (balancing and outlier detection), Accuracy and F1-Score are reported: on Dataset-1, weighted voting and stacking reach ~0.9203 accuracy; on Dataset-2, stacking reaches 0.9898 accuracy and outperforms the voting variants. The authors conclude that stacking provides stronger predictive capability for complex data distributions while majority voting remains a robust alternative.

Significance. If the comparative results hold under unbiased evaluation, the work offers practical empirical guidance on ensemble selection for healthcare risk prediction tasks. The use of two datasets and concrete numeric results (Accuracy/F1) is a strength, but the absence of error bars, statistical tests, or nested validation limits the strength of claims about superiority for 'complex data distributions.' The contribution is incremental rather than foundational.

major comments (1)

[Methods / Experimental Setup] The model selection pipeline (evaluation of nine algorithms over 50 hyperparameter configurations to choose the top three base learners for all ensembles) is performed on the same data regime later used to report final Accuracy/F1 on Dataset-1 (0.9203) and Dataset-2 (0.9898 for stacking). This introduces selection bias that directly inflates the headline numbers and prevents clean attribution of the observed gap (especially the ~0.008 difference on Dataset-2) to the ensemble method itself rather than to which models were permitted to participate. Nested cross-validation or a held-out selection set is required to support the central claim.

minor comments (2)

[Data Description] Dataset-1 and Dataset-2 are referenced only by number; their sources, sizes, feature counts, and class distributions should be stated explicitly in the data section for reproducibility.
[Results] No standard deviations, confidence intervals, or statistical significance tests accompany the reported Accuracy and F1 values; adding these would strengthen the comparison between ensembles.
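One way to supply the interval estimate requested in the second minor comment is a paired bootstrap over test-set predictions. The sketch below is illustrative only: the function name, the resample count, and the toy labels are assumptions, and the toy predictions are synthetic rather than taken from the paper.

```python
import random


def paired_bootstrap_accuracy_gap(y_true, pred_a, pred_b,
                                  n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for accuracy(A) - accuracy(B).

    The same resampled indices are used for both models, so the interval
    reflects the paired difference on a shared test set.
    """
    rng = random.Random(seed)
    n = len(y_true)
    correct_a = [int(p == t) for p, t in zip(pred_a, y_true)]
    correct_b = [int(p == t) for p, t in zip(pred_b, y_true)]

    gaps = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        gaps.append(sum(correct_a[i] - correct_b[i] for i in idx) / n)
    gaps.sort()

    lower = gaps[int((alpha / 2) * n_resamples)]
    upper = gaps[int((1 - alpha / 2) * n_resamples) - 1]
    observed = (sum(correct_a) - sum(correct_b)) / n
    return observed, lower, upper


# Synthetic toy data: model A errs on 2 of 200 samples, model B on 4.
y_true = [i % 2 for i in range(200)]
pred_a = list(y_true)
pred_b = list(y_true)
for i in (0, 1):
    pred_a[i] = 1 - pred_a[i]
for i in (0, 1, 2, 3):
    pred_b[i] = 1 - pred_b[i]

observed, lower, upper = paired_bootstrap_accuracy_gap(y_true, pred_a, pred_b)
print(f"observed gap={observed:.4f}, 95% interval=[{lower:.4f}, {upper:.4f}]")
```

If the resulting interval includes zero, the observed gap cannot be separated from sampling noise on that test set. McNemar's test is a common alternative for the same paired setting.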

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We appreciate the emphasis on methodological rigor in the experimental setup. Below we provide a point-by-point response to the major comment, explaining our position and outlining the revisions we will make.

Referee: [Methods / Experimental Setup] The model selection pipeline (evaluation of nine algorithms over 50 hyperparameter configurations to choose the top three base learners for all ensembles) is performed on the same data regime later used to report final Accuracy/F1 on Dataset-1 (0.9203) and Dataset-2 (0.9898 for stacking). This introduces selection bias that directly inflates the headline numbers and prevents clean attribution of the observed gap (especially the ~0.008 difference on Dataset-2) to the ensemble method itself rather than to which models were permitted to participate. Nested cross-validation or a held-out selection set is required to support the central claim.

Authors: We agree that conducting the base-learner selection and hyperparameter search on the same data later used for final reporting can introduce selection bias and produce somewhat optimistic absolute performance figures. This is a legitimate methodological concern. At the same time, because the identical selection procedure (nine algorithms, 50 configurations, top-three base learners) was applied uniformly to all three ensemble methods, the relative comparisons between Majority Hard Voting, Weighted Hard Voting, and Stacking remain internally consistent and are not confounded by differential model selection. The performance gap observed on Dataset-2, where stacking reaches 0.9898 accuracy while the voting variants are lower, can therefore still be attributed to the ensemble strategy itself. To fully address the referee’s point and strengthen the claims, we will revise the manuscript to adopt nested cross-validation: an outer loop for unbiased performance estimation and an inner loop for model selection and hyperparameter tuning. Updated results, methodology description, and any changes to the reported numbers will be included in the revised version.

revision: yes
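The nested cross-validation design described in the response (inner loop for tuning, outer loop for estimation) can be sketched as follows. This is a simplified illustration, not the authors' revised procedure: it tunes a single random forest on synthetic data, and the parameter grid and fold counts are placeholders. A full version would also place the top-three base-learner selection and the ensemble fit inside the inner loop.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Synthetic stand-in data; not the paper's datasets.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Inner loop: chooses hyperparameters using only the outer-training folds.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
# Outer loop: scores the tuned model on folds never seen during tuning.
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}  # placeholder grid
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=inner_cv,
    scoring="accuracy",
)

# Passing the search object as the estimator reruns tuning inside every outer fold.
outer_scores = cross_val_score(search, X, y, cv=outer_cv, scoring="accuracy")
print(f"nested CV accuracy: {outer_scores.mean():.4f} +/- {outer_scores.std():.4f}")
```

Because tuning is repeated within each outer training split, the outer-fold scores are not influenced by the data used to pick the configuration, and their spread gives a rough error bar.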

Circularity Check

0 steps flagged

No significant circularity in empirical ensemble comparison

The paper reports direct experimental results from training nine base ML models across 50 hyperparameter configurations on two obesity datasets, selecting the top three, and measuring Accuracy/F1 for Majority Hard Voting, Weighted Hard Voting, and Stacking ensembles. These are standard held-out performance metrics with no mathematical derivation, first-principles claim, or quantity that reduces by construction to the selection step or fitted inputs. No self-citations, uniqueness theorems, or ansatzes are invoked in a load-bearing way. The study is self-contained empirical benchmarking without logical loops.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claims rest on standard supervised learning assumptions and empirical tuning rather than new theoretical constructs; the main added cost is the specific hyperparameter search and preprocessing choices.

free parameters (1)

Hyperparameter configurations for base learners = 50 configurations
Fifty configurations across nine algorithms were evaluated to select the top three base learners for the ensembles.
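The selection step can be pictured as a ranking over (algorithm, configuration, score) records. The sketch below assumes one reading of "top three models", namely the best configuration per algorithm followed by the three highest-scoring algorithms; the paper does not spell out the rule here, and the records are made-up placeholders rather than reported results.

```python
# (algorithm, configuration id, validation score): placeholder values only.
results = [
    ("random_forest", 0, 0.91),
    ("random_forest", 1, 0.93),
    ("svc", 0, 0.88),
    ("svc", 1, 0.90),
    ("knn", 0, 0.85),
    ("gradient_boosting", 0, 0.92),
    ("gradient_boosting", 1, 0.89),
]


def top_k_algorithms(records, k=3):
    """Keep each algorithm's best configuration, then return the k best algorithms."""
    best = {}
    for algo, config_id, score in records:
        if algo not in best or score > best[algo][1]:
            best[algo] = (config_id, score)
    ranked = sorted(best.items(), key=lambda item: item[1][1], reverse=True)
    return [(algo, config_id, score) for algo, (config_id, score) in ranked[:k]]


selected = top_k_algorithms(results)
for algo, config_id, score in selected:
    print(f"{algo} (config {config_id}): {score:.2f}")
```

Whatever rule is used, the scores feeding this ranking need to come from data that is not reused for the final ensemble evaluation; otherwise the selection itself is tuned to the test set.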

axioms (1)

domain assumption: The two datasets are representative of real-world obesity risk factors and suitable for supervised prediction after balancing and outlier removal.
Invoked to justify applying the models to healthcare risk prediction.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

unclear
Relation between the paper passage and the cited Recognition theorem.

A pool of nine Machine Learning (ML) algorithms, evaluated across a total of 50 hyperparameter configurations, was analyzed to identify the top three models to serve as base learners for the ensemble methods.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

