A Unified Three-Stage Machine Learning Framework for Diabetes Detection, Subtype Discrimination, and Cognitive-Metabolic Hypothesis Testing

Rishav Tewari; Ruzina Haque Laskar; Vishal Pandey

arxiv: 2605.13464 · v1 · pith:OIT4KSOSnew · submitted 2026-05-13 · 💻 cs.LG


Pith reviewed 2026-05-14 19:13 UTC · model grok-4.3

classification 💻 cs.LG

keywords: diabetes detection, subtype clustering, cognitive association, machine learning classification, K-Means clustering, SHAP explainability, glycaemic control, metabolic-cognitive link


The pith

A three-stage machine learning framework detects diabetes, clusters subtypes without labels, and links better glycaemic control to higher cognitive scores.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a single reproducible pipeline that first classifies diabetes presence from routine clinical measurements, then partitions confirmed cases into subtypes by clustering on three variables (Glucose, Insulin, and Age), and finally tests whether metabolic control correlates with cognitive performance in longitudinal data. It benchmarks multiple classifiers with cross-validation and feature attribution, applies silhouette-validated K-Means to recover two groups, and reports a positive correlation that survives multiple-testing correction. The work matters because it shows how standard, interpretable machine-learning steps can be chained to move from diagnosis to exploratory subtype analysis and hypothesis testing without requiring new labeled subtype data.

Core claim

The authors report three results. On the NCSU diabetes dataset, under stratified five-fold cross-validation, SVM-RBF and Logistic Regression reach the highest ROC-AUC (0.825 ± 0.026) and Random Forest the highest accuracy (0.762 ± 0.030), with Glucose, BMI, and Age as the dominant SHAP-ranked predictors. K-Means clustering on Glucose, Insulin, and Age splits the diabetic cases into two partitions with a silhouette score of approximately 0.116, which the authors interpret as clinically plausible. In the Ohio dataset (n = 373), glycaemic control shows a significant positive Spearman correlation with cognitive function (ρ_s = 0.208, p = 5.29×10^{-5}) that survives Holm correction.

What carries the argument

The unified three-stage pipeline: supervised classification with SHAP explainability for detection, silhouette-validated K-Means clustering for subtype discrimination, and statistical correlation testing for metabolic-cognitive associations.
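
The Stage 1 benchmarking step can be sketched as follows. This is a minimal illustration assuming scikit-learn and a synthetic dataset in place of the NCSU data, which is not reproduced here. The sample size, feature count, class weights, and default hyperparameters are assumptions for illustration, not the authors' settings, and only two of the benchmarked classifiers are shown.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a clinical table (all sizes are arbitrary assumptions).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5,
    weights=[0.65, 0.35], random_state=0,
)

# Stratified folds keep the class ratio roughly constant in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Scaling lives inside the pipeline so it is refit on each training fold only.
models = {
    "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "log_reg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
}


def benchmark(models, X, y, cv):
    """Return {model name: (mean ROC-AUC, std ROC-AUC)} across the CV folds."""
    results = {}
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
        results[name] = (scores.mean(), scores.std())
    return results


results = benchmark(models, X, y, cv)
for name, (mean, std) in results.items():
    print(f"{name}: ROC-AUC = {mean:.3f} ± {std:.3f}")
```

Reporting the fold mean together with the fold standard deviation mirrors the "0.825 ± 0.026" style used in the abstract; the numbers printed here come from synthetic data and say nothing about the paper's results.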

If this is right

Glucose, BMI, and Age function as the primary predictive biomarkers in the classification stage.
Two subtype partitions can be recovered from diabetic cases using only Glucose, Insulin, and Age without ground-truth labels.
Glycaemic control maintains a positive association with cognitive function after multiple-testing correction.
The combination of cross-validation, feature attribution, and statistical validation supports reproducible diabetes analytics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The clustering approach could be tested on other chronic conditions to identify subtypes from routine lab values alone.
Linking the recovered clusters to longitudinal complication data would test whether they carry predictive value for personalized management.
Adding additional metabolic or imaging features might raise the silhouette score and clarify whether the observed cognitive association strengthens or attenuates.

Load-bearing premise

A low silhouette score of approximately 0.116 still marks clinically plausible subtype partitions, and the NCSU and Ohio datasets are representative without major unstated selection biases.
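
To make the premise concrete: the silhouette coefficient ranges from -1 to 1, and values near 0 indicate clusters that overlap heavily. The sketch below, assuming scikit-learn, shows how a silhouette-by-k curve of the kind described for Stage 2 could be computed. The data are two deliberately overlapping synthetic groups standing in for three standardized clinical columns; they are hypothetical and are not the paper's cohort.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical stand-in for three clinical features: two overlapping groups.
X = np.vstack([
    rng.normal(0.0, 1.0, size=(150, 3)),
    rng.normal(1.0, 1.0, size=(150, 3)),
])
# K-Means is distance based, so features are standardized first.
X_scaled = StandardScaler().fit_transform(X)


def silhouette_by_k(X, k_values, seed=0):
    """Fit K-Means for each k and return {k: mean silhouette score}."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return scores


scores = silhouette_by_k(X_scaled, range(2, 7))
for k, s in scores.items():
    print(f"k={k}: silhouette = {s:.3f}")
```

Choosing k by the highest silhouette only ranks the candidate values of k against each other; it does not by itself show that the best partition is well separated in absolute terms.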

What would settle it

A replication dataset in which the same three features produce K-Means clusters that show no difference in independent clinical outcomes such as complication rates or treatment response would falsify the subtype stage.
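
One way such a check could be run is a permutation test on the difference in outcome rates between the two clusters. The sketch below uses only the Python standard library. It is an illustrative technique chosen here, not an analysis described in the paper, and the cluster labels and binary outcomes are hypothetical.

```python
import random


def rate_difference(outcomes, labels):
    """Absolute difference in outcome rate between cluster 0 and cluster 1."""
    g0 = [o for o, c in zip(outcomes, labels) if c == 0]
    g1 = [o for o, c in zip(outcomes, labels) if c == 1]
    return abs(sum(g0) / len(g0) - sum(g1) / len(g1))


def permutation_p_value(outcomes, labels, n_perm=5000, seed=0):
    """Estimate how often shuffled cluster labels give a gap at least as large."""
    rng = random.Random(seed)
    observed = rate_difference(outcomes, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any link between cluster and outcome
        if rate_difference(outcomes, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical cohort: 50 cases per cluster, binary outcome (1 = complication).
labels = [0] * 50 + [1] * 50
outcomes_same = [1, 0] * 50                                # 50% rate in both clusters
outcomes_diff = [1] * 40 + [0] * 10 + [1] * 10 + [0] * 40  # 80% versus 20%

p_same = permutation_p_value(outcomes_same, labels)
p_diff = permutation_p_value(outcomes_diff, labels)
print(f"no difference: p = {p_same:.4f}; large difference: p = {p_diff:.4f}")
```

A large p-value on an independent outcome, as in the first case, is the pattern that would count against the subtype stage.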

Figures

Figures reproduced from arXiv: 2605.13464 by Rishav Tewari, Ruzina Haque Laskar, Vishal Pandey.

**Figure 1.** Three-stage unified pipeline. Stage 1 performs binary diabetes detection with cross-validated supervised classifiers and SHAP explainability. Stage 2 applies validated K-Means clustering to the diabetic sub-cohort for T1DM/T2DM discrimination. Stage 3 conducts statistical hypothesis testing on the Ohio longitudinal cohort to probe the T3DM glycaemic-cognitive link.

**Figure 2.** Stage 1 test-set evaluation. Left: confusion matrix for SVM-RBF on the held-out test set. Right: ROC curve with AUC = 0.80.

**Figure 3.** SHAP beeswarm plot, Random Forest (Stage 1). Each dot represents one test-set instance. Colour encodes feature value (red = high, blue = low). Horizontal position encodes SHAP value (positive = pushes prediction towards diabetic class). Features are ranked by mean |SHAP| in descending order.

**Figure 4.** K-Means silhouette validation curve. k=2 achieves a silhouette score of ≈ 0.116, consistent with moderate cluster structure. k=4 is a local maximum but lacks clinical interpretability.

Original abstract

Diabetes mellitus affects over 537 million adults worldwide and remains a major challenge in preventive healthcare. Existing machine-learning studies primarily formulate diabetes prediction as a binary classification problem, while subtype-oriented analysis and glycaemic-cognitive associations remain comparatively underexplored. We present a reproducible three-stage machine learning framework for diabetes detection, subtype-oriented clustering, and metabolic-cognitive association analysis. In Stage 1, five supervised classifiers together with a stacking ensemble are benchmarked on the NCSU Diabetes Dataset using stratified five-fold cross-validation and evaluation metrics including ROC-AUC, balanced accuracy, recall, and F1-score. SVM-RBF and Logistic Regression achieve the highest ROC-AUC ($0.825 \pm 0.026$), while Random Forest achieves the highest accuracy ($0.762 \pm 0.030$). SHAP explainability identifies Glucose, BMI, and Age as the dominant predictive biomarkers. In Stage 2, silhouette-validated K-Means clustering ($k=2$, silhouette $\approx 0.116$) is applied to confirmed diabetic cases using Glucose, Insulin, and Age, recovering clinically plausible subtype-oriented partitions without requiring ground-truth subtype labels. In Stage 3, statistical analysis of the Ohio Longitudinal Cognitive Dataset ($n=373$) reveals a significant positive association between glycaemic control and cognitive function ($\rho_s = 0.208$, $p = 5.29 \times 10^{-5}$), which survives Holm correction. The findings support the utility of statistically grounded and interpretable ML pipelines for reproducible diabetes analytics and subtype-aware exploratory analysis.
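
The Stage 3 statistics named in the abstract, Spearman rank correlation and Holm correction, can be illustrated with a short standard-library sketch. The toy vectors are hypothetical. In the Holm example, the first p-value is the one reported in the abstract, while the two companion p-values and the family size of three are assumptions, because the abstract does not state how many tests were corrected.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def holm_reject(p_values, alpha=0.05):
    """Holm step-down: compare sorted p-values to alpha/m, alpha/(m-1), ..."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for step, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - step):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject


rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])  # hypothetical toy data
decisions = holm_reject([5.29e-5, 0.03, 0.20])         # last two are assumed
print(rho, decisions)
```

With three tests the smallest p-value is compared against 0.05/3, so 5.29×10^{-5} is rejected, while the assumed 0.03 fails its threshold of 0.05/2 and stops the procedure.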

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a clean but incremental application of standard ML tools to diabetes data, with the clustering stage resting on weak separation.


The paper chains three standard stages on public datasets: supervised classification for diabetes detection, K-Means on glucose/insulin/age for subtypes, and Spearman correlation for cognitive links. Nothing is methodologically new, but the authors execute the pipeline transparently and report concrete numbers from stratified five-fold cross-validation plus a p-value that survives correction. Stage one benchmarks five classifiers plus stacking and uses SHAP to flag glucose, BMI, and age as top features, which is useful for anyone wanting a reproducible template. Stage three finds a modest positive association (rho 0.208) on the Ohio longitudinal set that looks statistically grounded. The main weakness is stage two. The silhouette score of roughly 0.116 for k=2 indicates substantial overlap, yet the abstract still calls the partitions clinically plausible without external validation, comparison to known subtypes, or additional cluster metrics. That gap makes the subtype-discrimination claim the least supported part of the story. The work is aimed at applied researchers building practical diabetes analytics pipelines rather than theorists seeking new methods. A reader who needs a worked example of multi-stage analysis on real medical data will find the numbers and the explainability step helpful. It deserves peer review because the methods are standard, the results are quantified, and the reproducibility steps are in place, even if revisions will be needed on the clustering interpretation.

Referee Report

1 major / 2 minor

Summary. The manuscript presents a three-stage machine learning framework for diabetes detection using supervised classifiers on the NCSU dataset, subtype discrimination via K-Means clustering on diabetic cases, and analysis of glycaemic-cognitive associations in the Ohio dataset. It reports performance metrics from cross-validation, identifies key biomarkers via SHAP, claims clinically plausible subtypes from clustering with silhouette score ≈0.116, and a significant correlation (ρ_s = 0.208, p = 5.29×10^{-5}) surviving correction.

Significance. If the subtype partitions are clinically meaningful, the work offers a reproducible, interpretable pipeline integrating prediction, exploratory subtyping, and hypothesis testing for diabetes research. Strengths include stratified cross-validation, ensemble benchmarking, SHAP explainability, and proper multiple-testing correction, supporting utility for reproducible diabetes analytics.

major comments (1)

[Stage 2] Stage 2 clustering section: K-Means (k=2 on Glucose/Insulin/Age) yields silhouette ≈0.116, indicating weak separation and overlap. This directly undercuts the claim of recovering 'clinically plausible subtype-oriented partitions' without external validation against known subtypes, clinical thresholds, or additional indices (e.g., Davies-Bouldin).

minor comments (2)

[Stage 1] Stage 1 methods lack explicit details on hyperparameter search ranges, missing-value imputation, and exact feature scaling, limiting full reproducibility despite the reported CV protocol.
[Abstract and Stage 2] The abstract and Stage 2 text describe the clustering as 'silhouette-validated' without acknowledging the low absolute value or discussing its implications for partition quality.
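
The reproducibility details requested in the first minor comment (search ranges, imputation, scaling) can all be stated in one object. The sketch below, assuming scikit-learn, shows one way to do that. The median imputation, the grid values, and the synthetic data with injected missing entries are illustrative assumptions, not the authors' configuration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data with about 5% of entries set to missing (assumption).
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Every preprocessing choice is an explicit, named pipeline step.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("clf", SVC(kernel="rbf")),
])

# The search ranges are written out so that they can be reported verbatim.
param_grid = {
    "clf__C": [0.1, 1.0, 10.0],
    "clf__gamma": ["scale", 0.01, 0.1],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    scoring="roc_auc",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Because imputation and scaling sit inside the pipeline, they are refit on each training fold, which avoids leaking validation-fold statistics into preprocessing.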

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the major comment point by point below and will revise the manuscript accordingly.


Referee: [Stage 2] Stage 2 clustering section: K-Means (k=2 on Glucose/Insulin/Age) yields silhouette ≈0.116, indicating weak separation and overlap. This directly undercuts the claim of recovering 'clinically plausible subtype-oriented partitions' without external validation against known subtypes, clinical thresholds, or additional indices (e.g., Davies-Bouldin).

Authors: We agree that a silhouette score of ≈0.116 reflects weak separation and notable overlap, which limits the strength of interpreting the clusters as clinically definitive subtypes. In the revised manuscript we will moderate the language in the abstract, Stage 2 section, and discussion to describe the results as 'exploratory subtype-oriented partitions' rather than 'clinically plausible'. We will also add the Davies-Bouldin index (and Calinski-Harabasz index) to the cluster validation, explicitly discuss the low silhouette as a limitation, and note the lack of external validation against known clinical subtypes or thresholds. These revisions will improve transparency without altering the reported methodology or metrics.

Revision: yes
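
The additional indices named in the rebuttal are available in scikit-learn. The following sketch shows how the three internal validity measures could be reported side by side for a k=2 solution. The data are hypothetical overlapping groups rather than the paper's diabetic sub-cohort, and the helper name `cluster_validity_report` is introduced here for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)


def cluster_validity_report(X, labels):
    """Collect three internal cluster-validity indices for one partition."""
    return {
        "silhouette": silhouette_score(X, labels),                # higher is better
        "davies_bouldin": davies_bouldin_score(X, labels),        # lower is better
        "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
    }


rng = np.random.default_rng(1)
# Hypothetical three-feature data: two groups with substantial overlap.
X = np.vstack([
    rng.normal(0.0, 1.0, size=(120, 3)),
    rng.normal(1.0, 1.0, size=(120, 3)),
])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
report = cluster_validity_report(X, labels)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

All three indices are computed from the same features that produced the clusters, so they describe internal geometry only and do not replace the external validation the referee asks for.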

Circularity Check

0 steps flagged

No significant circularity in three-stage ML framework


The paper applies standard supervised classifiers (with stratified 5-fold CV and metrics like ROC-AUC) to the NCSU dataset in Stage 1, performs K-Means (k=2) clustering with silhouette validation on diabetic cases using Glucose/Insulin/Age in Stage 2, and computes Spearman correlation on the Ohio dataset in Stage 3. No derivation reduces to its inputs by construction, no fitted parameters are renamed as predictions, and no load-bearing self-citations or ansatzes are present. All steps are direct applications of established methods to external data, yielding independent empirical results.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The paper depends on standard statistical assumptions and the validity of the input datasets; no new physical or mathematical entities are postulated.

free parameters (2)

Number of clusters k = 2: Selected based on silhouette score validation in stage 2
Classifier hyperparameters: Tuned for SVM-RBF, Random Forest etc., but values not reported in abstract

axioms (2)

[standard math] Stratified five-fold cross-validation provides unbiased performance estimates: Used in stage 1 for benchmarking classifiers
[domain assumption] The chosen features (Glucose, Insulin, Age) are sufficient for subtype discrimination: Assumed in stage 2 clustering without ground truth labels

pith-pipeline@v0.9.0 · 5601 in / 1396 out tokens · 56994 ms · 2026-05-14T19:13:37.682852+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "silhouette-validated K-Means clustering (k=2, silhouette ≈0.116) ... on Glucose, Insulin, and Age"

Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: "Spearman rank correlation ... ρ_s = 0.208, p = 5.29×10^{-5} ... survives Holm correction"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

17 extracted references · 17 canonical work pages

[1] American Diabetes Association. Standards of medical care in diabetes - 2021. Diabetes Care, 44(Suppl. 1):S1-S232, 2021.
[2] M. A. Atkinson, G. S. Eisenbarth, and A. W. Michels. Type 1 diabetes. The Lancet, 383(9911):69-82, 2014.
[3] S. M. de la Monte and J. R. Wands. Alzheimer's disease is type 3 diabetes - evidence reviewed. Journal of Diabetes Science and Technology, 2(6):1101-1113, 2008.
[4] I. Feinkohl, J. F. Price, M. W. Strachan, and B. M. Frier. The impact of diabetes on cognitive decline: potential vascular, metabolic, and psychosocial risk factors. Alzheimer's & Dementia, 11(8):970-978, 2015.
[5] International Diabetes Federation. IDF Diabetes Atlas, 10th ed. Brussels, Belgium: IDF, 2021.
[6] J. Janson, T. Laedtke, J. E. Parisi, P. O'Brien, and R. C. Petersen. Increased risk of type 2 diabetes in Alzheimer disease. Diabetes, 53(2):474-481, 2004.
[7] S. E. Kahn and M. E. Cooper. Type 2 diabetes, cardiovascular disease, and the mechanism of action of antidiabetic agents. Diabetes Care, 42(12):2237-2246, 2019.
[8] I. Kavakiotis, O. Tsave, A. Salifoglou, N. Maglaveras, I. Vlahavas, and I. Chouvarda. Machine learning and data mining methods in diabetes research. Computational and Structural Biotechnology Journal, 15:104-116, 2017.
[9] J. G. Klann, A. Joss, K. Embree, and S. N. Murphy. Data model harmonization for the All of Us Research Program: transforming i2b2 data into the OMOP common data model. PLOS ONE, 14(2):e0212463, 2019.
[10] S. M. Lundberg and S.-I. Lee. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, volume 30, 2017.
[11] D. S. Marcus, T. H. Wang, J. Parker, J. G. Csernansky, J. C. Morris, and R. L. Buckner. Open access series of imaging studies (OASIS): longitudinal MRI data in nondemented and demented older adults. Journal of Cognitive Neuroscience, 22(12):2677-2684, 2010.
[12] J. Shimpi and Shakkeera. Predictive analysis of type-1 and type-2 diabetes mellitus using machine learning. In Proceedings of the 3rd ICCIP, 2021. Available at https://ssrn.com/abstract=3917810
[13] D. Sisodia and D. S. Sisodia. Prediction of diabetes using classification algorithms. Procedia Computer Science, 132:1578-1585, 2018.
[14] J. W. Smith, J. E. Everhart, W. C. Dickson, W. C. Knowler, and R. S. Johannes. Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Annual Symposium on Computer Application in Medical Care, pages 261-265, 1988.
[15] M. W. Strachan, J. F. Price, and B. M. Frier. Diabetes, cognitive impairment, and dementia. Diabetes Care, 41(11):2509-2518, 2018.
[16] I. Tasin, T. U. Nabil, S. Islam, and R. Khan. Diabetes prediction using machine learning and explainable AI techniques. Healthcare Technology Letters, 10(1-2):1-10, 2023.
[17] N. T. Vagelatos and G. D. Eslick. Type 2 diabetes as a risk factor for Alzheimer's disease: the confounders, interactions, and neuropathology associated with this relationship. Epidemiologic Reviews, 35(1):152-160, 2013.
