Can Breath Biomarkers Causally Influence Blood Glucose? Investigating VOC-Mediated Modulation in Diabetes
Pith reviewed 2026-05-22 08:14 UTC · model grok-4.3
The pith
The paper claims that specific volatile organic compounds (VOCs) in breath causally influence blood glucose levels.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that specific VOCs exert a strong causal influence on glucose levels, as estimated by causal inference techniques applied to observational data. They further claim that machine learning models can reliably distinguish diabetics from non-diabetics and stratify high-risk individuals using these non-invasive markers, supported by a risk-based ranking system and Gaussian Mixture Model clustering.
What carries the argument
Causal inference techniques to estimate the effects of specific breath VOCs on blood glucose levels, integrated with a machine learning classifier and Gaussian Mixture Model for risk stratification and clustering.
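To make the clustering step concrete, here is a minimal sketch of fitting a Gaussian Mixture Model by expectation-maximization. It is not the authors' pipeline: the paper does not state the number of components, the features, or the library used. The sketch assumes a single hypothetical breath feature, two components, and synthetic data; `make_toy_data` and `fit_gmm_1d` are names invented for illustration.

```python
import math
import random

def make_toy_data(seed=1, n_per_group=500):
    """Synthetic 1D feature with two latent groups (hypothetical, not the paper's data)."""
    rng = random.Random(seed)
    low = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    high = [rng.gauss(5.0, 1.0) for _ in range(n_per_group)]
    return low + high

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_gmm_1d(data, n_iter=100):
    """Fit a two-component 1D Gaussian mixture by EM.

    Returns (weights, means, variances), with component 0 started at the
    minimum of the data and component 1 at the maximum.
    """
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    weights = [0.5, 0.5]
    means = [min(data), max(data)]          # deterministic initialization
    variances = [overall_var, overall_var]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and variances from the posteriors.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # floor to avoid a collapsing component
    return weights, means, variances

weights, means, variances = fit_gmm_1d(make_toy_data())
print("weights:", [round(w, 2) for w in weights])
print("means:", [round(m, 2) for m in means])
```

The recovered clusters describe structure in the feature distribution only. Whether they correspond to meaningful risk strata depends on the features chosen and on the assumptions discussed in the referee report.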
If this is right
- Non-invasive tools for early diabetes screening can be built from breath VOC measurements and lifestyle data.
- Individuals in the diagnostic gray zone can receive personalized risk rankings.
- Natural population subgroups can be identified for tailored diabetes management approaches.
- Understanding VOC effects may guide new strategies for glucose control without invasive testing.
Where Pith is reading between the lines
- Integration with wearable sensors could enable continuous, real-time glucose monitoring via breath analysis.
- These findings might connect to broader research on how metabolic byproducts influence systemic health markers.
- Future studies could test interventions that alter specific VOC levels to observe direct glucose responses.
Load-bearing premise
The observational data and causal inference techniques used can isolate true causal effects of VOCs on blood glucose without substantial unmeasured confounding or measurement error.
What would settle it
An experiment that alters specific VOC levels through controlled means, such as dietary or environmental changes, and measures no corresponding change in blood glucose levels would challenge the causal claim.
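The difference between observing a VOC and altering it can be illustrated with a toy simulation. The sketch assumes a hypothetical world in which glucose drives acetone and acetone has no effect on glucose. All coefficients and noise levels are invented for illustration and are not taken from the paper.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def observe_and_intervene(n=20000, seed=0):
    """Assumed toy world: glucose -> acetone; acetone has NO effect on glucose."""
    rng = random.Random(seed)
    glucose = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Observational regime: acetone responds to glucose (reverse arrow).
    acetone_obs = [0.5 * g + rng.gauss(0.0, 1.0) for g in glucose]
    # Interventional regime: acetone is set externally, independent of glucose.
    acetone_do = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return ols_slope(acetone_obs, glucose), ols_slope(acetone_do, glucose)

obs_slope, do_slope = observe_and_intervene()
print(f"observational slope of glucose on acetone: {obs_slope:.2f}")
print(f"slope when acetone is set externally:      {do_slope:.2f}")
```

In this toy world the population value of the observational slope is 0.5 / 1.25 = 0.4 even though acetone has no causal effect, while the slope under intervention is zero. A controlled experiment of the kind described above distinguishes the two cases; observational regression alone does not.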
Original abstract
Diabetes is a global health burden, and early detection is critical for timely intervention. This study explores a non-invasive, data-driven framework to identify individuals at risk of diabetes using Volatile Organic Compounds (VOCs) and lifestyle variables. We use causal inference techniques to estimate the impact of VOCs such as acetone, isopropanol, isoprene, and ethanol on blood glucose levels. Additionally, we designed a classifier to distinguish diabetics from non-diabetics using non-invasive markers. We created a risk-based ranking system for individuals in the "gray zone," and identified natural clusters in the population using a Gaussian Mixture Model. Our results suggest that specific VOCs exhibit a strong causal influence on glucose levels and that machine learning models can reliably classify and stratify individuals at high risk. This integrated causal-explainable analysis can support the development of tools for non-invasive early screening of diabetes.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a non-invasive diabetes screening framework that combines breath volatile organic compounds (VOCs: acetone, isopropanol, isoprene, ethanol) with lifestyle variables. It applies causal inference to estimate directed effects of these VOCs on blood glucose, trains a classifier to separate diabetics from non-diabetics, ranks individuals in the gray zone by risk, and uses Gaussian Mixture Models to discover population clusters. The central conclusion is that selected VOCs exert a strong causal influence on glucose levels and that the resulting ML models enable reliable risk stratification.
Significance. If the causal identification strategy were shown to be robust against reverse causation and unmeasured confounding, and if the classifier were validated with proper metrics on held-out data, the work could support development of breath-based early-detection tools. At present the observational design and lack of reported identification assumptions limit the immediate clinical or scientific impact.
major comments (3)
- [Abstract / Methods] Abstract and Methods: The claim that specific VOCs exhibit a 'strong causal influence' on blood glucose is load-bearing for the paper's contribution, yet no identification strategy, adjustment set, or sensitivity analysis is described that would block back-door paths from unmeasured confounders (dietary load, hepatic function, medication timing) or reverse arrows (ketogenesis producing acetone in response to hyperglycemia).
- [Results] Results: No effect sizes, standard errors, or robustness checks are supplied for the reported causal effects, so it is impossible to evaluate whether the estimated influences survive standard tests for unmeasured confounding or whether they reduce to associations.
- [Classifier / GMM] Classifier and GMM sections: The risk-stratification and clustering steps inherit the same identification problem; any downstream ranking or cluster labels derived from confounded VOC-glucose associations cannot be interpreted as causal risk strata without additional assumptions or instruments.
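The back-door concern raised in these comments can be sketched numerically. The example assumes a single hypothetical confounder (labeled `diet`) that raises both the VOC and glucose, with a true VOC effect that is set by the caller. The data-generating model and all coefficients are invented for illustration. Adjustment is done by residualizing both variables on the confounder, which gives the same coefficient as including the confounder in a linear regression.

```python
import random

def slope(x, y):
    """Slope of the least-squares line of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residualize(z, v):
    """Return v minus its least-squares fit on z (intercept included)."""
    n = len(z)
    mz, mv = sum(z) / n, sum(v) / n
    b = slope(z, v)
    return [(vi - mv) - b * (zi - mz) for zi, vi in zip(z, v)]

def simulate(n=20000, true_effect=0.0, seed=0):
    """Return (naive, adjusted) estimates of the VOC effect on glucose."""
    rng = random.Random(seed)
    diet = [rng.gauss(0.0, 1.0) for _ in range(n)]   # hypothetical confounder
    voc = [d + rng.gauss(0.0, 1.0) for d in diet]
    glucose = [true_effect * v + 2.0 * d + rng.gauss(0.0, 1.0)
               for v, d in zip(voc, diet)]
    naive = slope(voc, glucose)                      # ignores the confounder
    adjusted = slope(residualize(diet, voc),         # blocks the back-door path
                     residualize(diet, glucose))
    return naive, adjusted

naive, adjusted = simulate()
print(f"true effect 0.0 -> naive {naive:.2f}, adjusted {adjusted:.2f}")
```

With these assumed coefficients the naive estimate is biased upward by about 1.0 regardless of the true effect. The adjusted estimate recovers the truth only because the confounder is measured; if `diet` were unobserved, no choice of adjustment set in this toy model would remove the bias.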
minor comments (2)
- [Abstract] The abstract states that machine-learning models 'reliably classify' but supplies no accuracy, AUC, or confusion-matrix values; these metrics should be added with cross-validation details.
- [Methods] Notation for the VOC variables and the exact causal inference procedure (e.g., which algorithm or library) is not introduced; a short methods paragraph defining these would improve readability.
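The metrics requested in the first minor comment can be computed as in the sketch below. It is a minimal stdlib illustration on synthetic data, not the authors' evaluation: the single feature, the nearest-class-mean scorer, the fold count, and the threshold are all assumptions made for the example.

```python
import random

def auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly;
    ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def confusion(scores, labels, threshold):
    """Return (tn, fp, fn, tp) when predicting positive for score >= threshold."""
    tn = fp = fn = tp = 0
    for s, y in zip(scores, labels):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp

def kfold(n, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k shuffled, disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    for i in range(k):
        test = idx[i::k]
        held_out = set(test)
        yield [j for j in idx if j not in held_out], test

def cross_validated_auc(x, y, k=5):
    """Score held-out points by signed distance from the midpoint of the
    training class means; return one AUC per fold."""
    fold_aucs = []
    for train, test in kfold(len(x), k):
        ones = [x[j] for j in train if y[j] == 1]
        zeros = [x[j] for j in train if y[j] == 0]
        mean1, mean0 = sum(ones) / len(ones), sum(zeros) / len(zeros)
        direction = 1.0 if mean1 >= mean0 else -1.0
        midpoint = (mean1 + mean0) / 2.0
        scores = [direction * (x[j] - midpoint) for j in test]
        fold_aucs.append(auc(scores, [y[j] for j in test]))
    return fold_aucs

# Synthetic data (hypothetical): one feature shifted by 1.0 in the positive class.
rng = random.Random(42)
y = [i % 2 for i in range(1000)]
x = [rng.gauss(float(label), 1.0) for label in y]
fold_aucs = cross_validated_auc(x, y)
print("per-fold AUC:", [round(a, 3) for a in fold_aucs])
print("mean AUC:", round(sum(fold_aucs) / len(fold_aucs), 3))
```

Reporting the per-fold values alongside the mean shows how stable the estimate is across splits, which a single accuracy figure would hide.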
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which highlight important aspects of causal identification and reporting. We address each major comment below and have revised the manuscript to strengthen the presentation of assumptions and results.
- Referee: [Abstract / Methods] Abstract and Methods: The claim that specific VOCs exhibit a 'strong causal influence' on blood glucose is load-bearing for the paper's contribution, yet no identification strategy, adjustment set, or sensitivity analysis is described that would block back-door paths from unmeasured confounders (dietary load, hepatic function, medication timing) or reverse arrows (ketogenesis producing acetone in response to hyperglycemia).
Authors: We agree that explicit identification assumptions are required to support the causal claims. In the revised manuscript we add a new subsection in Methods that specifies the assumed DAG, the observed adjustment set (lifestyle variables plus measured covariates), and sensitivity analyses for unmeasured confounding and reverse causation. We also discuss the temporal ordering implicit in the data collection to address potential reverse arrows from hyperglycemia to acetone production. revision: yes
- Referee: [Results] Results: No effect sizes, standard errors, or robustness checks are supplied for the reported causal effects, so it is impossible to evaluate whether the estimated influences survive standard tests for unmeasured confounding or whether they reduce to associations.
Authors: We accept this criticism. The revised Results section now reports the numerical causal effect estimates together with standard errors, confidence intervals, and p-values. We further include robustness checks that vary the adjustment set and report sensitivity metrics (e.g., E-values) that quantify how strong unmeasured confounding would need to be to nullify the observed effects. revision: yes
- Referee: [Classifier / GMM] Classifier and GMM sections: The risk-stratification and clustering steps inherit the same identification problem; any downstream ranking or cluster labels derived from confounded VOC-glucose associations cannot be interpreted as causal risk strata without additional assumptions or instruments.
Authors: We recognize that causal interpretation of the risk ranking and clusters rests on the validity of the upstream VOC-to-glucose effects. The revision adds an explicit statement of the additional assumptions needed for causal reading of the stratification and clustering steps. We also present the classifier and GMM outputs under both a predictive interpretation (which does not require causal identification) and a conditional causal interpretation (conditional on the stated assumptions holding). No instruments are present in the observational dataset, which we now acknowledge as a limitation. revision: partial
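The E-value mentioned in the response on the Results comment has a simple closed form on the risk-ratio scale: for RR >= 1 it is RR + sqrt(RR * (RR - 1)), and for RR < 1 the reciprocal is taken first. The sketch below is illustrative only. The rebuttal does not state the effect scale or any estimates, so the inputs are hypothetical, and an effect on continuous glucose would first need an approximate conversion to a risk ratio.

```python
import math

def e_value(rr):
    """E-value for a risk ratio: the minimum strength of association (on the
    risk-ratio scale) that an unmeasured confounder would need with both
    exposure and outcome to fully explain away the estimate."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1.0:
        rr = 1.0 / rr   # protective effects: work with the reciprocal
    return rr + math.sqrt(rr * (rr - 1.0))

def e_value_for_ci(lower, upper):
    """E-value for the confidence limit closest to the null.

    Returns 1.0 when the interval already contains the null value 1.
    """
    if lower <= 1.0 <= upper:
        return 1.0
    limit = lower if lower > 1.0 else upper
    return e_value(limit)

# Hypothetical inputs, not taken from the paper.
print(round(e_value(2.0), 3))
print(round(e_value_for_ci(1.5, 3.0), 3))
```

A small E-value means that a modest unmeasured confounder could account for the estimate, so the statistic is most informative when reported for both the point estimate and the confidence limit nearest the null.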
Circularity Check
No significant circularity detected
The paper applies standard causal inference methods to observational breath and glucose data, followed by ML classification, risk ranking, and GMM clustering. No equations, self-definitional constructs, or fitted parameters renamed as independent predictions appear in the abstract or described workflow. The derivation chain relies on external causal assumptions and data rather than reducing results to inputs by construction, rendering the analysis self-contained.