Learning to Identify Patients at Risk of Uncontrolled Hypertension Using Electronic Health Records Data
Pith reviewed 2026-05-25 12:45 UTC · model grok-4.3
The pith
Machine learning models using EHR data can identify patients likely to have uncontrolled hypertension in the next three months.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Logistic regression and recurrent neural networks trained on electronic health record data from 14,407 patients can stratify hypertension patients by their risk of uncontrolled hypertension within three months, with the best model achieving an AUROC of 0.719 compared to a baseline of 0.634 using only the last blood pressure measure.
What carries the argument
Logistic regression and recurrent neural networks applied to sequences of patient EHR features for three-month risk stratification of uncontrolled hypertension.
If this is right
- Targeted use of personalized treatments for high-risk patients becomes feasible.
- Simple linear models like logistic regression serve as strong baselines and may suffice for EHR predictive tasks.
- Recurrent neural networks do not provide additional benefit over logistic regression in this setting.
- Proactive management of hypertension could decrease incidence of uncontrolled cases.
Where Pith is reading between the lines
- Similar modeling approaches might apply to predicting other chronic disease complications using EHR.
- Deployment would require validation on diverse populations beyond the training data.
- Integration into clinical workflows could change how follow-up visits are scheduled.
Load-bearing premise
The electronic health records from the studied patients are complete, unbiased, and representative of future patients seen in clinical practice.
What would settle it
A drop in predictive performance below the reported AUROC when the model is applied to a new, independent cohort of patients from a different healthcare system.
Figures
read the original abstract
Hypertension is a major risk factor for stroke, cardiovascular disease, and end-stage renal disease, and its prevalence is expected to rise dramatically. Effective hypertension management is thus critical. A particular priority is decreasing the incidence of uncontrolled hypertension. Early identification of patients at risk for uncontrolled hypertension would allow targeted use of personalized, proactive treatments. We develop machine learning models (logistic regression and recurrent neural networks) to stratify patients with respect to the risk of exhibiting uncontrolled hypertension within the coming three-month period. We trained and tested models using EHR data from 14,407 and 3,009 patients, respectively. The best model achieved an AUROC of 0.719, outperforming the simple, competitive baseline of relying prediction based on the last BP measure alone (0.634). Perhaps surprisingly, recurrent neural networks did not outperform a simple logistic regression for this task, suggesting that linear models should be included as strong baselines for predictive tasks using EHR
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops logistic regression and recurrent neural network models to predict the risk that a patient will exhibit uncontrolled hypertension in the next three-month window, using electronic health record data. Models are trained on 14,407 patients and evaluated on a held-out set of 3,009 patients; the best model reaches an AUROC of 0.719, exceeding the baseline that simply uses the most recent blood-pressure measurement (AUROC 0.634). The authors observe that the RNN does not outperform logistic regression and therefore recommend that linear models be retained as strong baselines for EHR prediction tasks.
Significance. If the reported performance difference is reproducible and generalizable, the work supplies a concrete, low-complexity risk-stratification signal that could support targeted hypertension management. The explicit comparison against a competitive last-BP baseline and the counter-intuitive finding that a recurrent architecture adds no value are useful contributions that strengthen the empirical literature on EHR-based forecasting.
major comments (2)
- [Methods] Methods section: the abstract (and therefore the central performance claim) supplies no description of cohort assembly, inclusion/exclusion criteria, the temporal or random nature of the 14,407/3,009 split, feature construction, or handling of missing blood-pressure values. These omissions are load-bearing because systematic missingness or selection effects in EHR follow-up frequency could inflate the reported AUROC difference relative to the last-BP baseline.
- [Results] Results: no confidence intervals, statistical test, or calibration plot is mentioned for the AUROC values 0.719 versus 0.634. Without these, it is impossible to judge whether the 0.085 absolute improvement is distinguishable from sampling variability and therefore whether the headline claim of outperformance is actionable.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We will revise the manuscript to address the points raised on methodological transparency and statistical reporting. Point-by-point responses follow.
read point-by-point responses
-
Referee: [Methods] Methods section: the abstract (and therefore the central performance claim) supplies no description of cohort assembly, inclusion/exclusion criteria, the temporal or random nature of the 14,407/3,009 split, feature construction, or handling of missing blood-pressure values. These omissions are load-bearing because systematic missingness or selection effects in EHR follow-up frequency could inflate the reported AUROC difference relative to the last-BP baseline.
Authors: We agree these details are essential for evaluating the results. We will expand the Methods section to explicitly describe cohort assembly, inclusion/exclusion criteria, the random (non-temporal) nature of the 14,407/3,009 split, feature construction from EHR data, and the handling of missing blood-pressure values (via forward-fill where clinically appropriate or exclusion). This revision will clarify the comparison to the last-BP baseline. revision: yes
-
Referee: [Results] Results: no confidence intervals, statistical test, or calibration plot is mentioned for the AUROC values 0.719 versus 0.634. Without these, it is impossible to judge whether the 0.085 absolute improvement is distinguishable from sampling variability and therefore whether the headline claim of outperformance is actionable.
Authors: We agree that uncertainty quantification and a formal comparison are needed. We will add 95% bootstrap confidence intervals for both AUROCs, apply a statistical test for the difference (e.g., DeLong test), and include a calibration plot in the revised results section. revision: yes
Circularity Check
No circularity: purely empirical held-out evaluation
full rationale
The paper trains logistic regression and RNN models on EHR data from 14,407 patients and reports AUROC 0.719 on a separate 3,009-patient test set, outperforming a last-BP baseline (0.634). No equations, derivations, or self-citations are present that reduce the reported performance metric to any fitted input by construction. The result is a standard empirical comparison on held-out data and is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- logistic regression and RNN parameters
axioms (2)
- domain assumption Training and test splits are representative of the target clinical population
- domain assumption Uncontrolled hypertension can be reliably labeled from EHR fields
Reference graph
Works this paper leans on
-
[1]
Hypertension management: an update,
Q. Nguyen, J. Dominguez, L. Nguyen, and N. Gullapalli, “Hypertension management: an update,” American health & drug benefits, vol. 3, no. 1, p. 47, 2010
work page 2010
-
[2]
Heart disease and stroke statisticsâ ˘AˇT2015 update: a report from the american heart association,
D. Mozaffarian, “Heart disease and stroke statisticsâ ˘AˇT2015 update: a report from the american heart association,” Circulation, vol. 131, no. 4, pp. e29–e322, 2015
work page 2015
-
[3]
New acc/aha high blood pressure guidelines lower definition of hyperten- sion,
A. C. of Cardiology Foundation et al., “New acc/aha high blood pressure guidelines lower definition of hyperten- sion,” 2018
work page 2018
-
[4]
L. G. Ogden, J. He, E. Lydick, and P. K. Whelton, “Long-term absolute benefit of lowering blood pressure in hypertensive patients according to the jnc vi risk stratification,”Hypertension, vol. 35, no. 2, pp. 539–543, 2000
work page 2000
-
[5]
Risk stratification in hypertension: new insights from the framingham study,
W. B. Kannel, “Risk stratification in hypertension: new insights from the framingham study,” American journal of hypertension, vol. 13, no. S1, pp. 3S–10S, 2000
work page 2000
-
[6]
Accountable care organization (aco),
G. W. de la Torre JI, “Accountable care organization (aco),” Medical Care Research and Review, 2017
work page 2017
-
[7]
Accountable care organizations, explained,
J. Gold, “Accountable care organizations, explained,” 2015
work page 2015
-
[8]
J. Sun, C. D. McNaughton, P. Zhang, A. Perer, A. Gkoulalas-Divanis, J. C. Denny, J. Kirby, T. Lasko, A. Saip, and B. A. Malin, “Predicting changes in hypertension control using electronic health records from a chronic disease management program,” Journal of the American Medical Informatics Association , vol. 21, no. 2, pp. 337–344, 2013
work page 2013
-
[9]
S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997
work page 1997
-
[10]
B. C. Wallace, K. Small, C. E. Brodley, and T. A. Trikalinos, “Class imbalance, redux,” in Data Mining (ICDM), 2011 IEEE 11th International Conference on, pp. 754–763, IEEE, 2011
work page 2011
-
[11]
Learning to Diagnose with LSTM Recurrent Neural Networks
Z. C. Lipton, D. C. Kale, C. Elkan, and R. Wetzel, “Learning to diagnose with lstm recurrent neural networks,” arXiv preprint arXiv:1511.03677, 2015
work page internal anchor Pith review Pith/arXiv arXiv 2015
-
[12]
Scalable and accurate deep learning with electronic health records,
A. Rajkomar, E. Oren, K. Chen, A. M. Dai, N. Hajaj, M. Hardt, P. J. Liu, X. Liu, J. Marcus, M. Sun, et al., “Scalable and accurate deep learning with electronic health records,” npj Digital Medicine , vol. 1, no. 1, p. 18, 2018
work page 2018
-
[13]
Chollet et al., “Keras.” https://keras.io, 2015
F. Chollet et al., “Keras.” https://keras.io, 2015
work page 2015
-
[14]
Tensorflow: a system for large-scale machine learning.,
M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al., “Tensorflow: a system for large-scale machine learning.,” in OSDI, vol. 16, pp. 265–283, 2016
work page 2016
-
[15]
Adam: A Method for Stochastic Optimization
D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014
work page internal anchor Pith review Pith/arXiv arXiv 2014
-
[16]
Axiomatic attribution for deep networks,
M. Sundararajan, A. Taly, and Q. Yan, “Axiomatic attribution for deep networks,” in International Conference on Machine Learning, pp. 3319–3328, 2017
work page 2017
-
[17]
J. A. Sterne, I. R. White, J. B. Carlin, M. Spratt, P. Royston, M. G. Kenward, A. M. Wood, and J. R. Carpen- ter, “Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls,” Bmj, vol. 338, p. b2393, 2009
work page 2009
-
[18]
Modeling missing data in clinical time series with rnns,
Z. C. Lipton, D. C. Kale, and R. Wetzel, “Modeling missing data in clinical time series with rnns,” Machine Learning for Healthcare, 2016
work page 2016
-
[19]
Supervised machine learning: A review of classification tech- niques,
S. B. Kotsiantis, I. Zaharakis, and P. Pintelas, “Supervised machine learning: A review of classification tech- niques,” Emerging artificial intelligence applications in computer engineering, vol. 160, pp. 3–24, 2007. Appendix A Medications Drug Family Types Drug Family Types ACE InhibitorLisinopril, Benazepril Calcium channel blockerAmlodipine, Nifedipine ...
work page 2007
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.