Improving Driver Drowsiness Detection via Personalized EAR/MAR Thresholds and CNN-Based Classification
Pith reviewed 2026-05-08 12:31 UTC · model grok-4.3
The pith
Personalized eye and mouth aspect ratio thresholds with CNN classification improve driver drowsiness detection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a drowsiness detection system using personalized EAR and MAR thresholds, calibrated before driving and combined with CNN-based classification of eye state and yawning, is more accurate than fixed-threshold methods alone. Personalization improves detection accuracy by 2-3% over fixed thresholds, and the CNN classifiers reach 99.1% accuracy for eye state and 98.8% for yawning on public and custom datasets.
What carries the argument
Personalized calibration of Eye Aspect Ratio (EAR) and Mouth Aspect Ratio (MAR) thresholds for individual drivers, augmented by CNN models that classify eye openness and yawning behavior.
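The EAR half of this mechanism can be sketched in a few lines of Python. The ratio below is the standard six-landmark definition, EAR = (|p2 - p6| + |p3 - p5|) / (2 |p1 - p4|). The calibration rule (a fixed fraction of the driver's median open-eye EAR) and the names `eye_aspect_ratio`, `calibrate_threshold`, `is_closed`, and `ratio` are illustrative assumptions, because the summary does not say how the paper derives its per-driver thresholds.

```python
import math
from statistics import median


def eye_aspect_ratio(p):
    """EAR from six (x, y) eye landmarks p[0]..p[5].

    p[0] and p[3] are the horizontal eye corners; (p[1], p[5]) and
    (p[2], p[4]) are the two vertical landmark pairs.
    """
    vertical = math.dist(p[1], p[5]) + math.dist(p[2], p[4])
    horizontal = math.dist(p[0], p[3])
    return vertical / (2.0 * horizontal)


def calibrate_threshold(open_eye_ears, ratio=0.75):
    """Assumed rule (not from the paper): a fraction of the median open-eye EAR."""
    return ratio * median(open_eye_ears)


def is_closed(ear, threshold):
    """An eye counts as closed when its EAR falls below the driver's threshold."""
    return ear < threshold


# Toy landmarks for one open eye, then a toy calibration run.
eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
ear = eye_aspect_ratio(eye)
threshold = calibrate_threshold([0.30, 0.32, 0.28])
print(ear, threshold, is_closed(0.20, threshold))
```

A MAR threshold could plausibly be calibrated the same way from mouth landmarks, with the comparison reversed because yawning raises the ratio; that is also an assumption rather than the paper's stated procedure.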
If this is right
- Real-time monitoring can issue warnings based on combined signals from eyelids, head position, and yawning.
- The hybrid method handles variations in illumination and head pose better than the classical EAR/MAR metrics alone.
- Testing on public datasets plus a custom dataset collected under varied conditions supports the reported improvements.
- Such systems may support continuous driver monitoring in vehicles to reduce accident risks.
Where Pith is reading between the lines
- Calibration might need periodic updates if fatigue alters facial features over long drives.
- High CNN accuracy opens the possibility of running these models efficiently on mobile or embedded hardware in cars.
- Similar personalization could apply to other biometric monitoring tasks like stress detection.
Load-bearing premise
That the pre-driving personalized thresholds for EAR and MAR will continue to work accurately as illumination, head poses, and the driver's fatigue level change during actual driving.
What would settle it
Observing a significant drop in detection accuracy when testing the system on drivers or in conditions not included in the calibration or training data, such as extreme lighting changes or new facial structures.
Original abstract
Driver drowsiness is a major cause of traffic accidents worldwide, posing a serious threat to public safety. Vision-based driver monitoring systems often rely on fixed Eye Aspect Ratio (EAR) and Mouth Aspect Ratio (MAR) thresholds; however, such fixed values frequently fail to generalize across individuals due to variations in facial structure, illumination, and driving conditions. This paper proposes a personalized driver drowsiness detection system that monitors eyelid movements, head position, and yawning behavior in real time and provides warnings when signs of fatigue are detected. The system employs driver-specific EAR and MAR thresholds, calibrated before driving, to improve classical metric-based detection. In addition, deep learning-based Convolutional Neural Network (CNN) models are integrated to enhance accuracy in challenging scenarios. The system is evaluated using publicly available datasets as well as a custom dataset collected under diverse lighting conditions, head poses, and user characteristics. Experimental results show that personalized thresholding improves detection accuracy by 2-3% compared to fixed thresholds, while CNN-based classification achieves 99.1% accuracy for eye state detection and 98.8% for yawning detection, demonstrating the effectiveness of combining classical metrics with deep learning for robust real-time driver monitoring.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a hybrid driver drowsiness detection system that combines driver-specific EAR and MAR thresholds, calibrated once before driving, with CNN models for real-time eye-state and yawning classification. It evaluates the approach on public datasets plus a custom dataset collected under varied lighting, head poses, and user characteristics, claiming a 2-3% accuracy improvement from personalization over fixed thresholds together with 99.1% eye-state and 98.8% yawning detection accuracies.
Significance. If the reported gains hold under realistic longitudinal conditions, the work would offer a practical route to improving generalization of vision-based drowsiness monitors by accounting for inter-driver facial variation while retaining interpretable metrics alongside deep learning. The hybrid design is a reasonable engineering choice for real-time embedded deployment.
major comments (2)
- [Experimental evaluation] The evaluation (described after the method section) provides no temporal-split or longitudinal experiments in which EAR/MAR thresholds are calibrated on an initial segment of a continuous recording and then tested on later segments of the same session under changing illumination, head pose, or fatigue. Without such tests the 2-3% improvement claim cannot be shown to survive the very variations the abstract lists as motivation for personalization.
- [Experimental evaluation] The CNN accuracies (99.1% eye, 98.8% yawning) are reported without explicit confirmation that test drivers or sessions are fully disjoint from the data used to set personalized thresholds or to train the networks. This leaves open the possibility that reported figures partly reflect driver-specific overfitting rather than generalization.
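The temporal-split test requested in the first major comment could look roughly like the sketch below. It assumes per-frame EAR values and ground-truth closed-eye labels are available for one continuous recording. The function name, the calibration fraction, the 0.75 ratio, and the 0.25 fixed threshold are all hypothetical choices made for illustration, and the toy numbers are not results from the paper.

```python
from statistics import median


def temporal_split_accuracy(ears, closed_labels, calib_fraction=0.2,
                            ratio=0.75, fixed_threshold=0.25):
    """Calibrate on the start of a recording, score on the remainder.

    ears: per-frame EAR values in time order.
    closed_labels: per-frame booleans, True when the eye is closed.
    """
    split = int(len(ears) * calib_fraction)
    # Only open-eye frames from the initial segment are used for calibration.
    calib = [e for e, closed in zip(ears[:split], closed_labels[:split])
             if not closed]
    if not calib:
        raise ValueError("calibration segment contains no open-eye frames")
    personal_threshold = ratio * median(calib)
    test = list(zip(ears[split:], closed_labels[split:]))

    def accuracy(threshold):
        return sum((e < threshold) == closed for e, closed in test) / len(test)

    return {"personalized": accuracy(personal_threshold),
            "fixed": accuracy(fixed_threshold)}


# Toy recording of a driver whose open-eye EAR is low (about 0.20).
ears = [0.20, 0.21, 0.20, 0.19, 0.10, 0.20, 0.11, 0.21, 0.20, 0.09]
labels = [False, False, False, False, True, False, True, False, False, True]
result = temporal_split_accuracy(ears, labels, calib_fraction=0.4)
print(result)
```

Repeating this per recording, with the later segment deliberately covering lighting or pose changes, would show whether the personalized threshold keeps its advantage once conditions drift from those seen during calibration.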
minor comments (2)
- [Abstract and §3] The abstract and method description should state the exact public datasets used and the number of subjects/sessions in the custom dataset.
- [Figures] Figure captions for the CNN architecture and sample detections would benefit from explicit mention of input resolution and preprocessing steps.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help improve the rigor of our experimental claims. We address each major point below and will incorporate revisions to provide clearer evidence of generalization.
- Referee: The evaluation (described after the method section) provides no temporal-split or longitudinal experiments in which EAR/MAR thresholds are calibrated on an initial segment of a continuous recording and then tested on later segments of the same session under changing illumination, head pose, or fatigue. Without such tests the 2-3% improvement claim cannot be shown to survive the very variations the abstract lists as motivation for personalization.
Authors: We agree this is a limitation in the current evaluation protocol. While our custom dataset includes varied conditions across sessions and we performed per-driver calibration separate from testing, we did not explicitly conduct or report temporal splits within continuous recordings. In the revised manuscript we will add longitudinal experiments that calibrate EAR/MAR thresholds on initial segments of each recording and evaluate on subsequent segments under changing illumination, head pose, and fatigue levels. This will directly test whether the reported 2-3% gain persists under the variations motivating personalization.
revision: yes
- Referee: The CNN accuracies (99.1% eye, 98.8% yawning) are reported without explicit confirmation that test drivers or sessions are fully disjoint from the data used to set personalized thresholds or to train the networks. This leaves open the possibility that reported figures partly reflect driver-specific overfitting rather than generalization.
Authors: We confirm that CNN training used driver-disjoint partitions on both public and custom datasets, and that personalized thresholds were derived from calibration data held out from the test sessions. However, the manuscript does not state this partitioning explicitly. We will revise the experimental evaluation section to detail the exact data splits, including driver/session disjointness for threshold calibration and model training/testing, thereby removing any ambiguity about potential overfitting.
revision: yes
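A driver-disjoint partition of the kind the authors describe can be illustrated with a short Python sketch. It assumes every sample carries a driver identifier; the function name, the `(driver_id, item)` tuple layout, and the 20% test fraction are hypothetical and do not reflect the paper's actual split.

```python
import random


def driver_disjoint_split(samples, test_fraction=0.2, seed=0):
    """Split (driver_id, item) samples so that no driver is in both sets."""
    drivers = sorted({driver for driver, _ in samples})
    rng = random.Random(seed)
    rng.shuffle(drivers)
    # Hold out whole drivers, never individual frames.
    n_test = max(1, round(len(drivers) * test_fraction))
    test_drivers = set(drivers[:n_test])
    train = [s for s in samples if s[0] not in test_drivers]
    test = [s for s in samples if s[0] in test_drivers]
    return train, test


# Toy data: 5 drivers with 4 samples each.
samples = [(f"driver{i % 5}", i) for i in range(20)]
train, test = driver_disjoint_split(samples)
train_ids = {driver for driver, _ in train}
test_ids = {driver for driver, _ in test}
print(sorted(train_ids), sorted(test_ids))
```

Splitting by driver rather than by frame is what rules out the overfitting concern: a frame-level random split would place near-identical frames of the same face on both sides of the partition.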
Circularity Check
No significant circularity; claims rest on empirical evaluation of proposed system
The paper describes an empirical system for drowsiness detection that calibrates driver-specific EAR/MAR thresholds pre-driving and augments with CNN classifiers for eye and yawning states. Reported accuracy gains (2-3% over fixed thresholds) and CNN performance figures (99.1% eye, 98.8% yawning) are presented as outcomes of experiments on public datasets plus a custom collection under varied lighting, poses, and users. No derivation chain, equations, or first-principles steps are shown that reduce by construction to their own inputs; there are no self-citations invoked as load-bearing uniqueness theorems, no ansatzes smuggled via prior work, and no renaming of known patterns as novel organization. The evaluation is self-contained against external datasets rather than tautological, satisfying the criteria for a non-circular finding.
Axiom & Free-Parameter Ledger
free parameters (1)
- personalized EAR and MAR thresholds
axioms (2)
- domain assumption: EAR and MAR metrics derived from facial landmarks reliably indicate drowsiness when thresholds are appropriately set
- domain assumption: CNN models trained on facial image data can accurately classify eye openness and yawning states