Detection of Real-world Driving-induced Affective State Using Physiological Signals and Multi-view Multi-task Machine Learning

Daniel Lopez-Martinez; Neska El-Haouij; Rosalind Picard

arxiv: 1907.09929 · v1 · pith:N433DQK2new · submitted 2019-07-19 · 💻 cs.LG · stat.ML

Detection of Real-world Driving-induced Affective State Using Physiological Signals and Multi-view Multi-task Machine Learning

Daniel Lopez-Martinez , Neska El-Haouij , Rosalind Picard This is my paper

Pith reviewed 2026-05-24 19:01 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords affective state detectionphysiological signalsmulti-view multi-task learningreal-world drivingdriver safetyinter-drive variabilityinterpretable machine learning

0 comments

The pith

Accounting for drive-specific differences in physiological signals significantly improves detection of drivers' affective states.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a multiview multi-task machine learning method to detect drivers' affective states from physiological signals recorded during real-world driving. The approach explicitly models variability across different drives while keeping the learned models interpretable. This matters because affective states can impair driver awareness and cognitive processes, so reliable detection could support interfaces that adapt to improve safety and comfort. The method is tested on three separate real-world driving datasets, with results showing clear gains when drive-specific differences are taken into account.

Core claim

A multiview multi-task machine learning method for detecting driver's affective states from physiological signals accounts for inter-drive variability in responses, enables model interpretability, and yields significantly better performance on three real-world driving datasets than methods that ignore drive-specific differences.

What carries the argument

The multiview multi-task machine learning method that models inter-drive variability while preserving interpretability.

If this is right

Improved detection supports empathic automotive interfaces that respond to the driver's emotional state.
Interpretability of the models becomes feasible for safety-critical real-world deployment.
Performance gains arise specifically from handling variability across individual drives.
The method can be applied to other physiological-signal tasks that exhibit similar inter-session differences.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same modeling strategy could be tested in non-driving settings that also produce high physiological variability, such as workplace monitoring.
If drive-specific factors prove dominant, future systems might need per-driver calibration rather than population-level models.
Combining this approach with vehicle telemetry could allow real-time identification of when affective states are most likely to affect safety.

Load-bearing premise

The three real-world driving datasets contain physiological signals that reliably indicate affective states without dominant confounding influences from the driving environment itself.

What would settle it

A replication on a new collection of real-world drives in which adding drive-specific modeling produces no measurable improvement in detection accuracy would falsify the central claim.

Figures

Figures reproduced from arXiv: 1907.09929 by Daniel Lopez-Martinez, Neska El-Haouij, Rosalind Picard.

**Figure 2.** Figure 2: Kernel weights η of the multi-view multi-task model. The larger the weights are, the darker the matrix elements are. This represents increasing importance of that view for binary classification performance for a given task. the best performances were obtained when 3 clusters are used, resulting in 93% and 83% classification accuracy respectively. For the HciLab, the best performance (71%) was obtained with… view at source ↗

read the original abstract

Affective states have a critical role in driving performance and safety. They can degrade driver situation awareness and negatively impact cognitive processes, severely diminishing road safety. Therefore, detecting and assessing drivers' affective states is crucial in order to help improve the driving experience, and increase safety, comfort and well-being. Recent advances in affective computing have enabled the detection of such states. This may lead to empathic automotive user interfaces that account for the driver's emotional state and influence the driver in order to improve safety. In this work, we propose a multiview multi-task machine learning method for the detection of driver's affective states using physiological signals. The proposed approach is able to account for inter-drive variability in physiological responses while enabling interpretability of the learned models, a factor that is especially important in systems deployed in the real world. We evaluate the models on three different datasets containing real-world driving experiences. Our results indicate that accounting for drive-specific differences significantly improves model performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The abstract claims that a multi-view multi-task method improves affective state detection by handling inter-drive variability in physiological signals, but supplies no numbers, methods, or confound checks to support it.

read the letter

Accounting for drive-specific differences is presented as the key to better detection of affective states in real driving using physiological signals, but the abstract gives almost no evidence to back that up. The approach uses a multi-view multi-task setup that seems designed to capture both shared patterns across drives and drive-specific ones. That kind of structure can help with the high variability you see in physiological data collected in the field. The paper also flags interpretability as important, which fits for safety applications. What stands out as positive is the focus on real-world datasets rather than lab simulations. Three different driving datasets is a step toward generalizability. The main weaknesses are in the lack of detail. No accuracy numbers, no baseline comparisons, no mention of how affective states were labeled or validated. The concern about environmental confounds is fair – things like acceleration, temperature, or driver movement could easily dominate the signals, and without checks against vehicle data or other controls, it's hard to know if the model is really picking up affect. The abstract doesn't address this. This paper would mainly interest people already working on driver monitoring systems or affective computing in vehicles. A reader might pick up the multi-task idea for handling session effects, but without the methods and results it's mostly a high-level proposal. I don't think it is ready for peer review. The claim is plausible but the supporting information is missing, so a referee would have little to work with. The authors should add the full experimental details and confound analysis first.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes a multiview multi-task machine learning method for detecting drivers' affective states from physiological signals collected during real-world driving. The approach is designed to account for inter-drive variability while supporting model interpretability. Evaluation is performed on three real-world driving datasets, with the central empirical claim that explicitly modeling drive-specific differences yields significant performance gains.

Significance. If the performance gains are shown to arise from affective-state information rather than environmental confounds, the work would be relevant to safety-critical automotive interfaces. The use of real-world data and the interpretability emphasis are strengths. However, the provided abstract supplies no quantitative metrics, ablation results, or validation against vehicle telemetry, so the practical significance cannot yet be determined.

major comments (2)

[Abstract] Abstract: the claim that 'accounting for drive-specific differences significantly improves model performance' is stated without any supporting numbers, statistical tests, baseline comparisons, or error analysis. This absence blocks evaluation of whether the reported gains are load-bearing for the affective-detection interpretation.
[Abstract] Abstract: no description is given of how labels for affective states were obtained or validated, nor of any controls for drive-specific physical confounds (motion, temperature, road conditions). Without such information the central claim that the multi-task components capture affect rather than environmental factors cannot be assessed.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on the abstract. We address each major comment below and indicate where revisions will be made to strengthen the presentation of our claims.

read point-by-point responses

Referee: [Abstract] Abstract: the claim that 'accounting for drive-specific differences significantly improves model performance' is stated without any supporting numbers, statistical tests, baseline comparisons, or error analysis. This absence blocks evaluation of whether the reported gains are load-bearing for the affective-detection interpretation.

Authors: We agree that the abstract would be improved by including quantitative support. The full manuscript reports performance metrics, statistical tests, baseline comparisons (including single-task and non-multi-view models), and error analyses in the Results and Discussion sections demonstrating the gains from explicitly modeling inter-drive variability. To address the concern, we will revise the abstract to incorporate key quantitative results and reference the evaluation approach. revision: yes
Referee: [Abstract] Abstract: no description is given of how labels for affective states were obtained or validated, nor of any controls for drive-specific physical confounds (motion, temperature, road conditions). Without such information the central claim that the multi-task components capture affect rather than environmental factors cannot be assessed.

Authors: The abstract omits these details due to length limits, but the Methods section of the manuscript describes label acquisition (via validated self-report protocols and post-drive annotation) and validation procedures, as well as preprocessing steps and controls for physical confounds including motion artifact removal, temperature normalization, and road-condition metadata. We will add brief statements to the abstract summarizing label sources and confound controls to allow immediate assessment of the affective-state interpretation. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical ML evaluation on external datasets

full rationale

The paper proposes a multi-view multi-task ML method for affective state detection from physiological signals and evaluates it on three real-world driving datasets. The central claim is an empirical performance improvement from accounting for drive-specific variability. No equations, derivations, or self-citations are load-bearing in a way that reduces predictions to inputs by construction. Standard ML train/test evaluation on external data does not constitute circularity under the defined patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities can be identified from the abstract. The work relies on standard machine learning techniques applied to physiological data.

pith-pipeline@v0.9.0 · 5702 in / 1031 out tokens · 32323 ms · 2026-05-24T19:01:16.145981+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

24 extracted references · 24 canonical work pages · 1 internal anchor

[1]

Improving automotive safety by pairing driver emotion and car voice emotion,

C. Nass, I.-M. Jonsson, H. Harris, B. Reaves, J. Endo, S. Brave, and L. Takayama, “Improving automotive safety by pairing driver emotion and car voice emotion,” in ACM Conference on Human Factors in Computing Systems (CHI) , (New York, New York, USA), p. 1973, ACM Press, 2005

work page 1973
[2]

Matching In-Car V oice with Driver State : Impact on Attitude and Driving Performance,

I.-M. Johnsson, C. Nass, H. Harris, and L. Takayama, “Matching In-Car V oice with Driver State : Impact on Attitude and Driving Performance,” in Driving assessment 2005: proceedings of the 3rd International Driving Symposium on Human Factors in Driver Assessment, Training, and Vehicle Design, (Iowa City, Iowa), pp. 173–180, University of Iowa, 10 2005

work page 2005
[3]

Detecting stress during real-world driving tasks using physiological sensors,

J. Healey and R. Picard, “Detecting stress during real-world driving tasks using physiological sensors,” IEEE Transactions on Intelligent Transportation Systems, vol. 6, pp. 156–166, June 2005

work page 2005
[4]

A reasoning-based framework for car driver’ss stress prediction,

G. Rigas, C. D. Katsis, P. Bougia, and D. I. Fotiadis, “A reasoning-based framework for car driver’ss stress prediction,” in 2008 16th Mediterranean Conference on Control and Automation , pp. 627–632, IEEE, 6 2008

work page 2008
[5]

Driver alertness monitoring using fusion of facial features and bio-signals,

B. Lee and W. Chung, “Driver alertness monitoring using fusion of facial features and bio-signals,” IEEE Sensors Journal, vol. 12, pp. 2416–2422, July 2012

work page 2012
[6]

A comparative evaluation of neural network classiﬁers for stress level analysis of automotive drivers using physiological signals,

R. R. Singh, S. Conjeti, and R. Banerjee, “A comparative evaluation of neural network classiﬁers for stress level analysis of automotive drivers using physiological signals,” Biomedical Signal Processing and Control , vol. 8, pp. 740–754, nov 2013

work page 2013
[7]

Drowsiness detection using heart rate variability,

J. Vicente, P. Laguna, A. Bartra, and R. Bailón, “Drowsiness detection using heart rate variability,” Medical & Biological Engineering & Computing, vol. 54, pp. 927–937, jun 2016

work page 2016
[8]

Detecting driving stress in physiological signals based on multimodal feature analysis and kernel classiﬁers,

L.-l. Chen, Y . Zhao, P.-f. Ye, J. Zhang, and J.-z. Zou, “Detecting driving stress in physiological signals based on multimodal feature analysis and kernel classiﬁers,”Expert Systems with Applications, vol. 85, pp. 279–291, nov 2017

work page 2017
[9]

Personalized driver stress detection with multi-task neural networks using physiological signals,

A. Saeed and S. Trajanovski, “Personalized driver stress detection with multi-task neural networks using physiological signals,” in Neural Information Processing Systems (NIPS) Workshop on Machine Learning for Health, (Long Beach, CA, USA), December 2017

work page 2017
[10]

Characterizing Driver Stress Using Physiological and Operational Data from Real- World Electric Vehicle Driving Experiment,

S. Kim, W. Rhee, D. Choi, Y . J. Jang, and Y . Yoon, “Characterizing Driver Stress Using Physiological and Operational Data from Real- World Electric Vehicle Driving Experiment,” International Journal of Automotive Technology, vol. 19, pp. 895–906, oct 2018

work page 2018
[11]

Random forest-based approach for physiological functional variable selection for driver’s stress level classiﬁcation,

N. El Haouij, J.-M. Poggi, R. Ghozi, S. Sevestre-Ghalila, and M. Jaïdane, “Random forest-based approach for physiological functional variable selection for driver’s stress level classiﬁcation,” Statistical Methods & Applications, Feb 2018

work page 2018
[12]

Towards A Rigorous Science of Interpretable Machine Learning

F. Doshi-Velez and B. Kim, “Towards A Rigorous Science of Interpretable Machine Learning,” in eprint arXiv:1702.08608, 2 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[13]

PhysioBank, PhysioToolkit, and PhysioNet,

A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and H. E. Stanley, “PhysioBank, PhysioToolkit, and PhysioNet,” Circulation, vol. 101, 6 2000

work page 2000
[14]

MIT Media Lab, Affective Computing Group databases

“MIT Media Lab, Affective Computing Group databases.” https://affect. media.mit.edu/share-data.php

work page
[15]

A data set of real world driving to assess driver workload,

S. Schneegass, B. Pﬂeging, N. Broy, A. Schmidt, and H. F., “ A data set of real world driving to assess driver workload,” in 5th International Conference on Automotive User Interfaces and Interactive Vehicular Applications (AutomotiveUI’13). ACM, New York, NY, USA, pp. 150–157, IEEE, sep 2013

work page 2013
[16]

AffectiveROAD System and Database to Assess Driver’s Arousal State,

N. El Haouij, J.-M. Poggi, S. Sevestre-Ghalila, R. Ghozi, and M. Jaïdane, “AffectiveROAD System and Database to Assess Driver’s Arousal State,” in SAC 2018: Symposium on Applied Computing , April 9–13, 2018, Pau, France, 2018

work page 2018
[17]

Boucsein, Electrodermal Activity

W. Boucsein, Electrodermal Activity. Boston, MA: Springer US, 2012

work page 2012
[18]

Healey, Wearable and automotive systems for affect recognition from physiology

J. Healey, Wearable and automotive systems for affect recognition from physiology. PhD thesis, MIT Dept. of Electrical Engineering and Computer Science, 2000

work page 2000
[19]

Physiological and Behavioral Proﬁling for Nociceptive Pain Estimation Using Personalized Multitask Learning,

D. Lopez-Martinez, O. Rudovic, and R. Picard, “Physiological and Behavioral Proﬁling for Nociceptive Pain Estimation Using Personalized Multitask Learning,” in Neural Information Processing Systems (NIPS) Workshop on Machine Learning for Health , (Long Beach, USA), 2017

work page 2017
[20]

Multitask multiple kernel machines for personalized pain recognition from functional near-infrared spectroscopy brain signals,

D. Lopez-Martinez, K. Peng, S. Steele, A. Lee, D. Borsook, and R. Picard, “Multitask multiple kernel machines for personalized pain recognition from functional near-infrared spectroscopy brain signals,” in International Conference on Pattern Recognition (ICPR) , (Beijing), 2018

work page 2018
[21]

Normalized cuts and image segmentation,

Jianbo Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888–905, 2000

work page 2000
[22]

Multitask Learning,

R. Caruana, “Multitask Learning,” Machine Learning , vol. 28, no. 1, pp. 41–75, 1997

work page 1997
[23]

Multi- task and multi-view learning of user state,

M. Kandemir, A. Vetek, M. Gönen, A. Klami, and S. Kaski, “Multi- task and multi-view learning of user state,” Neurocomputing, vol. 139, pp. 97–106, 9 2014

work page 2014
[24]

Multi-task neural networks for personalized pain recognition from physiological signals,

D. Lopez-Martinez and R. Picard, “Multi-task neural networks for personalized pain recognition from physiological signals,” in 2017 Seventh International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW) , pp. 181–184, IEEE, 10 2017

work page 2017

[1] [1]

Improving automotive safety by pairing driver emotion and car voice emotion,

C. Nass, I.-M. Jonsson, H. Harris, B. Reaves, J. Endo, S. Brave, and L. Takayama, “Improving automotive safety by pairing driver emotion and car voice emotion,” in ACM Conference on Human Factors in Computing Systems (CHI) , (New York, New York, USA), p. 1973, ACM Press, 2005

work page 1973

[2] [2]

Matching In-Car V oice with Driver State : Impact on Attitude and Driving Performance,

I.-M. Johnsson, C. Nass, H. Harris, and L. Takayama, “Matching In-Car V oice with Driver State : Impact on Attitude and Driving Performance,” in Driving assessment 2005: proceedings of the 3rd International Driving Symposium on Human Factors in Driver Assessment, Training, and Vehicle Design, (Iowa City, Iowa), pp. 173–180, University of Iowa, 10 2005

work page 2005

[3] [3]

Detecting stress during real-world driving tasks using physiological sensors,

J. Healey and R. Picard, “Detecting stress during real-world driving tasks using physiological sensors,” IEEE Transactions on Intelligent Transportation Systems, vol. 6, pp. 156–166, June 2005

work page 2005

[4] [4]

A reasoning-based framework for car driver’ss stress prediction,

G. Rigas, C. D. Katsis, P. Bougia, and D. I. Fotiadis, “A reasoning-based framework for car driver’ss stress prediction,” in 2008 16th Mediterranean Conference on Control and Automation , pp. 627–632, IEEE, 6 2008

work page 2008

[5] [5]

Driver alertness monitoring using fusion of facial features and bio-signals,

B. Lee and W. Chung, “Driver alertness monitoring using fusion of facial features and bio-signals,” IEEE Sensors Journal, vol. 12, pp. 2416–2422, July 2012

work page 2012

[6] [6]

A comparative evaluation of neural network classiﬁers for stress level analysis of automotive drivers using physiological signals,

R. R. Singh, S. Conjeti, and R. Banerjee, “A comparative evaluation of neural network classiﬁers for stress level analysis of automotive drivers using physiological signals,” Biomedical Signal Processing and Control , vol. 8, pp. 740–754, nov 2013

work page 2013

[7] [7]

Drowsiness detection using heart rate variability,

J. Vicente, P. Laguna, A. Bartra, and R. Bailón, “Drowsiness detection using heart rate variability,” Medical & Biological Engineering & Computing, vol. 54, pp. 927–937, jun 2016

work page 2016

[8] [8]

Detecting driving stress in physiological signals based on multimodal feature analysis and kernel classiﬁers,

L.-l. Chen, Y . Zhao, P.-f. Ye, J. Zhang, and J.-z. Zou, “Detecting driving stress in physiological signals based on multimodal feature analysis and kernel classiﬁers,”Expert Systems with Applications, vol. 85, pp. 279–291, nov 2017

work page 2017

[9] [9]

Personalized driver stress detection with multi-task neural networks using physiological signals,

A. Saeed and S. Trajanovski, “Personalized driver stress detection with multi-task neural networks using physiological signals,” in Neural Information Processing Systems (NIPS) Workshop on Machine Learning for Health, (Long Beach, CA, USA), December 2017

work page 2017

[10] [10]

Characterizing Driver Stress Using Physiological and Operational Data from Real- World Electric Vehicle Driving Experiment,

S. Kim, W. Rhee, D. Choi, Y . J. Jang, and Y . Yoon, “Characterizing Driver Stress Using Physiological and Operational Data from Real- World Electric Vehicle Driving Experiment,” International Journal of Automotive Technology, vol. 19, pp. 895–906, oct 2018

work page 2018

[11] [11]

Random forest-based approach for physiological functional variable selection for driver’s stress level classiﬁcation,

N. El Haouij, J.-M. Poggi, R. Ghozi, S. Sevestre-Ghalila, and M. Jaïdane, “Random forest-based approach for physiological functional variable selection for driver’s stress level classiﬁcation,” Statistical Methods & Applications, Feb 2018

work page 2018

[12] [12]

Towards A Rigorous Science of Interpretable Machine Learning

F. Doshi-Velez and B. Kim, “Towards A Rigorous Science of Interpretable Machine Learning,” in eprint arXiv:1702.08608, 2 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[13] [13]

PhysioBank, PhysioToolkit, and PhysioNet,

A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and H. E. Stanley, “PhysioBank, PhysioToolkit, and PhysioNet,” Circulation, vol. 101, 6 2000

work page 2000

[14] [14]

MIT Media Lab, Affective Computing Group databases

“MIT Media Lab, Affective Computing Group databases.” https://affect. media.mit.edu/share-data.php

work page

[15] [15]

A data set of real world driving to assess driver workload,

S. Schneegass, B. Pﬂeging, N. Broy, A. Schmidt, and H. F., “ A data set of real world driving to assess driver workload,” in 5th International Conference on Automotive User Interfaces and Interactive Vehicular Applications (AutomotiveUI’13). ACM, New York, NY, USA, pp. 150–157, IEEE, sep 2013

work page 2013

[16] [16]

AffectiveROAD System and Database to Assess Driver’s Arousal State,

N. El Haouij, J.-M. Poggi, S. Sevestre-Ghalila, R. Ghozi, and M. Jaïdane, “AffectiveROAD System and Database to Assess Driver’s Arousal State,” in SAC 2018: Symposium on Applied Computing , April 9–13, 2018, Pau, France, 2018

work page 2018

[17] [17]

Boucsein, Electrodermal Activity

W. Boucsein, Electrodermal Activity. Boston, MA: Springer US, 2012

work page 2012

[18] [18]

Healey, Wearable and automotive systems for affect recognition from physiology

J. Healey, Wearable and automotive systems for affect recognition from physiology. PhD thesis, MIT Dept. of Electrical Engineering and Computer Science, 2000

work page 2000

[19] [19]

Physiological and Behavioral Proﬁling for Nociceptive Pain Estimation Using Personalized Multitask Learning,

D. Lopez-Martinez, O. Rudovic, and R. Picard, “Physiological and Behavioral Proﬁling for Nociceptive Pain Estimation Using Personalized Multitask Learning,” in Neural Information Processing Systems (NIPS) Workshop on Machine Learning for Health , (Long Beach, USA), 2017

work page 2017

[20] [20]

Multitask multiple kernel machines for personalized pain recognition from functional near-infrared spectroscopy brain signals,

D. Lopez-Martinez, K. Peng, S. Steele, A. Lee, D. Borsook, and R. Picard, “Multitask multiple kernel machines for personalized pain recognition from functional near-infrared spectroscopy brain signals,” in International Conference on Pattern Recognition (ICPR) , (Beijing), 2018

work page 2018

[21] [21]

Normalized cuts and image segmentation,

Jianbo Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888–905, 2000

work page 2000

[22] [22]

Multitask Learning,

R. Caruana, “Multitask Learning,” Machine Learning , vol. 28, no. 1, pp. 41–75, 1997

work page 1997

[23] [23]

Multi- task and multi-view learning of user state,

M. Kandemir, A. Vetek, M. Gönen, A. Klami, and S. Kaski, “Multi- task and multi-view learning of user state,” Neurocomputing, vol. 139, pp. 97–106, 9 2014

work page 2014

[24] [24]

Multi-task neural networks for personalized pain recognition from physiological signals,

D. Lopez-Martinez and R. Picard, “Multi-task neural networks for personalized pain recognition from physiological signals,” in 2017 Seventh International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW) , pp. 181–184, IEEE, 10 2017

work page 2017