
arxiv: 2602.12361 · v2 · submitted 2026-02-12 · 💻 cs.CV

Thermal Imaging for Contactless Cardiorespiratory and Sudomotor Response Monitoring

Pith reviewed 2026-05-16 02:01 UTC · model grok-4.3

classification 💻 cs.CV
keywords thermal imaging · contactless monitoring · electrodermal activity · breathing rate · heart rate · facial regions · sudomotor response · human-machine interfaces

The pith

Thermal video of the face can extract sudomotor activity and breathing rate signals for contactless operator monitoring.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper tests whether thermal infrared video can serve as a contactless way to monitor electrodermal activity, heart rate, and breathing rate by capturing skin temperature changes on the face. It introduces a pipeline that selects and tracks specific facial regions, aggregates the temperature time series, and splits slow sweat-related trends from faster cardiorespiratory oscillations. When evaluated on 31 sessions from a public driver dataset against contact references, the strongest fixed configuration reaches a mean absolute EDA correlation of 0.40, with some individual sessions at 0.89, breathing rate error of 3.1 bpm, and heart rate error of 13.8 bpm. These numbers indicate that thermal cues can supply useful state information for human-machine interfaces where visible cameras face lighting or privacy limits. The results also highlight practical constraints from region choice, signal polarity, latency, and person-to-person differences.

Core claim

The central claim is that thermal video provides usable sudomotor and respiratory signals through a pipeline of facial region tracking, thermal signal aggregation, and frequency-based separation of slow EDA trends from faster cardiorespiratory components. The best fixed configuration achieves a mean absolute EDA correlation of 0.40 against the palm reference and a breathing rate mean absolute error of 3.1 bpm, while heart rate estimation remains limited by the 7.5 Hz camera frame rate.
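The slow/fast split described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes a single ROI mean-temperature trace sampled at 7.5 Hz, uses a centered moving average as the low-pass step, and the 10 s window and all function names are hypothetical choices.

```python
import numpy as np

FS = 7.5  # thermal camera frame rate in Hz, as reported in the paper


def moving_average(x, window_s, fs=FS):
    """Centered moving average with edge padding (a crude low-pass filter)."""
    x = np.asarray(x, dtype=float)
    n = max(1, int(round(window_s * fs)))
    if n % 2 == 0:
        n += 1  # odd length keeps the window centered on each sample
    pad = n // 2
    padded = np.pad(x, pad, mode="edge")
    kernel = np.ones(n) / n
    # 'valid' on the padded signal returns exactly len(x) samples
    return np.convolve(padded, kernel, mode="valid")


def split_trend_and_oscillation(roi_trace, window_s=10.0, fs=FS):
    """Return (slow, fast): a smoothed trend and the residual oscillation.

    window_s is an assumed value, not one taken from the paper.
    """
    roi_trace = np.asarray(roi_trace, dtype=float)
    slow = moving_average(roi_trace, window_s, fs)
    fast = roi_trace - slow
    return slow, fast


# Synthetic trace: slow drift plus a 0.25 Hz breathing-like oscillation
t = np.arange(900) / FS
trace = 34.0 + 0.01 * t + 0.05 * np.sin(2 * np.pi * 0.25 * t)
slow, fast = split_trend_and_oscillation(trace)
```

Because the fast component is defined as the residual, the two parts always sum back to the input trace, which makes the split easy to sanity-check.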

What carries the argument

A signal-processing pipeline that tracks chosen facial regions, aggregates their thermal time series, separates slow sudomotor trends from faster cardiorespiratory components, and applies orthogonal matrix image transformation (OMIT) across multiple facial regions to estimate heart rate.

If this is right

  • Thermal video supplies auxiliary respiratory and sudomotor information for adaptive industrial human-machine interfaces.
  • Breathing rate can be recovered from nasal and cheek thermal signals via spectral peak detection with roughly 3 bpm average error.
  • Heart rate estimation from multiple facial regions remains constrained by the thermal camera's frame rate.
  • ROI selection and polarity handling are critical design choices that determine whether the extracted signals remain usable.
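The spectral peak detection mentioned for breathing rate can be illustrated with a short sketch. It assumes a detrended nasal or cheek trace sampled at 7.5 Hz; the 0.1 to 0.7 Hz search band (6 to 42 breaths per minute), the Hann window, and the function name are assumptions for illustration and are not taken from the paper.

```python
import numpy as np


def breathing_rate_bpm(signal, fs=7.5, band_hz=(0.1, 0.7)):
    """Estimate breathing rate as the strongest spectral peak in band_hz."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                      # remove the DC offset
    windowed = x * np.hanning(len(x))     # reduce spectral leakage
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    in_band = (freqs >= band_hz[0]) & (freqs <= band_hz[1])
    if not in_band.any():
        raise ValueError("signal too short for the requested band")
    peak_freq = freqs[in_band][np.argmax(power[in_band])]
    return 60.0 * peak_freq               # Hz to breaths per minute


# Synthetic 60 s trace with a 0.25 Hz (15 breaths per minute) oscillation
t = np.arange(450) / 7.5
rate = breathing_rate_bpm(np.sin(2 * np.pi * 0.25 * t))
```

The frequency resolution of this estimate is the inverse of the window duration, so a 60 s window resolves steps of 1 breath per minute; shorter windows trade resolution for responsiveness.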

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Higher-frame-rate thermal cameras would likely reduce the heart-rate error and make the method more competitive with contact sensors.
  • Subject-specific or session-adaptive ROI selection could lower the observed variability in EDA correlations.
  • The approach could be tested as a privacy-preserving layer in vehicle cabins or factory workstations where RGB imaging is restricted.

Load-bearing premise

Facial thermal signals can be cleanly divided into sudomotor and cardiorespiratory parts with only modest interference from motion, subject differences, or polarity flips.

What would settle it

Re-running the 288 ROI configurations on the same dataset and finding that no configuration exceeds 0.25 mean absolute EDA correlation across all sessions would show the separation is not reliable enough for the claimed utility.
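A test of this kind depends on how the correlation is scored. The sketch below shows one plausible reading of a lag-tolerant, polarity-agnostic score: the largest absolute Pearson correlation over a range of sample shifts. The paper's exact metric definition is not given here, so the function name, the symmetric lag search, and the `max_lag` parameter are assumptions.

```python
import numpy as np


def lag_tolerant_abs_corr(estimate, reference, max_lag):
    """Largest |Pearson r| between two equal-length series over +/- max_lag samples."""
    a = np.asarray(estimate, dtype=float)
    b = np.asarray(reference, dtype=float)
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = a[lag:], b[:len(b) - lag]   # estimate delayed by `lag`
        else:
            x, y = a[:lag], b[-lag:]           # estimate leading by `-lag`
        if len(x) < 3 or x.std() == 0 or y.std() == 0:
            continue                           # correlation undefined here
        r = np.corrcoef(x, y)[0, 1]
        best = max(best, abs(r))               # abs() ignores polarity flips
    return best


# Synthetic check: an inverted copy of the reference delayed by 5 samples
rng = np.random.default_rng(0)
reference = rng.standard_normal(500)
estimate = -np.roll(reference, 5)
score = lag_tolerant_abs_corr(estimate, reference, max_lag=10)
```

Averaging such per-session scores for one configuration, and comparing the mean with the 0.25 threshold, would be one way to run the check described above. Taking the absolute value means an inverted thermal response counts as agreement, which is why polarity still has to be resolved separately before the signal is used.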

Figures

Figures reproduced from arXiv: 2602.12361 by Constantino Álvarez Casado, Manuel Lage Cañellas, Miguel Bordallo López, Mohammad Rahman, Nhi Nguyen, Sasan Sharifipour, Xiaoting Wu.

Figure 1. Thermal facial video can be converted into ROI temperature traces and …

Figure 2. Pipeline overview. Thermal frames are processed by a face detector to localise landmarks. Six facial ROIs are defined and scalar temperature traces are …

Figure 3. PCCabs against PEDA for all 48 ROI-method combinations (n = 31).
Accompanying text: Task-stratified results explain part of this variability. Table II reports the two strongest ROI–method combinations per task, computed over the alpha-sensitivity sweep. Against PEDA, PD is best captured by the right periorbital and forehead signals with exponential MA (PCCabs ≈ 0.58), ND by the nose with median or exponential smoothing (≈ 0.…

Figure 5. PCCabs against PEDA by driving condition (n = 7–8 per task).
Table V. Best EDA ROI-method per subject (mean across sessions - n).
  Subj.  S/A  n  Best (PP NR)             Best (PEDA)
  T002   F/Y  4  cheeks/eMA 0.44±0.19     nose/med 0.36±0.22
  T003   M/Y  4  nose/mAvg 0.35±0.23      cheeks/eMA 0.53±0.16
  T005   M/Y  3  foreh./med 0.39±0.09     cheeks/eMA 0.35±0.05
  T014   F/Y  4  cheeks/eMA 0.43±0.19     cheeks/eMA 0.39±0.34
  T029   F/O  4  nose/mAvg 0.40±0.15      che…

Figure 6. T003-ED: PP NR and PEDA GT (top), nose ROI vs PP NR (middle, r = 0.42), eyes ROI (bottom, r = 0.15).
Accompanying text: HMI: respiration provides the most stable low-rate physiological input, EDA-like trends can support arousal or workload context when ROI quality and polarity are handled, and HR requires higher frame rates, more specific vascular ROIs, or complementary sensing. Demographic and subject-level analyses indica…
Original abstract

Human-machine interfaces in industrial automation need sensing modules that monitor operator actions and physiological state. This is important in factories, vehicles, machinery cabins, and human-robot collaboration, where workload, stress, fatigue, or reduced attention can affect safety. RGB monitoring is limited by low light, shadows, and privacy concerns, while thermal infrared imaging captures skin temperature dynamics without visible illumination. This paper studies thermal video as a contactless computer vision modality for estimating electrodermal activity (EDA), heart rate (HR), and breathing rate (BR), with the goal of supporting adaptive human-machine interfaces and operator-state awareness. We propose a signal-processing pipeline that tracks facial regions, aggregates thermal signals, and separates slow sudomotor trends from faster cardiorespiratory components. HR is estimated using orthogonal matrix image transformation (OMIT) across multiple facial regions, while BR is estimated from nasal and cheek thermal signals using spectral peak detection. We characterize 288 ROI-method configurations against contact references with lag-tolerant metrics using 31 sessions from the public SIMULATOR STUDY 1 (SIM1) driver monitoring dataset. The best fixed EDA configuration reaches a mean absolute correlation of 0.40 ± 0.23 against palm EDA, with individual sessions reaching 0.89. BR estimation achieves 3.1 ± 1.1 bpm mean absolute error, while HR estimation yields 13.8 ± 7.5 bpm MAE, limited by the 7.5 Hz thermal camera frame rate. The results show that thermal video provides useful respiratory and sudomotor cues, while revealing limitations caused by ROI selection, polarity changes, latency, and subject variability. These findings provide baseline design guidance for thermal computer vision as an auxiliary sensing layer in adaptive industrial HMI systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that facial thermal dynamics encode EDA, HR, and BR signals separable by standard filtering, plus empirical selection of ROI configurations; no new entities are introduced and free parameters are limited to the tested ROI-method choices.

free parameters (1)
  • ROI-method configuration
    288 configurations evaluated; best fixed one selected post-hoc for reported EDA correlation.
axioms (1)
  • domain assumption: Facial skin temperature dynamics reflect sudomotor, cardiac, and respiratory activity
    Invoked in the signal-processing pipeline description to justify separation of slow and fast components.
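The post-hoc selection noted in the ledger can be made concrete with a small sketch. It assumes a hypothetical score matrix with one row per session and one column per ROI-method configuration; the function names are illustrative, and the three-session, two-configuration matrix below is a toy example, not data from the paper.

```python
import numpy as np


def best_fixed_configuration(scores):
    """Pick the single configuration with the highest mean score across sessions.

    scores: array of shape (n_sessions, n_configs) holding per-session |r| values.
    Returns (config_index, mean_score, sample_std_across_sessions).
    """
    scores = np.asarray(scores, dtype=float)
    mean_per_config = scores.mean(axis=0)
    best = int(np.argmax(mean_per_config))
    return best, float(mean_per_config[best]), float(scores[:, best].std(ddof=1))


def per_session_oracle(scores):
    """Mean score if the best configuration could be chosen separately per session."""
    return float(np.asarray(scores, dtype=float).max(axis=1).mean())


# Toy matrix: 3 sessions x 2 configurations (illustrative values only)
toy = np.array([[0.2, 0.5],
                [0.4, 0.3],
                [0.6, 0.1]])
index, mean_score, spread = best_fixed_configuration(toy)
oracle = per_session_oracle(toy)
```

The oracle value can never be lower than the best fixed value, so the gap between them gives a rough sense of how much session-adaptive selection could help. Because the best fixed configuration is chosen on the same sessions it is scored on, its mean is likely to be somewhat optimistic compared with held-out sessions.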

pith-pipeline@v0.9.0 · 5658 in / 1300 out tokens · 69724 ms · 2026-05-16T02:01:24.476943+00:00 · methodology

