BreathAI: Transfer Learning-Based Thermal Imaging for Automated Breathing Pattern Recognition

Abbes Amira; Hamza Kheddar; Yassine Himeur

arXiv: 2604.17442 · v1 · submitted 2026-04-19 · eess.IV


Pith reviewed 2026-05-10 05:30 UTC · model grok-4.3


keywords: thermal imaging, breathing pattern recognition, transfer learning, deep learning, respiratory monitoring, inhalation exhalation, sleep apnea


The pith

Thermal imaging combined with adaptive transfer learning recognizes inhalation and exhalation phases at 98.8 percent accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a deep learning framework called ATL-TDLM to automatically detect breathing patterns from thermal images rather than sound recordings. It applies hierarchical feature extraction, adaptive multi-thresholding to segment breathing phases, knowledge distillation for efficient transfer of learning, and contrastive methods to better separate inhalation from exhalation. The resulting model reaches 98.8 percent accuracy while remaining computationally light. A sympathetic reader would care because this offers a contact-free way to monitor respiration that could support detection of disorders such as sleep apnea or asthma in everyday settings.

Core claim

The ATL-TDLM framework integrates hierarchical deep feature extraction with adaptive multi-thresholding for improved segmentation, knowledge distillation-based fine-tuning to optimize transfer, and contrastive representation learning to increase separability between inhalation and exhalation classes, delivering 98.8 percent accuracy on thermal imaging data and outperforming existing approaches while preserving computational efficiency.
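The summary above does not say how the adaptive multi-thresholding step computes its thresholds. As a rough illustration of the general idea only, the sketch below derives cut points from each frame's own intensity distribution and assigns every pixel to a band. The quantile rule, the function names, and the sample temperatures are all assumptions made for this example, not the authors' AMT procedure.

```python
from statistics import quantiles

def quantile_thresholds(pixels, levels=3):
    """Return levels-1 cut points taken from this frame's own intensities."""
    # statistics.quantiles with n=levels yields levels-1 cut points.
    return quantiles(pixels, n=levels)

def segment(pixels, thresholds):
    """Label each pixel with the index of the intensity band it falls into."""
    return [sum(p > t for t in thresholds) for p in pixels]

# Hypothetical flattened thermal frame (degrees Celsius), for illustration only.
frame = [30.1, 30.3, 30.2, 33.8, 34.1, 34.0, 36.2, 36.5, 36.4]
cuts = quantile_thresholds(frame, levels=3)
labels = segment(frame, cuts)
```

Because the cut points are recomputed per frame, a uniform shift in ambient temperature moves the thresholds with it, which is one plausible reading of "adaptive" here.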

What carries the argument

ATL-TDLM framework that combines adaptive multi-thresholding with knowledge distillation fine-tuning and contrastive representation learning on thermal image features.
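The page does not reproduce the KD-FT objective. For orientation, a generic temperature-scaled distillation loss (a weighted sum of hard-label cross-entropy and a teacher-to-student KL term) might look like the sketch below. The `temperature` and `alpha` values and the two-class logits are assumptions, and this standard formulation may differ from what the paper uses.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, label, temperature=2.0, alpha=0.5):
    """Generic distillation loss; alpha and temperature are illustrative."""
    # Hard-label cross-entropy on the student's ordinary predictions.
    ce = -math.log(softmax(student_logits)[label])
    # KL(teacher || student) on temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    # The T^2 factor keeps the soft-target term on a comparable gradient scale.
    return alpha * ce + (1 - alpha) * temperature ** 2 * kl

# Hypothetical logits for the two classes (index 0 = INH, index 1 = EXH).
agree = kd_loss([2.0, 0.0], [2.0, 0.0], label=0)
disagree = kd_loss([0.0, 2.0], [2.0, 0.0], label=0)
```

When student and teacher logits match, the KL term vanishes and only the weighted cross-entropy remains.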

If this is right

Provides a non-contact alternative to audio-based respiratory monitoring for clinical use.
Supports automated identification of abnormal breathing cycles relevant to sleep apnea and asthma.
Maintains low computational cost, allowing potential deployment on edge devices for continuous tracking.
Improves class distinction between inhalation and exhalation through contrastive learning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same thermal approach might combine with simple camera hardware already present in homes or clinics to enable passive long-term tracking.
If the segmentation thresholds prove stable, the method could be adapted to detect subtler respiratory events such as hypopneas.
Real-world testing across age groups and lighting conditions would reveal whether the reported accuracy holds outside controlled recordings.

Load-bearing premise

Thermal images supply enough clear information about breathing phase changes and a model trained on the current dataset will classify patterns correctly for new patients and different recording conditions.

What would settle it

Evaluation on an independent thermal imaging dataset from different patients or environments that yields accuracy well below 98.8 percent.

Figures

Figures reproduced from arXiv: 2604.17442 by Abbes Amira, Hamza Kheddar, Yassine Himeur.

**Figure 2.** Illustration of the proposed ATL-TDLM framework.

**Figure 3.** Gradual FT without TH. (a) Zero trained layers, (b) one trained layer, (c) two trained layers.

**Figure 4.** Gradual FT with TH. (a) One trained layer, (b) two trained layers, (c) three trained layers.

Original abstract

This study presents an Adaptive Transfer Learning and Thresholding-based Deep Learning Model (ATL-TDLM) for automated breathing pattern recognition using thermal imaging. Unlike conventional methods that rely on sound-based respiratory data, our approach leverages hierarchical deep feature extraction and adaptive multi-thresholding (AMT) to enhance feature segmentation. The model integrates knowledge distillation-based fine-tuning (KD-FT) to optimize learning transfer and contrastive representation learning (CRL) to improve inter-class separability between inhalation (INH) and exhalation (EXH) phases. The ATL-TDLM framework achieves an accuracy of 98.8%, significantly outperforming state-of-the-art models while ensuring computational efficiency. This approach has potential applications in respiratory disorder detection, including sleep apnea and asthma monitoring.
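The abstract names contrastive representation learning without giving its loss. One common form, shown here purely as an assumed example, is the pairwise margin loss: embeddings from the same phase are pulled together, and embeddings from different phases are pushed apart until they are at least `margin` apart. The embeddings and the margin value below are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pair_contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Pairwise margin loss; an assumed stand-in for the paper's CRL term."""
    d = euclidean(emb_a, emb_b)
    if same_class:
        # Same phase (e.g. INH/INH): penalise any separation.
        return d ** 2
    # Different phases (INH/EXH): penalise only pairs closer than the margin.
    return max(0.0, margin - d) ** 2

# Hypothetical 2-D embeddings, for illustration only.
far_same = pair_contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=True)
far_diff = pair_contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=False)
near_diff = pair_contrastive_loss([0.0, 0.0], [0.0, 0.0], same_class=False)
```

Well-separated INH/EXH pairs contribute zero loss, so training effort concentrates on pairs that are still confusable.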

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper applies transfer learning, knowledge distillation, and contrastive learning to thermal images for breathing phase detection and reports 98.8% accuracy, but supplies no dataset size, subject count, or validation protocol to support the number.


The core claim is that ATL-TDLM reaches 98.8% accuracy on inhalation versus exhalation from thermal video by combining adaptive multi-thresholding, knowledge-distillation fine-tuning, and contrastive representation learning. That combination is the main technical step: it takes standard transfer-learning tricks and points them at a new sensing modality where temperature shifts are small and noisy. The efficiency angle is also useful if the model really runs light enough for edge devices in sleep or asthma monitoring. Those are the parts that feel like genuine application work rather than just another benchmark run.

The rest of the abstract is thin on evidence. No numbers appear for images per subject, total subjects, train-test split method, or whether the split kept subjects separate. Without those, the 98.8% figure cannot be separated from possible overfitting or optimistic partitioning. Baseline comparisons are mentioned only in passing, so it is unclear whether the gain comes from the new pieces or from better tuning on the same data. The generalization worry is real: thermal breathing signals vary with camera angle, clothing, room temperature, and patient physiology, yet nothing in the description shows testing across those conditions.

This paper is aimed at applied researchers who need non-contact respiratory monitoring and are willing to re-implement or request the data. A methods-focused reader might borrow the thresholding-plus-distillation recipe, but anyone planning to cite the accuracy number will have to wait for the full experimental section and code. I would send it out for review because the application is concrete and the method choices are specific enough to be tested, even though the current write-up needs substantial additions on data and splits before the result can be trusted.
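To make the point about missing sample counts concrete, the sketch below computes a Wilson score interval for a reported accuracy at two test-set sizes. Both sizes are hypothetical, since the source reports none, and the formula assumes independent test samples. Consecutive thermal frames from one subject are correlated, so the true uncertainty would likely be wider than this suggests.

```python
import math

def wilson_interval(accuracy, n, z=1.96):
    """Wilson score interval for a proportion observed on n independent samples."""
    denom = 1 + z ** 2 / n
    centre = (accuracy + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(accuracy * (1 - accuracy) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half

# Hypothetical test-set sizes; the paper summary gives no actual counts.
small = wilson_interval(0.988, n=250)
large = wilson_interval(0.988, n=25000)
print("n=250:   %.4f to %.4f" % small)
print("n=25000: %.4f to %.4f" % large)
```

The same headline figure is compatible with a noticeably wider range of true accuracies when the test set is small, which is why the sample count belongs next to the number.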

Referee Report

3 major / 2 minor

Summary. The manuscript proposes an Adaptive Transfer Learning and Thresholding-based Deep Learning Model (ATL-TDLM) for automated recognition of breathing patterns (inhalation/exhalation phases) from thermal images. It combines hierarchical feature extraction, adaptive multi-thresholding (AMT), knowledge distillation-based fine-tuning (KD-FT), and contrastive representation learning (CRL) to achieve 98.8% accuracy while claiming computational efficiency and superiority over state-of-the-art models, with potential uses in respiratory disorder monitoring such as sleep apnea.

Significance. If the performance claims hold under rigorous, subject-independent validation, the work could contribute to non-contact thermal imaging methods for respiratory analysis, offering an alternative to audio-based approaches. The combination of transfer learning with KD-FT and CRL for improved separability and efficiency represents a reasonable technical direction, but the absence of supporting experimental details prevents assessment of whether the result advances the field beyond existing thermal or vision-based breathing detection literature.

major comments (3)

[Abstract and §4, Results/Experiments] The central claim of 98.8% accuracy 'significantly outperforming state-of-the-art models' is presented without any reported dataset size, number of subjects, acquisition protocol, train/test partitioning strategy (e.g., leave-one-subject-out), cross-validation method, or quantitative baseline comparisons. This directly undermines the generalization assumption required for the ATL-TDLM framework and makes it impossible to distinguish the result from optimistic partitioning or overfitting.
[§3, Proposed Method] The description of AMT, KD-FT, and CRL integration does not include ablation studies or controls isolating the contribution of each component to the final accuracy. Without these, it is unclear whether the reported performance stems from the proposed architecture or from other factors such as data characteristics.
[§4, Evaluation] No information is provided on potential biases (e.g., subject demographics, environmental conditions, or thermal camera variations) or on whether the model was tested on unseen patients/environments. This leaves the weakest assumption (that thermal images reliably capture phase signals and generalize) unverified and load-bearing for the claimed applications.
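The leave-one-subject-out partitioning mentioned in the first major comment can be sketched as follows. Each fold holds out every sample from one subject, so no person contributes frames to both training and testing. The subject identifiers and sample layout are hypothetical and do not describe the paper's dataset.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Hypothetical per-sample subject labels, for illustration only.
subject_ids = ["s1", "s1", "s2", "s2", "s3"]
folds = list(leave_one_subject_out(subject_ids))

for held_out, train, test in folds:
    # No subject may appear on both sides of a fold.
    assert {subject_ids[i] for i in train}.isdisjoint({subject_ids[i] for i in test})
```

A random frame-level split would instead place near-identical consecutive frames from the same person in both sets, which is the optimistic partitioning the comment warns about.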

minor comments (2)

[Title and Abstract] The title refers to 'BreathAI' while the abstract and body consistently use 'ATL-TDLM'; a brief clarification of the relationship between these terms would improve consistency.
[§4] Figure captions and tables (if present in §4) should explicitly state the number of samples or folds used for each reported metric to aid reproducibility.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback, which highlights important gaps in experimental reporting and validation. We address each major comment below and will revise the manuscript to strengthen the presentation of our results and methods.


Referee: [Abstract and §4, Results/Experiments] The central claim of 98.8% accuracy 'significantly outperforming state-of-the-art models' is presented without any reported dataset size, number of subjects, acquisition protocol, train/test partitioning strategy (e.g., leave-one-subject-out), cross-validation method, or quantitative baseline comparisons. This directly undermines the generalization assumption required for the ATL-TDLM framework and makes it impossible to distinguish the result from optimistic partitioning or overfitting.

Authors: We agree that these methodological details are critical for evaluating generalization. In the revised manuscript, we will expand the abstract and Section 4 to explicitly report the dataset size (number of subjects and thermal images), acquisition protocol, train/test partitioning (including leave-one-subject-out cross-validation), and quantitative comparisons against the referenced state-of-the-art baselines. This will allow readers to assess the robustness of the 98.8% accuracy claim.

revision: yes

Referee: [§3, Proposed Method] The description of AMT, KD-FT, and CRL integration does not include ablation studies or controls isolating the contribution of each component to the final accuracy. Without these, it is unclear whether the reported performance stems from the proposed architecture or from other factors such as data characteristics.

Authors: We acknowledge the value of component-wise analysis. The revised Section 3 will include ablation studies that isolate the effects of adaptive multi-thresholding (AMT), knowledge distillation-based fine-tuning (KD-FT), and contrastive representation learning (CRL) by reporting accuracy when each is removed or disabled, thereby clarifying their individual contributions to the overall performance.

revision: yes

Referee: [§4, Evaluation] No information is provided on potential biases (e.g., subject demographics, environmental conditions, or thermal camera variations) or on whether the model was tested on unseen patients/environments. This leaves the weakest assumption (that thermal images reliably capture phase signals and generalize) unverified and load-bearing for the claimed applications.

Authors: We will augment Section 4 with an analysis of potential biases, including subject demographics, environmental conditions, and camera variations where available in our data. We will also report performance on held-out unseen subjects and environments to better substantiate generalization. If our existing dataset does not fully cover all requested bias dimensions, we will note this limitation explicitly.

revision: partial

Circularity Check

0 steps flagged

No circularity; empirical accuracy claim is independent of inputs


The paper presents an applied ML framework (ATL-TDLM with KD-FT, CRL, and AMT) whose central output is an empirical accuracy figure of 98.8% on thermal breathing data. No equations, self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations appear in the abstract or described derivation. The performance metric is reported as the measured outcome of the proposed pipeline rather than a quantity forced by construction from the training inputs. The result remains externally falsifiable via replication on held-out subjects or new environments, satisfying the criteria for a non-circular empirical claim.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The approach relies on standard assumptions in deep learning such as the availability of labeled thermal image data and the effectiveness of pre-trained models for feature extraction; no new free parameters, axioms, or invented entities are introduced in the abstract.



Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages

[1] W. Salih and H. Koyuncu, “A deep learning approach to differentiate between acute asthma and bronchitis in preschool children,” in 2022 International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT). IEEE, 2022, pp. 380–385.

[2] M. F. M. Shakhih, A. A. Wahab, and M. I. M. Salim, “Assessment of inspiration and expiration time using infrared thermal imaging modality,” Infrared Physics & Technology, vol. 99, pp. 129–139, 2019.

[3] Y. Habchi, Y. Himeur, H. Kheddar, A. Boukabou, S. Atalla, A. Chouchane, A. Ouamane, and W. Mansoor, “AI in thyroid cancer diagnosis: Techniques, trends, and future directions,” Systems, vol. 11, no. 10, p. 519, 2023.

[4] Y. Qin, J. Wang, Y. Han, L. Lu et al., “Deep learning algorithms-based ct images in glucocorticoid therapy in asthma children with small airway obstruction,” Journal of Healthcare Engineering, vol. 2021, 2021.

[5] I. Topaloglu, P. D. Barua, A. M. Yildiz, T. Keles, S. Dogan, M. Baygin, H. F. Gul, T. Tuncer, R.-S. Tan, and U. R. Acharya, “Explainable attention resnet18-based model for asthma detection using stethoscope lung sounds,” Engineering Applications of Artificial Intelligence, vol. 126, p. 106887, 2023.

[6] H. Kheddar, Y. Himeur, S. Al-Maadeed, A. Amira, and F. Bensaali, “Deep transfer learning for automatic speech recognition: Towards better generalization,” Knowledge-Based Systems, vol. 277, p. 110851, 2023.

[7] A. Yahyaoui and N. Yumuşak, “Deep and machine learning towards pneumonia and asthma detection,” in 2021 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT). IEEE, 2021, pp. 494–497.

[8] G. S. Bhat, N. Shankar, D. Kim, D. J. Song, S. Seo, I. M. Panahi, and L. Tamil, “Machine learning-based asthma risk prediction using iot and smartphone applications,” IEEE Access, vol. 9, pp. 118 708–118 715, 2021.

[9] R. AlSaad, Q. Malluhi, I. Janahi, and S. Boughorbel, “Interpreting patient-specific risk prediction using contextual decomposition of BiLSTMs: application to children with asthma,” BMC medical informatics and decision making, vol. 19, pp. 1–11, 2019.

[10] F. Li, Y. Xu, J. Chen, P. Lu, B. Zhang, and F. Cong, “A deep learning model developed for sleep apnea detection: A multi-center study,” Biomedical Signal Processing and Control, vol. 85, p. 104689, 2023.

[11] E. Wang, I. Koprinska, and B. Jeffries, “Sleep apnea prediction using deep learning,” IEEE Journal of Biomedical and Health Informatics, 2023.

[12] E. P. Doheny, B. P. O’Callaghan, V. S. Fahed, J. Liegey, C. Goulding, S. Ryan, and M. M. Lowery, “Estimation of respiratory rate and exhale duration using audio signals recorded by smartphone microphones,” Biomedical Signal Processing and Control, vol. 80, p. 104318, 2023.

[13] S. Kristiansen, K. Nikolaidis, T. Plagemann, V. Goebel, G. M. Traaen, B. Øverland, L. Akerøy, T.-E. Hunt, J. P. Loennechen, S. L. Steinshamn et al., “A clinical evaluation of a low-cost strain gauge respiration belt and machine learning to detect sleep apnea,” Smart Health, vol. 27, p. 100373, 2023.

[14] N. Salari, A. Hosseinian-Far, M. Mohammadi, H. Ghasemi, H. Khazaie, A. Daneshkhah, and A. Ahmadi, “Detection of sleep apnea using machine learning algorithms based on ecg signals: A comprehensive systematic review,” Expert Systems with Applications, vol. 187, p. 115950, 2022.

[15] H. Kheddar, M. Hemis, Y. Himeur, D. Megias, and A. Amira, “Deep learning for steganalysis of diverse data types: A review of methods, taxonomy, challenges and future directions,” Neurocomputing, vol. 581, p. 127528, 2024.

[16] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.

[17] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell, “Decaf: A deep convolutional activation feature for generic visual recognition,” in International conference on machine learning. PMLR, 2014, pp. 647–655.

[18] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, “How transferable are features in deep neural networks?” in Advances in neural information processing systems, 2014, pp. 3320–3328.
