Attention-Guided Fusion of 1D and 2D CNNs for Robust ECG-Based Biometric Recognition
Pith reviewed 2026-05-20 13:12 UTC · model grok-4.3
The pith
A hybrid CNN system fuses raw ECG waveforms with their time-frequency images through learned attention to reach 99.56-100% identification accuracy on three benchmark datasets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that attention-guided fusion of features from a 1D CNN processing raw ECG signals and a 2D CNN processing time-frequency spectrograms yields a biometric system that reaches 99.56% accuracy on the ECG-ID dataset, 100% on MIT-BIH, and 99.89% on PTB. Tested on the Heartprint dataset, which spans ten years, the same architecture reaches same-session accuracies between 94.93% and 99.09%, while cross-session accuracy falls to 53-56%; the authors read this as evidence that the model captures enduring individual traits rather than transient states. Ablation experiments indicate that learned attention weights outperform static fusion strategies such as early or late concatenation.
What carries the argument
Attention-guided fusion mechanism that dynamically assigns importance to 1D temporal features and 2D spectral features extracted by InceptionTime and ResNet-34 networks respectively.
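To make the mechanism concrete, here is a minimal NumPy sketch of the fusion rule quoted from the paper further down this page, F_fused = α F̂1 + (1−α) F̂2. The gate form (a sigmoid over a linear map of the concatenated features), the function name `attention_fuse`, and the feature sizes are assumptions for illustration; the paper's actual attention module may differ.

```python
import numpy as np

def attention_fuse(f1, f2, w, b):
    """Fuse two feature batches with one learned gate value per sample.

    f1, f2: arrays of shape (batch, dim), assumed already projected to a common size.
    w: gate weights of shape (2 * dim,); b: scalar gate bias (both hypothetical).
    Returns (fused, alpha) with fused = alpha * f1 + (1 - alpha) * f2.
    """
    joint = np.concatenate([f1, f2], axis=1)   # (batch, 2 * dim)
    logits = joint @ w + b                     # (batch,)
    alpha = 1.0 / (1.0 + np.exp(-logits))      # sigmoid keeps alpha in (0, 1)
    a = alpha[:, None]                         # broadcast over the feature axis
    fused = a * f1 + (1.0 - a) * f2
    return fused, alpha

rng = np.random.default_rng(0)
f1 = rng.normal(size=(4, 8))    # stand-in for 1D-branch (InceptionTime) features
f2 = rng.normal(size=(4, 8))    # stand-in for 2D-branch (ResNet-34) features
w = rng.normal(size=16) * 0.1   # untrained gate weights, for shape checking only
fused, alpha = attention_fuse(f1, f2, w, b=0.0)
```

With `w` set to zeros the gate returns α = 0.5 for every input, which reduces the rule to a static average of the two branches. That is the kind of fixed-weight baseline the ablation compares against.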
If this is right
- The hybrid model maintains high accuracy even when subjects have cardiac pathologies.
- Attention-based fusion yields better results than conventional static fusion methods.
- The framework captures biometric features that persist across multiple years in the same individual.
- Specific network choices of InceptionTime for 1D and ResNet-34 for 2D produce the best performance.
Where Pith is reading between the lines
- Similar attention-driven fusion could be applied to combine other complementary signal representations in biometric or medical classification tasks.
- In real-world deployment, the method might adapt automatically to variations in signal quality or electrode placement.
- The observed drop in cross-session accuracy highlights the need for periodic re-enrollment or adaptation techniques in long-term biometric systems.
Load-bearing premise
The attention mechanism can reliably learn to emphasize the more informative modality for any given ECG recording in a manner that generalizes to new subjects and sessions.
What would settle it
Measuring whether the attention weights change substantially across different inputs and whether disabling the attention module reduces accuracy by more than a few percentage points on the same evaluation sets.
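Both checks are simple to script once per-recording gate values and predictions are available. The sketch below assumes the gate values have been exported as an array `alphas` and that predictions exist for both the attention model and a static-fusion variant. The function names and the toy inputs are hypothetical and are not results from the paper.

```python
import numpy as np

def alpha_summary(alphas):
    """Summarise per-recording gate values; a near-zero std means the gate is effectively static."""
    a = np.asarray(alphas, dtype=float)
    return {
        "mean": float(a.mean()),
        "std": float(a.std(ddof=1)),
        "min": float(a.min()),
        "max": float(a.max()),
    }

def accuracy(pred, true):
    """Fraction of recordings assigned to the correct subject."""
    return float(np.mean(np.asarray(pred) == np.asarray(true)))

def ablation_drop(y_true, pred_attention, pred_static):
    """Accuracy lost, in percentage points, when attention is replaced by static fusion."""
    return 100.0 * (accuracy(pred_attention, y_true) - accuracy(pred_static, y_true))

# Toy inputs only, to show the calling convention.
summary = alpha_summary([0.31, 0.74, 0.52, 0.66])
drop = ablation_drop([0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 0])
```

A small spread in `summary` combined with a small `drop` would suggest the learned gate adds little beyond a fixed weighting.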
Original abstract
Electrocardiogram (ECG)-based biometric recognition has emerged as a promising solution for secure authentication and liveness detection. However, most existing methods rely on unimodal deep learning architectures that independently process either one-dimensional (1D) temporal signals or two-dimensional (2D) time-frequency representations, limiting robustness and generalization. To address this issue, this paper proposes a hybrid framework integrating 1D and 2D convolutional neural networks (CNNs) within a unified end-to-end architecture. The 1D branch extracts temporal and morphological features from raw ECG signals, while the 2D branch captures discriminative spectral information from time-frequency representations. An attention-guided fusion mechanism dynamically weights both modalities according to input characteristics, overcoming the limitations of conventional static fusion strategies. The framework was evaluated on three benchmark datasets (ECG-ID, MIT-BIH, and PTB), including healthy subjects and patients with cardiac pathologies, achieving identification accuracies of 99.56%, 100.00%, and 99.89%, respectively. To assess long-term biometric permanence, experiments were also conducted on the multi-session Heartprint dataset spanning ten years. The proposed approach achieved same-session accuracies of 98.54% (S1), 99.09% (S2), 94.93% (S3R), and 96.08% (S3L), while cross-session evaluations reached 56.33% (S1-S2) and 53.27% (S2-S3R), demonstrating the ability to capture stable biometric signatures over time. The optimal configuration combines InceptionTime for 1D processing, ResNet-34 for 2D analysis, and attention-based fusion. Ablation studies confirm that the proposed attention mechanism consistently outperforms conventional fusion approaches. Overall, the proposed framework provides a robust, scalable, and high-performance solution for ECG biometric recognition.
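The abstract does not say which time-frequency representation feeds the 2D branch. As one plausible option, the sketch below computes a log-magnitude short-time Fourier transform with NumPy. The window length, hop size, sampling rate, and the function name `stft_spectrogram` are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def stft_spectrogram(signal, fs, win_len=128, hop=32):
    """Log-magnitude STFT of a 1D signal.

    Returns (spec, freqs, times); spec has shape (win_len // 2 + 1, n_frames).
    """
    x = np.asarray(signal, dtype=float)
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    # Slice overlapping frames and taper each one with the Hann window.
    frames = np.stack([x[i * hop : i * hop + win_len] * window for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1)).T       # (freq_bins, n_frames)
    spec = np.log1p(mag)                              # compress dynamic range
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)      # bin centre frequencies in Hz
    times = (np.arange(n_frames) * hop + win_len / 2) / fs
    return spec, freqs, times

fs = 512                                   # hypothetical sampling rate in Hz
t = np.arange(2048) / fs                   # 4 seconds of samples
x = np.sin(2 * np.pi * 40.0 * t)           # synthetic 40 Hz tone, not an ECG
spec, freqs, times = stft_spectrogram(x, fs)
```

The resulting 2D array can be treated as a single-channel image, while the raw samples `x` would go to the 1D branch unchanged.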
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a hybrid end-to-end framework that combines a 1D CNN branch (InceptionTime) extracting temporal/morphological features from raw ECG signals with a 2D CNN branch (ResNet-34) processing time-frequency representations, fused via an attention-guided mechanism that dynamically weights the modalities. It reports identification accuracies of 99.56% on ECG-ID, 100.00% on MIT-BIH, and 99.89% on PTB, plus same-session and cross-session results on the multi-year Heartprint dataset, claiming that attention-based fusion consistently outperforms static fusion in ablation studies.
Significance. If the performance numbers hold under rigorous validation and the attention mechanism is shown to generalize beyond the tested backbones, the work would strengthen evidence for dynamic multimodal fusion in ECG biometrics, particularly for handling both healthy and pathological subjects as well as long-term permanence. The explicit comparison of attention versus static fusion is a clear strength, but the current lack of statistical rigor and architecture-variation experiments limits how much the central claims can be credited at present.
major comments (3)
- [Ablation studies] Ablation studies (implicitly §4): All reported gains of attention-guided fusion over static fusion are obtained exclusively with InceptionTime (1D) and ResNet-34 (2D). No experiments replace either backbone with an alternative architecture while holding the fusion module fixed, nor do they introduce an unseen dataset. This leaves open the possibility that the observed superiority is tied to these specific choices rather than the attention mechanism itself, directly weakening the claim of a robust, general hybrid framework.
- [Results] Results section: The headline accuracies (99.56%, 100.00%, 99.89%) and Heartprint cross-session figures are presented without error bars, standard deviations across runs, or statistical tests. Dataset split details (subject-wise partitioning, train/validation/test ratios), preprocessing pipeline, and any multiple-run protocol are also absent, rendering the central performance claims impossible to evaluate or reproduce from the given text.
- [Heartprint experiments] Heartprint experiments: While same-session and cross-session numbers are given, the manuscript does not compare the attention fusion against static fusion or other baselines on the cross-session tasks, nor does it analyze whether the learned weights remain stable across sessions. This is load-bearing for the claim that the framework captures stable biometric signatures over ten years.
minor comments (1)
- [Abstract] The abstract and introduction use the term 'parameter-free' or similar phrasing for the fusion; if this is intended, it should be clarified against the learned attention parameters.
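The reporting requested in the Results comment can be sketched briefly: mean and standard deviation over independent runs, plus a paired sign-flip permutation test comparing two fusion strategies run on the same seeds. The per-run accuracies below are placeholders chosen for illustration, not numbers from the paper. The choice of test is an assumption, since the report does not name one.

```python
import numpy as np

def mean_std(accs):
    """Mean and sample standard deviation of per-run accuracies."""
    a = np.asarray(accs, dtype=float)
    return float(a.mean()), float(a.std(ddof=1))

def paired_sign_flip_test(acc_a, acc_b, n_perm=10000, seed=0):
    """Two-sided paired permutation test on per-run accuracy differences.

    Under the null hypothesis the sign of each paired difference is arbitrary,
    so signs are flipped at random and the mean difference is recomputed.
    """
    d = np.asarray(acc_a, dtype=float) - np.asarray(acc_b, dtype=float)
    observed = abs(d.mean())
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=(n_perm, d.size))
    null = np.abs((signs * d).mean(axis=1))
    # Add-one correction keeps the estimated p-value strictly positive.
    return float((np.sum(null >= observed) + 1) / (n_perm + 1))

# Placeholder per-run accuracies (hypothetical, five seeds each).
runs_attention = [0.991, 0.994, 0.989, 0.993, 0.992]
runs_static = [0.984, 0.990, 0.985, 0.986, 0.988]
mean_att, std_att = mean_std(runs_attention)
p_value = paired_sign_flip_test(runs_attention, runs_static)
```

With only five paired runs the smallest achievable two-sided p-value is 2/32, so more seeds would be needed before a conventional 0.05 threshold could be met.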
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. We address each major comment below, indicating the revisions planned for the next version of the manuscript.
Point-by-point responses
- Referee: [Ablation studies] Ablation studies (implicitly §4): All reported gains of attention-guided fusion over static fusion are obtained exclusively with InceptionTime (1D) and ResNet-34 (2D). No experiments replace either backbone with an alternative architecture while holding the fusion module fixed, nor do they introduce an unseen dataset. This leaves open the possibility that the observed superiority is tied to these specific choices rather than the attention mechanism itself, directly weakening the claim of a robust, general hybrid framework.
  Authors: We agree that testing alternative backbones and an additional dataset would strengthen the generality of the attention mechanism. The current ablations used InceptionTime and ResNet-34 as representative and widely adopted architectures for 1D ECG and 2D time-frequency inputs. In the revision we will add experiments that replace one or both backbones with alternatives (e.g., a standard 1D CNN and a different 2D CNN) while keeping the fusion module unchanged, and we will evaluate the full framework on one further public ECG dataset.
  Revision: yes
- Referee: [Results] Results section: The headline accuracies (99.56%, 100.00%, 99.89%) and Heartprint cross-session figures are presented without error bars, standard deviations across runs, or statistical tests. Dataset split details (subject-wise partitioning, train/validation/test ratios), preprocessing pipeline, and any multiple-run protocol are also absent, rendering the central performance claims impossible to evaluate or reproduce from the given text.
  Authors: The referee is correct that these details are necessary for reproducibility and statistical credibility. We will revise the Results section to report mean accuracies with standard deviations and error bars obtained from multiple independent runs, include appropriate statistical tests comparing fusion strategies, and explicitly document subject-wise partitioning, train/validation/test ratios, the complete preprocessing pipeline, and the multiple-run protocol.
  Revision: yes
- Referee: [Heartprint experiments] Heartprint experiments: While same-session and cross-session numbers are given, the manuscript does not compare the attention fusion against static fusion or other baselines on the cross-session tasks, nor does it analyze whether the learned weights remain stable across sessions. This is load-bearing for the claim that the framework captures stable biometric signatures over ten years.
  Authors: We acknowledge that direct comparisons and weight-stability analysis on the cross-session tasks are important for supporting long-term biometric claims. In the revised manuscript we will add comparisons of attention-guided fusion versus static fusion on the cross-session Heartprint evaluations and will include an analysis of the stability (or variation) of the learned attention weights across sessions.
  Revision: yes
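The weight-stability analysis promised in the last response could start from something as simple as the per-subject change in mean gate value between two sessions. The sketch below assumes gate values and subject labels have been exported per recording for each session. The function name `session_alpha_shift` and the toy arrays are hypothetical.

```python
import numpy as np

def session_alpha_shift(alpha_s1, subj_s1, alpha_s2, subj_s2):
    """Per-subject change in mean gate value from session 1 to session 2.

    Only subjects present in both sessions are compared.
    Returns a dict mapping subject id to (mean alpha in s2) - (mean alpha in s1).
    """
    alpha_s1, subj_s1 = np.asarray(alpha_s1, dtype=float), np.asarray(subj_s1)
    alpha_s2, subj_s2 = np.asarray(alpha_s2, dtype=float), np.asarray(subj_s2)
    shifts = {}
    for s in np.intersect1d(subj_s1, subj_s2):
        m1 = alpha_s1[subj_s1 == s].mean()
        m2 = alpha_s2[subj_s2 == s].mean()
        shifts[int(s)] = float(m2 - m1)
    return shifts

# Toy values only: two subjects, a few recordings per session.
shifts = session_alpha_shift(
    alpha_s1=[0.6, 0.8, 0.2], subj_s1=[1, 1, 2],
    alpha_s2=[0.5, 0.4], subj_s2=[1, 2],
)
```

Large shifts that track the cross-session accuracy drop would point to the gate reacting to session-specific signal properties rather than to subject identity.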
Circularity Check
No significant circularity: empirical ML results on public benchmarks
The paper is an empirical machine-learning study that proposes a hybrid 1D/2D CNN architecture with attention-guided fusion and reports identification accuracies plus ablation comparisons on public ECG datasets (ECG-ID, MIT-BIH, PTB, Heartprint). No mathematical derivation, uniqueness theorem, or first-principles claim is present that reduces by construction to fitted parameters, self-citations, or renamed inputs. The central performance claims rest on standard training/evaluation pipelines and ablation tables whose superiority is measured against explicit baselines within the same experimental setup; these results are externally falsifiable on the cited benchmarks and do not rely on load-bearing self-referential definitions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Convolutional neural networks can extract discriminative temporal and spectral features from ECG signals.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel [unclear]
  unclear: Relation between the paper passage and the cited Recognition theorem.
  Passage: "The key contribution of this work is an attention-guided fusion mechanism that enables dynamic, input-dependent weighting of the two modalities... $F_{\text{fused}} = \alpha \hat{F}_1 + (1-\alpha)\hat{F}_2$"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.