HexagonalWarriorMamba: Superior Threshold-Dependent Multi-label Classification of 12-Lead ECG Cardiac Abnormalities
Pith reviewed 2026-05-20 11:51 UTC · model grok-4.3
The pith
HWMamba converts 12-lead ECGs to 2D images and uses hierarchical 2D selective scans to outperform prior methods on threshold-dependent classification metrics.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By processing 12-lead ECGs as single-channel 2D images inside a hierarchical Mamba architecture equipped with a 2D Selective Scan mechanism, the model captures global context and complex spatial relationships. On the PhysioNet/Computing in Cardiology Challenge 2021 dataset, this yields better results than current state-of-the-art approaches on five threshold-dependent metrics, including Challenge Score and Subset Accuracy, while Macro AUROC stays close to the state of the art.
What carries the argument
A hierarchical architecture with a 2D Selective Scan mechanism, which models global context and spatial relationships once the ECGs have been converted to single-channel 2D images.
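To make the scanning idea concrete, here is a minimal sketch of the unfolding step that a 2D selective scan performs before any state-space recurrence runs. It assumes the four-route traversal described for VMamba (row-major, column-major, and both reversed); whether HWMamba uses exactly these routes is not stated in the material reviewed here. The names `cross_scan` and `cross_merge` and the toy array are hypothetical, and the selective state-space update itself is left out.

```python
import numpy as np

def cross_scan(image):
    """Unfold a 2D array into four 1D sequences (assumed VMamba-style routes)."""
    row_major = image.reshape(-1)      # left to right, then top to bottom
    col_major = image.T.reshape(-1)    # top to bottom, then left to right
    return np.stack([row_major, col_major, row_major[::-1], col_major[::-1]])

def cross_merge(sequences, shape):
    """Undo each route and sum the four results back into one 2D array."""
    h, w = shape
    row_major, col_major, row_rev, col_rev = sequences
    out = row_major.reshape(h, w).astype(float)
    out += col_major.reshape(w, h).T
    out += row_rev[::-1].reshape(h, w)
    out += col_rev[::-1].reshape(w, h).T
    return out

# Toy "image": 2 rows (leads) by 3 columns (time steps).
toy = np.arange(6).reshape(2, 3)
seqs = cross_scan(toy)

# In a real model each of the four sequences would pass through a
# selective state-space layer here; this sketch merges them unchanged.
merged = cross_merge(seqs, toy.shape)
```

Because every position is visited from four directions, each output cell can in principle receive information from every other cell, which is the sense in which such a scan provides global context.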
If this is right
- Stronger practical performance on metrics that depend on choosing decision thresholds from training data.
- Improved ability to detect multiple co-occurring cardiac abnormalities in a single recording.
- A more balanced trade-off between discrimination and threshold selection for clinical use.
- Consistent results across recordings collected from seven institutions in four countries on three continents.
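The phrase "thresholds from training data" can be illustrated with a small sketch. It picks one decision threshold per label by maximizing F1 on training predictions, then scores held-out predictions with Subset Accuracy. The F1 criterion, the threshold grid, the function names, and the toy numbers are all assumptions made for illustration; the abstract does not say how HWMamba selects its thresholds.

```python
import numpy as np

def f1_at(y_true, y_prob, thr):
    """F1 score of one label when probabilities are cut at thr."""
    pred = y_prob >= thr
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    fn = np.sum(~pred & (y_true == 1))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def fit_thresholds(y_true, y_prob, grid=np.linspace(0.05, 0.95, 19)):
    """Per label, keep the first grid threshold with the best training F1."""
    return np.array([
        grid[int(np.argmax([f1_at(y_true[:, k], y_prob[:, k], t) for t in grid]))]
        for k in range(y_true.shape[1])
    ])

def subset_accuracy(y_true, y_prob, thresholds):
    """Fraction of recordings whose whole label vector is predicted exactly."""
    pred = (y_prob >= thresholds).astype(int)
    return float(np.mean(np.all(pred == y_true, axis=1)))

# Toy data: label 1 is poorly calibrated (positives score below 0.5).
train_y = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
train_p = np.array([[0.91, 0.12], [0.31, 0.42], [0.83, 0.38], [0.11, 0.22]])
test_y = np.array([[1, 1], [0, 0]])
test_p = np.array([[0.78, 0.33], [0.20, 0.10]])

thresholds = fit_thresholds(train_y, train_p)
acc_default = subset_accuracy(test_y, test_p, 0.5)
acc_fitted = subset_accuracy(test_y, test_p, thresholds)
```

On this toy data the fixed 0.5 cut misses every positive for the second label, while the fitted thresholds recover it. That is the kind of gap a threshold-dependent metric exposes and a ranking metric does not.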
Where Pith is reading between the lines
- The 2D image conversion step could be tested on other multi-channel physiological signals to check whether the spatial-scan benefit generalizes.
- The hierarchical design might allow scaling to longer recordings or additional leads without proportional compute growth.
- If the performance edge holds on new datasets, the model could serve as a drop-in component for real-time ECG triage systems.
Load-bearing premise
That converting 12-lead ECGs into single-channel 2D images and applying a 2D selective scan will reliably capture the global context and complex spatial relationships needed for accurate multi-label classification.
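One plausible reading of the conversion step, sketched under stated assumptions: each lead becomes one row of a 12 x T array, rows follow the standard lead ordering, each row is z-normalized, and a leading channel axis of size one is added. The abstract does not describe the paper's actual layout, so the function `ecg_to_image`, the normalization, and the 5000-sample length (10 s at 500 Hz) are hypothetical choices, not the authors' method.

```python
import numpy as np

# Standard 12-lead ordering.
LEADS = ["I", "II", "III", "aVR", "aVL", "aVF",
         "V1", "V2", "V3", "V4", "V5", "V6"]

def ecg_to_image(signals, eps=1e-8):
    """Stack per-lead 1D signals into a (1, 12, T) single-channel array."""
    rows = np.stack([np.asarray(signals[name], dtype=np.float32) for name in LEADS])
    # Normalize each lead separately so amplitude differences between
    # leads do not dominate the image (an assumption of this sketch).
    mean = rows.mean(axis=1, keepdims=True)
    std = rows.std(axis=1, keepdims=True)
    rows = (rows - mean) / (std + eps)
    return rows[np.newaxis, :, :]

# Synthetic stand-in for a 10 s recording sampled at 500 Hz.
rng = np.random.default_rng(0)
fake_recording = {name: rng.standard_normal(5000) for name in LEADS}
image = ecg_to_image(fake_recording)
```

In this layout the vertical axis encodes lead identity and the horizontal axis encodes time, so any "spatial relationship" the model learns depends on which leads end up adjacent. That dependence is why the exact layout matters for the premise above.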
What would settle it
An independent test set in which HWMamba fails to improve Challenge Score or Subset Accuracy over the previous leading methods while the Macro AUROC gap remains small.
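The distinction this test relies on can be shown with a toy calculation. Macro AUROC depends only on how scores rank positives against negatives, so a monotone rescaling of the scores leaves it unchanged, whereas Subset Accuracy at a fixed cut can shift substantially. The functions below are a plain NumPy sketch with hypothetical names and made-up numbers; `auroc` assumes every label has at least one positive and one negative example.

```python
import numpy as np

def auroc(y_true, y_score):
    """Probability that a random positive outscores a random negative."""
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    # Ties between a positive and a negative count as half a win.
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def macro_auroc(y_true, y_score):
    """Unweighted mean of the per-label AUROC values."""
    return float(np.mean([auroc(y_true[:, k], y_score[:, k])
                          for k in range(y_true.shape[1])]))

def subset_accuracy_at(y_true, y_score, thr=0.5):
    """Exact-match rate when every label is cut at the same threshold."""
    pred = (y_score >= thr).astype(int)
    return float(np.mean(np.all(pred == y_true, axis=1)))

y_true = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
scores = np.array([[0.9, 0.2], [0.3, 0.7], [0.8, 0.6], [0.1, 0.4]])
squashed = scores ** 3  # monotone rescaling: rankings are unchanged

auc_before = macro_auroc(y_true, scores)
auc_after = macro_auroc(y_true, squashed)
acc_before = subset_accuracy_at(y_true, scores)
acc_after = subset_accuracy_at(y_true, squashed)
```

Here the Macro AUROC is identical before and after rescaling, while Subset Accuracy at 0.5 falls by half. Two models can therefore sit close together on Macro AUROC and still differ clearly on threshold-dependent metrics.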
Original abstract
The accurate automated diagnosis of cardiac abnormalities from 12-lead electrocardiograms (ECGs) is critical for managing cardiovascular disease. However, detecting concurrent conditions remains a challenge for traditional deep learning models, which often have limited ability to model the long-range dependencies inherent in ECG signals. This manuscript proposes HexagonalWarriorMamba (HWMamba), a framework built on the Mamba architecture that processes 12-lead ECGs as single-channel 2D images rather than conventional 1D time series. By integrating a hierarchical architecture with a 2D Selective Scan mechanism, HWMamba is designed to model global context and complex spatial relationships within the data. The model is evaluated on the PhysioNet/Computing in Cardiology Challenge 2021 dataset, which includes 26 diagnostic labels and comprises recordings collected from seven institutions across four countries and three continents. Results demonstrate that HWMamba outperforms current state-of-the-art (SOTA) methods across five key threshold-dependent metrics, including Challenge Score and Subset Accuracy. These improvements provide a balance between strong discriminative capability and effective threshold selection derived from the training data, while maintaining near-SOTA performance in Macro AUROC. This Hexagonal Warrior performance, reflecting consistent performance across multiple evaluation dimensions, positions HWMamba as a robust and versatile approach for multi-label ECG classification.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces HexagonalWarriorMamba (HWMamba), a Mamba-based framework that converts 12-lead ECG signals into single-channel 2D images and processes them with a hierarchical architecture incorporating a 2D Selective Scan mechanism. The goal is to better model global context and complex spatial relationships for multi-label classification of 26 cardiac abnormalities. Evaluation is performed on the PhysioNet/Computing in Cardiology Challenge 2021 dataset (recordings from seven institutions across four countries), with claims of outperformance over SOTA methods on five threshold-dependent metrics including Challenge Score and Subset Accuracy, alongside near-SOTA Macro AUROC.
Significance. If the performance claims are robustly supported, the work could contribute to ECG analysis by showing how 2D image-based representations paired with selective scan mechanisms can improve handling of concurrent conditions through better inter-lead spatial modeling. The focus on threshold-dependent metrics derived from training data adds practical relevance for deployment.
major comments (2)
- The manuscript provides no equations, pseudocode, or detailed description of the 12-lead ECG to single-channel 2D image conversion step (including any hexagonal or grid layout and preservation of 500 Hz temporal sampling or standard lead ordering). This is load-bearing for the central claim, as outperformance on Challenge Score and Subset Accuracy is attributed to the hierarchical 2D Selective Scan capturing spatial relationships that 1D approaches miss.
- No ablation results are presented comparing HWMamba against 1D Mamba variants or standard 2D CNN baselines on the same PhysioNet/CinC 2021 data. Without these, the reported gains cannot be confidently isolated to the 2D image conversion and 2D scan mechanism rather than hyperparameter choices or other factors.
minor comments (1)
- The abstract would be strengthened by including specific numerical values for the five threshold-dependent metrics to allow immediate assessment of the magnitude of improvement.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive comments on our manuscript. We address each major point below and will revise the manuscript to incorporate clarifications and additional analyses as outlined.
- Referee: The manuscript provides no equations, pseudocode, or detailed description of the 12-lead ECG to single-channel 2D image conversion step (including any hexagonal or grid layout and preservation of 500 Hz temporal sampling or standard lead ordering). This is load-bearing for the central claim, as outperformance on Challenge Score and Subset Accuracy is attributed to the hierarchical 2D Selective Scan capturing spatial relationships that 1D approaches miss.
  Authors: We agree that the current description of the ECG-to-image conversion is insufficiently detailed for full reproducibility and to support the central claims. In the revised manuscript we will add a dedicated methods subsection containing the explicit equations for the hexagonal grid layout, the mapping from the 12 standard leads to the 2D single-channel representation, and the preservation of the original 500 Hz temporal sampling rate together with lead ordering. Pseudocode for the full conversion pipeline will also be included.
  revision: yes
- Referee: No ablation results are presented comparing HWMamba against 1D Mamba variants or standard 2D CNN baselines on the same PhysioNet/CinC 2021 data. Without these, the reported gains cannot be confidently isolated to the 2D image conversion and 2D scan mechanism rather than hyperparameter choices or other factors.
  Authors: We concur that direct ablations against 1D Mamba and 2D CNN baselines on the identical dataset and protocol would strengthen isolation of the contribution of the 2D image representation and 2D selective scan. While the main experiments already compare against a broad set of published SOTA methods (some of which are 1D or 2D CNN-based), we will add the requested ablation studies in the revision, reporting Challenge Score, Subset Accuracy, and Macro AUROC for the 1D Mamba and 2D CNN variants under the same training and evaluation conditions.
  revision: yes
Circularity Check
No significant circularity; empirical claims rest on external benchmark
The paper proposes HWMamba as an architectural design choice (12-lead ECGs converted to single-channel 2D images processed via hierarchical 2D Selective Scan) and reports empirical outperformance on the PhysioNet/CinC 2021 dataset. All key metrics (Challenge Score, Subset Accuracy, Macro AUROC) are computed against fixed external ground-truth labels collected across institutions, not against quantities fitted or defined inside the model. No equations, self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided text. The derivation chain is therefore self-contained: model design is an ansatz, superiority is measured externally.
Author contributions statement
H.J. conceived the original idea, designed and performed the simulations, and wrote the main manuscript. V.S. and M.S.M. developed and finalized the mathematical modeling for the S...