
arxiv: 2605.17875 · v1 · pith:OGAAJEM4new · submitted 2026-05-18 · 💻 cs.CV

HexagonalWarriorMamba: Superior Threshold-Dependent Multi-label Classification of 12-Lead ECG Cardiac Abnormalities

Pith reviewed 2026-05-20 11:51 UTC · model grok-4.3

classification 💻 cs.CV
keywords multi-label ECG classification · Mamba architecture · 12-lead ECG · cardiac abnormality detection · threshold-dependent metrics · PhysioNet Challenge 2021 · 2D selective scan

The pith

HWMamba converts 12-lead ECGs to 2D images and uses hierarchical 2D selective scans to outperform prior methods on threshold-dependent classification metrics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents HexagonalWarriorMamba, a Mamba-based model that treats 12-lead ECG recordings as single-channel 2D images instead of 1D signals. It applies a hierarchical architecture and 2D selective scan to model long-range dependencies and spatial patterns across the leads for simultaneous detection of multiple cardiac conditions. The evaluation uses the PhysioNet 2021 dataset with 26 labels from recordings across multiple countries and institutions. If the approach holds, it would improve practical multi-label performance on metrics that matter for clinical threshold choices while preserving strong discrimination.
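To make the "selective scan" idea concrete, here is a minimal sketch of a selective state-space recurrence in the spirit of Mamba, reduced to a single scalar state. The function name `selective_scan_1d` and the weights `a`, `w_delta`, `w_b`, `w_c` are hypothetical and chosen only for illustration; the paper's actual parameterization is not given in the text reviewed here.

```python
import math


def softplus(z):
    # Numerically stable softplus: log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def selective_scan_1d(xs, a=-1.0, w_delta=1.0, w_b=1.0, w_c=1.0):
    """Toy scalar selective scan (hypothetical weights, one state dimension).

    h_t = exp(delta_t * a) * h_{t-1} + delta_t * b_t * x_t
    y_t = c_t * h_t
    where delta_t, b_t and c_t all depend on the current input x_t.
    """
    h = 0.0
    ys = []
    for x in xs:
        delta = softplus(w_delta * x)  # input-dependent step size, always > 0
        b = w_b * x                    # input-dependent input projection
        c = w_c * x                    # input-dependent readout
        a_bar = math.exp(delta * a)    # discretized decay, in (0, 1) when a < 0
        h = a_bar * h + delta * b * x
        ys.append(c * h)
    return ys
```

The loop runs once per time step, which is where the linear-time cost in sequence length comes from. A 2D variant applies the same recurrence along several flattening orders of the image.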

Core claim

By processing 12-lead ECGs as single-channel 2D images inside a hierarchical Mamba architecture equipped with a 2D Selective Scan mechanism, the model captures global context and complex spatial relationships, leading to better results than current state-of-the-art approaches on five threshold-dependent metrics including Challenge Score and Subset Accuracy on the PhysioNet/Computing in Cardiology Challenge 2021 dataset, while remaining near the top in Macro AUROC.

What carries the argument

Hierarchical architecture with 2D Selective Scan mechanism that models global context and spatial relationships after converting ECGs to single-channel 2D images.
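One plausible reading of the conversion step, sketched below, stacks the 12 leads as rows of a single-channel image with time along the columns. This layout, the per-lead z-score normalization, and the name `ecg_to_image` are all assumptions made for illustration; the referee report notes that the paper does not document the actual mapping.

```python
def ecg_to_image(leads):
    """Stack 12 equal-length leads into a single-channel image [1][12][T].

    Each lead is z-score normalized independently. The row order follows the
    input order; this layout is an assumption, not the paper's documented one.
    """
    if len(leads) != 12:
        raise ValueError("expected 12 leads")
    length = len(leads[0])
    if length == 0 or any(len(lead) != length for lead in leads):
        raise ValueError("all leads must share the same non-zero length")
    rows = []
    for lead in leads:
        mean = sum(lead) / length
        var = sum((v - mean) ** 2 for v in lead) / length
        std = var ** 0.5 or 1.0  # fall back to 1.0 for a flat lead
        rows.append([(v - mean) / std for v in lead])
    return [rows]  # leading axis is the single image channel
```

With this layout, neighbouring rows are neighbouring leads, so a 2D scan can relate samples across leads as well as across time.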

If this is right

  • Stronger practical performance on metrics that depend on choosing decision thresholds from training data.
  • Improved ability to detect multiple co-occurring cardiac abnormalities in a single recording.
  • A more balanced trade-off between discrimination and threshold selection for clinical use.
  • Consistent results across data collected from different institutions and continents.
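The first bullet depends on how thresholds are chosen from training data. The sketch below picks one threshold per label by maximizing F1 over a fixed grid. The F1 criterion, the grid, and the function names are assumptions for illustration; the text reviewed here does not state the paper's selection rule.

```python
def f1_at(scores, labels, thr):
    # F1 for one label when predicting positive for score >= thr.
    tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def pick_thresholds(train_scores, train_labels, grid=None):
    """Return one threshold per label, each maximizing F1 on training data."""
    grid = grid or [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    thresholds = []
    for j in range(len(train_scores[0])):
        col_s = [row[j] for row in train_scores]
        col_y = [row[j] for row in train_labels]
        thresholds.append(max(grid, key=lambda t: f1_at(col_s, col_y, t)))
    return thresholds


def apply_thresholds(scores, thresholds):
    # Binarize each label's score with that label's own threshold.
    return [[int(s >= t) for s, t in zip(row, thresholds)] for row in scores]
```

Thresholds fitted this way are frozen before the test set is touched, which is what separates threshold-dependent metrics from ranking metrics such as AUROC.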

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The 2D image conversion step could be tested on other multi-channel physiological signals to check whether the spatial-scan benefit generalizes.
  • The hierarchical design might allow scaling to longer recordings or additional leads without proportional compute growth.
  • If the performance edge holds on new datasets, the model could serve as a drop-in component for real-time ECG triage systems.

Load-bearing premise

That converting 12-lead ECGs into single-channel 2D images and applying a 2D selective scan will reliably capture the global context and complex spatial relationships needed for accurate multi-label classification.

What would settle it

An independent test set in which HWMamba fails to improve Challenge Score or Subset Accuracy over the previous leading methods while the Macro AUROC gap remains small.
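Subset Accuracy is the strictest of the metrics named here: a recording counts as correct only when every one of its labels is predicted correctly. A minimal sketch of the standard exact-match definition follows; the function name is illustrative.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of recordings whose full predicted label vector matches exactly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of rows")
    if not y_true:
        return 0.0
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)
```

For example, with three recordings of which one has a single wrong label, the score is 2/3 even though most individual labels are right.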

Figures

Figures reproduced from arXiv: 2605.17875 by Dongryeol Ryu, Huawei Jiang, Husna Mutahira, Jiahang Li, Juneho Yi, Mannan Saeed Muhammad, Shibo Wei, Vladimir Shin, Wonyoung Park.

Figure 1
Performance Comparison of Algorithms (Hexagon Radar Chart). Adjacent text in the paper: … inter- and intra-lead relationships within the 12-lead configuration. Accordingly, HWMamba adopts a hierarchical structure and incorporates a 2D Selective Scan (SS2D) mechanism to efficiently process 2D feature maps. The proposed method was evaluated using publicly available datasets from the PhysioNet/Computing in Cardiology (CinC) Challenge 2021. …
Figure 2
HWMamba architecture and submodules. (a) Full model framework; (b) HWMamba block; (c) Classifier layer; (d) Gated MLP; and (e) Downsampling module. Adjacent text in the paper: An SS2D approach based on VMamba is adopted to model spatial dependencies, as the proposed method represents the 12-lead ECG as an image to generate a 2D feature map. Depthwise Conv2d is employed to reduce computational cost by decomposing standard convolution …
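The cost argument for depthwise convolution can be checked with a weight count. The sketch below compares a standard k×k convolution with a depthwise k×k convolution followed by a 1×1 pointwise convolution. Whether HWMamba pairs its depthwise layer with a pointwise one is an assumption here, and biases are ignored.

```python
def conv2d_params(c_in, c_out, k):
    # Standard convolution: every output channel sees every input channel.
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    depthwise = c_in * k * k  # one k x k filter per input channel
    pointwise = c_in * c_out  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise
```

For 64 input channels, 128 output channels and a 3×3 kernel, this gives 73,728 weights for the standard layer against 8,768 for the separable pair.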
Figure 3
Comparison of data sequencing strategies for Selective State Space Models (S6) in Mamba, Vision Mamba and VMamba architectures. Adjacent text in the paper (Dataset Description): The experimental data used in this study are sourced from the PhysioNet/CinC Challenge 2021 [33, 34] and consist of 88,253 multi-lead ECGs from seven international institutions. These recordings cover 26 heart conditions (25 abnormalities plus sinus rhythm) and prov…
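The sequencing strategy that VMamba uses for 2D inputs can be sketched as four flattening orders of the same grid: row-major, column-major, and the reverse of each. The helper below only generates the visiting orders; its name is illustrative, and the exact scan configuration inside HWMamba is an assumption.

```python
def cross_scan_orders(height, width):
    """Return four visiting orders over an H x W grid as lists of (row, col)."""
    row_major = [(r, c) for r in range(height) for c in range(width)]
    col_major = [(r, c) for c in range(width) for r in range(height)]
    # Forward and backward passes along both axes.
    return [row_major, col_major, row_major[::-1], col_major[::-1]]
```

Each order yields a 1D sequence that a selective scan can process; merging the four outputs gives every position context from all directions. On a lead-by-time ECG image, the row-major order follows time within a lead, while the column-major order moves across leads at a fixed time.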
Figure 4
Figure 5
Impact of Threshold Source on Various Performance Metrics Across Five Models
Figure 6
Graph of the Function f(x) = e x−1 x
Original abstract

The accurate automated diagnosis of cardiac abnormalities from 12-lead electrocardiograms (ECGs) is critical for managing cardiovascular disease. However, detecting concurrent conditions remains a challenge for traditional deep learning models, which often have limited ability to model the long-range dependencies inherent in ECG signals. This manuscript proposes HexagonalWarriorMamba (HWMamba), a framework built on the Mamba architecture that processes 12-lead ECGs as single-channel 2D images rather than conventional 1D time series. By integrating a hierarchical architecture with a 2D Selective Scan mechanism, HWMamba is designed to model global context and complex spatial relationships within the data. The model is evaluated on the PhysioNet/Computing in Cardiology Challenge 2021 dataset, which includes 26 diagnostic labels and comprises recordings collected from seven institutions across four countries and three continents. Results demonstrate that HWMamba outperforms current state-of-the-art (SOTA) methods across five key threshold-dependent metrics, including Challenge Score and Subset Accuracy. These improvements provide a balance between strong discriminative capability and effective threshold selection derived from the training data, while maintaining near-SOTA performance in Macro AUROC. This Hexagonal Warrior performance, reflecting consistent performance across multiple evaluation dimensions, positions HWMamba as a robust and versatile approach for multi-label ECG classification.
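Macro AUROC, the one threshold-free metric named in the abstract, averages a per-label AUROC with equal weight per label. The sketch below uses the pairwise (Mann-Whitney) form of AUROC and skips labels for which only one class is present; that skipping rule and the function names are assumptions for illustration.

```python
def auroc(scores, labels):
    """AUROC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None  # undefined when one class is absent
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0  # ties count as half
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_auroc(score_rows, label_rows):
    """Unweighted mean of per-label AUROC over labels where it is defined."""
    per_label = []
    for j in range(len(score_rows[0])):
        value = auroc([r[j] for r in score_rows], [r[j] for r in label_rows])
        if value is not None:
            per_label.append(value)
    if not per_label:
        raise ValueError("AUROC is undefined for every label")
    return sum(per_label) / len(per_label)
```

Because AUROC depends only on the ranking of scores, a model can be near the top on Macro AUROC and still differ substantially on threshold-dependent metrics.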

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces HexagonalWarriorMamba (HWMamba), a Mamba-based framework that converts 12-lead ECG signals into single-channel 2D images and processes them with a hierarchical architecture incorporating a 2D Selective Scan mechanism. The goal is to better model global context and complex spatial relationships for multi-label classification of 26 cardiac abnormalities. Evaluation is performed on the PhysioNet/Computing in Cardiology Challenge 2021 dataset (recordings from seven institutions across four countries), with claims of outperformance over SOTA methods on five threshold-dependent metrics including Challenge Score and Subset Accuracy, alongside near-SOTA Macro AUROC.

Significance. If the performance claims are robustly supported, the work could contribute to ECG analysis by showing how 2D image-based representations paired with selective scan mechanisms can improve handling of concurrent conditions through better inter-lead spatial modeling. The focus on threshold-dependent metrics derived from training data adds practical relevance for deployment.

major comments (2)
  1. The manuscript provides no equations, pseudocode, or detailed description of the 12-lead ECG to single-channel 2D image conversion step (including any hexagonal or grid layout and preservation of 500 Hz temporal sampling or standard lead ordering). This is load-bearing for the central claim, as outperformance on Challenge Score and Subset Accuracy is attributed to the hierarchical 2D Selective Scan capturing spatial relationships that 1D approaches miss.
  2. No ablation results are presented comparing HWMamba against 1D Mamba variants or standard 2D CNN baselines on the same PhysioNet/CinC 2021 data. Without these, the reported gains cannot be confidently isolated to the 2D image conversion and 2D scan mechanism rather than hyperparameter choices or other factors.
minor comments (1)
  1. The abstract would be strengthened by including specific numerical values for the five threshold-dependent metrics to allow immediate assessment of the magnitude of improvement.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments on our manuscript. We address each major point below and will revise the manuscript to incorporate clarifications and additional analyses as outlined.

Point-by-point responses
  1. Referee: The manuscript provides no equations, pseudocode, or detailed description of the 12-lead ECG to single-channel 2D image conversion step (including any hexagonal or grid layout and preservation of 500 Hz temporal sampling or standard lead ordering). This is load-bearing for the central claim, as outperformance on Challenge Score and Subset Accuracy is attributed to the hierarchical 2D Selective Scan capturing spatial relationships that 1D approaches miss.

    Authors: We agree that the current description of the ECG-to-image conversion is insufficiently detailed for full reproducibility and to support the central claims. In the revised manuscript we will add a dedicated methods subsection containing the explicit equations for the hexagonal grid layout, the mapping from the 12 standard leads to the 2D single-channel representation, and the preservation of the original 500 Hz temporal sampling rate together with lead ordering. Pseudocode for the full conversion pipeline will also be included. revision: yes

  2. Referee: No ablation results are presented comparing HWMamba against 1D Mamba variants or standard 2D CNN baselines on the same PhysioNet/CinC 2021 data. Without these, the reported gains cannot be confidently isolated to the 2D image conversion and 2D scan mechanism rather than hyperparameter choices or other factors.

    Authors: We concur that direct ablations against 1D Mamba and 2D CNN baselines on the identical dataset and protocol would strengthen isolation of the contribution of the 2D image representation and 2D selective scan. While the main experiments already compare against a broad set of published SOTA methods (some of which are 1D or 2D CNN-based), we will add the requested ablation studies in the revision, reporting Challenge Score, Subset Accuracy, and Macro AUROC for the 1D Mamba and 2D CNN variants under the same training and evaluation conditions. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical claims rest on external benchmark

Full rationale

The paper proposes HWMamba as an architectural design choice (12-lead ECGs converted to single-channel 2D images processed via hierarchical 2D Selective Scan) and reports empirical outperformance on the PhysioNet/CinC 2021 dataset. All key metrics (Challenge Score, Subset Accuracy, Macro AUROC) are computed against fixed external ground-truth labels collected across institutions, not against quantities fitted or defined inside the model. No equations, self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided text. The derivation chain is therefore self-contained: model design is an ansatz, superiority is measured externally.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities beyond the high-level architectural description; the central claim rests on the unverified assumption that the 2D-Mamba design captures the necessary spatial relationships.

pith-pipeline@v0.9.0 · 5803 in / 1255 out tokens · 50926 ms · 2026-05-20T11:51:14.416411+00:00 · methodology


Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages · 2 internal anchors

  1. [1]

    Roth, G. A. et al. Global, regional, and national burden of cardiovascular diseases for 10 causes, 1990 to 2015. J. Am. Coll. Cardiol. 70, 1–25 (2017)

    Prieto-Avalos, G. et al. Wearable devices for physical monitoring of heart: a review. Biosensors 12, 292 (2022)

  2. [2]

    Ribeiro, A. H. et al. Automatic diagnosis of the 12-lead ECG using a deep neural network. Nat. Commun. 11, 1760 (2020)

  3. [3]

    Siontis, K. C., Noseworthy, P. A., Attia, Z. I. & Friedman, P. A. Artificial intelligence-enhanced electrocardiography in cardiovascular disease management. Nat. Rev. Cardiol. 18, 465–478 (2021)

  4. [4]

    Abrar, M. F. et al. Automated heart disease detection using Swin Transformer and ECG signal processing: a high-accuracy approach. Sci. Reports 15, 39338 (2025)

  5. [5]

    Dhandapani, S., Somasundaram, H. & Angamuthu, T. Hybrid deep learning framework for heart disease prediction using ECG signal images. Sci. Reports 15, 33922 (2025)

  6. [6]

    Zhao, Y., Cheng, J., Zhang, P. & Peng, X. ECG classification using deep CNN improved by wavelet transform. Comput. Mater. Continua (2020)

  7. [7]

    Che, C., Zhang, P., Zhu, M., Qu, Y. & Jin, B. Constrained transformer network for ECG signal processing and arrhythmia classification. BMC Med. Informatics Decis. Mak. 21, 184 (2021)

  8. [8]

    Ansari, M. Y. et al. A survey of transformers and large language models for ECG diagnosis: advances, challenges, and future directions. Artif. Intell. Rev. 58, 261 (2025)

  9. [9]

    Gu, A. & Dao, T. Mamba: Linear-time sequence modeling with selective state spaces. In First Conference on Language Modeling (2024)

  10. [10]

    Zhang, Z. et al. Switch-umamba: Dynamic scanning vision mamba unet for medical image segmentation. Med. Image Analysis 103792 (2025)

  11. [11]

    Liu, Y. et al. VMamba: Visual state space model. Adv. Neural Inf. Process. Syst. 37, 103031–103063 (2024)

  12. [12]

    Nejedly, P. et al. Classification of ECG using ensemble of residual CNNs with attention mechanism. In 2021 Computing in Cardiology (CinC), vol. 48, 1–4 (IEEE, 2021)

  13. [13]

    Nejedly, P. et al. Classification of ECG using ensemble of residual CNNs with or without attention mechanism. Physiol. Meas. 43, 044001 (2022)

  14. [14]

    Jiang, H., Mutahira, H., Wei, S. & Muhammad, M. S. ECG-Mamba: Cardiac abnormality classification with non-uniform-mix augmentation on 12-lead ECGs. IEEE J. Transl. Eng. Health Med. 13, 461–470, DOI: 10.1109/JTEHM.2025.3613609 (2025)

  15. [15]

    Jiang, H., Mutahira, H., Huang, G. & Muhammad, M. S. One dimensional CNN ECG Mamba for multilabel abnormality classification in 12 lead ECG (2025). 2510.13046

  16. [16]

    Liu, Z. et al. Swin Transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 10012–10022 (2021)

  17. [17]

    Liu, Z. et al. A ConvNet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 11976–11986 (2022)

  18. [18]

    Li, J. et al. An electrocardiogram foundation model built on over 10 million recordings. NEJM AI 2, AIoa2401033 (2025)

  19. [19]

    Sun, L., Li, C., Ren, Y. & Zhang, Y. A multitask dynamic graph attention autoencoder for imbalanced multilabel time series classification. IEEE Trans. Neural Netw. Learn. Syst. 35, 11829–11842 (2024)

  20. [20]

    Zhang, F.-T. et al. Association between complete right bundle branch block and atrial fibrillation development. Ann. Noninvasive Electrocardiol. 27, e12966 (2022)

  21. [21]

    Ran, S. et al. Label correlation embedding guided network for multi-label ECG arrhythmia diagnosis. Knowledge-Based Syst. 270, 110545 (2023)

  22. [22]

    Hwang, S., Cha, J., Heo, J., Cho, S. & Park, Y. Multi-label abnormality classification from 12-lead ECG using a 2D residual U-Net. In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2265–2269 (IEEE, 2024)

  23. [23]

    Le, D., Truong, S., Brijesh, P., Adjeroh, D. A. & Le, N. scl-st: Supervised contrastive learning with semantic transformations for multiple lead ECG arrhythmia classification. IEEE J. Biomed. Health Inform. 27, 2818–2828 (2023)

  24. [24]

    Zhu, L. et al. Vision Mamba: efficient visual representation learning with bidirectional state space model. In Proceedings of the 41st International Conference on Machine Learning, 62429–62442 (2024)

  25. [25]

    Elyamani, H. A., Salem, M. A., Melgani, F. & Yhiea, N. Deep residual 2D convolutional neural network for cardiovascular disease classification. Sci. Reports 14, 22040 (2024)

  26. [26]

    Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, 234–241 (Springer, 2015)

    Kalman, R. E. et al. A new approach to linear filtering and prediction problems. J. Basic Eng. 82, 35–45 (1960)

  27. [27]

    Natarajan, A. et al. A wide and deep transformer neural network for 12-lead ECG classification. In 2020 Computing in Cardiology, 1–4 (IEEE, 2020)

  28. [28]

    Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1251–1258 (2017)

  29. [29]

    Liu, H., Dai, Z., So, D. & Le, Q. V. Pay attention to MLPs. Adv. Neural Inf. Process. Syst. 34, 9204–9215 (2021)

  30. [30]

    De, S. et al. Griffin: Mixing gated linear recurrences with local attention for efficient language models. arXiv preprint arXiv:2402.19427 (2024)

  31. [31]

    Reyna, M. A. et al. Will Two Do? Varying Dimensions in Electrocardiography: The PhysioNet/Computing in Cardiology Challenge 2021. PhysioNet. Available from: https://physionet.org/content/challenge-2021/1.0.3/ (2021)

  32. [32]

    Reyna, M. A. et al. Will two do? Varying dimensions in electrocardiography: the PhysioNet/Computing in Cardiology Challenge 2021. In 2021 Computing in Cardiology (CinC), vol. 48, 1–4 (IEEE, 2021)

  33. [33]

    Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. In International Conference on Learning Representations (ICLR) (2015)

  34. [34]

    Loshchilov, I. & Hutter, F. SGDR: Stochastic gradient descent with warm restarts. In International Conference on Learning Representations (2017)

  35. [35]

    Dosovitskiy, A. et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)

  36. [36]

    Alday, E. A. P. et al. Classification of 12-lead ECGs: the PhysioNet/Computing in Cardiology Challenge 2020. Physiol. Meas. 41, 124003 (2020)

    Author contributions statement: H.J. conceived the original idea, designed and performed the simulations, and wrote the main manuscript. V.S. and M.S.M. developed and finalized the mathematical modeling for the S...