arxiv: 2604.15353 · v1 · submitted 2026-04-06 · 📡 eess.SP · cs.LG

A methodology to rank importance of frequencies and channels in electromyography data with Decision Tree classifiers

Pith reviewed 2026-05-10 20:24 UTC · model grok-4.3

classification 📡 eess.SP cs.LG
keywords electromyography · EMG · decision tree · feature importance · muscle recovery · power spectral density · squat · classification

The pith

Decision trees identify a small set of key frequencies and channels that classify muscle recovery from EMG data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a methodology that applies single decision tree classifiers to power spectral density features from EMG recordings of the vastus lateralis during squats performed with different rest intervals. It ranks frequencies and channels by importance and shows that only a limited subset of these features is needed to achieve reliable classification of the recovery states. A sympathetic reader would care because this reduces the complexity of EMG analysis while preserving accuracy and providing clear, interpretable rankings suitable for medical and sports use. The approach incorporates grid search for hyperparameters and cross-validation to manage class imbalance. The outcome supports the use of streamlined models for practical muscle recovery evaluation.
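To make "ranks frequencies and channels by importance" concrete, here is a minimal sketch of how a flat per-feature importance vector could be folded into separate channel and frequency rankings. The channel-major feature layout, the array sizes, and the importance values are illustrative assumptions, not numbers from the paper.

```python
import numpy as np

# Assumed layout: one feature per (channel, frequency bin), channel-major.
n_channels, n_freqs = 3, 4

# Hypothetical importance values for illustration only (not paper results).
importances = np.array([0.00, 0.10, 0.00, 0.00,   # channel 0
                        0.05, 0.50, 0.00, 0.05,   # channel 1
                        0.00, 0.20, 0.10, 0.00])  # channel 2

grid = importances.reshape(n_channels, n_freqs)

# Total importance carried by each channel and by each frequency bin.
channel_score = grid.sum(axis=1)
freq_score = grid.sum(axis=0)

# Indices sorted from most to least important.
channel_rank = np.argsort(channel_score)[::-1]
freq_rank = np.argsort(freq_score)[::-1]
```

Summing is only one possible aggregation; taking the maximum per channel or per frequency would emphasize single strong features instead.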

Core claim

By training single decision tree classifiers on power spectral density features from EMG signals recorded during squat exercises with varying rest intervals, frequencies and channels can be ranked by their importance for distinguishing recovery periods, and a limited subset of the most informative features delivers sufficient classification accuracy.
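One way to check a "limited subset is sufficient" claim is a full-versus-reduced ablation. The sketch below uses synthetic data and scikit-learn; the subset size `k`, the tree depth, and the data itself are assumptions made for illustration, and the paper's own procedure may differ.

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(3)

# Synthetic stand-in for PSD features: 40 features, 3 rest-interval classes.
y = rng.integers(0, 3, size=400)
X = rng.normal(size=(400, 40))
X[:, 4] += 2.0 * y            # strongly informative feature
X[:, 17] += 1.5 * (y == 2)    # weakly informative feature

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Baseline: tree trained on all features.
full = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr)
f1_full = f1_score(y_te, full.predict(X_te), average="weighted")

# Reduced model: keep only the k features ranked highest on the training data.
k = 5  # assumed subset size
top_k = np.argsort(full.feature_importances_)[::-1][:k]
reduced = DecisionTreeClassifier(max_depth=5, random_state=0)
reduced.fit(X_tr[:, top_k], y_tr)
f1_reduced = f1_score(y_te, reduced.predict(X_te[:, top_k]), average="weighted")
```

Selecting the features on the training split only keeps the held-out comparison between `f1_full` and `f1_reduced` free of selection leakage.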

What carries the argument

Single decision tree classifiers applied to power spectral density features, with the trees' built-in importance scores used to rank frequencies and channels.
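As a rough illustration of this mechanism, the sketch below fits a single scikit-learn `DecisionTreeClassifier` on synthetic data and ranks features by its impurity-based `feature_importances_`. The data, the channel-major feature layout, and the hyperparameters are assumptions; the paper does not state which library or importance measure it uses.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n_samples, n_channels, n_freqs = 300, 4, 8

# Assumed layout: one PSD value per (channel, frequency bin), channel-major.
X = rng.normal(size=(n_samples, n_channels * n_freqs))
y = rng.integers(0, 3, size=n_samples)  # three synthetic rest-interval classes

# Make exactly one feature informative: channel 1, frequency bin 5.
informative = 1 * n_freqs + 5
X[:, informative] += 3.0 * y

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
importances = tree.feature_importances_  # impurity-based, sums to 1

# Ranked (channel, frequency bin, importance) triples, non-zero entries only.
order = np.argsort(importances)[::-1]
ranking = [
    (int(i) // n_freqs, int(i) % n_freqs, float(importances[i]))
    for i in order
    if importances[i] > 0
]
```

Features the tree never splits on receive an importance of exactly zero, which is why the ranking keeps only non-zero entries.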

Load-bearing premise

The feature-importance rankings produced by decision trees trained on power spectral density features from this specific vastus lateralis squat protocol and dataset accurately reflect the truly informative elements, and they generalize beyond the collected data and the particular handling of class imbalance.

What would settle it

If new EMG recordings from different subjects, muscles, or exercise protocols yield substantially different top-ranked frequencies and channels while maintaining similar classification accuracy, that would show the rankings to be specific to the original setup rather than general.
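A comparison of this kind needs some measure of how much two rankings agree. The sketch below shows two common choices, top-k Jaccard overlap and Spearman rank correlation. Both importance vectors are made-up illustrative inputs, and `top_k_jaccard` is a hypothetical helper name.

```python
import numpy as np
from scipy.stats import spearmanr


def top_k_jaccard(imp_a, imp_b, k):
    """Jaccard overlap between the k highest-importance feature indices."""
    top_a = set(np.argsort(imp_a)[::-1][:k].tolist())
    top_b = set(np.argsort(imp_b)[::-1][:k].tolist())
    return len(top_a & top_b) / len(top_a | top_b)


# Hypothetical importance vectors over the same 8 features (not paper data).
imp_original = np.array([0.40, 0.25, 0.15, 0.10, 0.05, 0.05, 0.0, 0.0])
imp_new = np.array([0.05, 0.30, 0.35, 0.0, 0.10, 0.0, 0.20, 0.0])

overlap = top_k_jaccard(imp_original, imp_new, k=3)  # top-3 sets share 2 of 4
rho, _ = spearmanr(imp_original, imp_new)            # agreement over all ranks
```

A low overlap together with similar accuracy on both datasets would point toward setup-specific rankings, as described above.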

Figures

Figures reproduced from arXiv: 2604.15353 by Albert A. Nasybullin, Maksim A. Baranov, Nursultan Abdullaev, Viacheslav V. Koshman, Vitaly A. Mahonin.

Figure 1: The figure illustrates an experimental protocol for recording mus…
Figure 2: The experimental design consists of three blocks of varying lengths…
Figure 3: The image shows the electrode placement for EMG recording on…
Figure 4: Figure illustrating the analysis pipeline for EMG signal processing…
Figure 5: The distribution of the weighted F1 average score across all trained…
Figure 6: A notable trend indicates that classification quality improves as…
Figure 7: When considering only features with non-zero importance, no…
Figure 8: The distribution of Channel-Frequency ranks across all trained…
Figure 9: The ten highest-ranked features for muscular recovery period clas…
Original abstract

This study presents a methodology for identifying the most informative frequencies and channels in electromyography (EMG) data to evaluate muscle recovery using Decision Tree classifiers. EMG signals, recorded from the vastus lateralis muscle during squat exercises, were analyzed across varying rest intervals to assess optimal recovery periods. By employing single Decision Tree classifiers, the study enhances interpretability, offering insights into feature importance - essential for applications in medical and sports settings where transparency is critical. The experimental protocol utilized a grid search for hyperparameter tuning and cross-validation to address class imbalance, ultimately achieving a reliable classification of rest intervals based on power spectral density features. The results indicate that a limited subset of highly informative features provides sufficient accuracy, suggesting that streamlined, interpretable models are effective for the evaluation of muscle recovery. This approach can guide future research in developing compact, robust models adapted to EMG-based diagnostics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. This manuscript presents a methodology for ranking the importance of frequencies and channels in EMG signals using single Decision Tree classifiers applied to power spectral density (PSD) features. EMG data is collected from the vastus lateralis during squat exercises with varying rest intervals to classify muscle recovery states. The approach employs grid search for hyperparameter tuning and cross-validation to address class imbalance, with the central claim that a limited subset of highly informative features suffices for reliable classification while enhancing model interpretability for medical and sports applications.

Significance. If the reported feature rankings prove stable and the limited-feature models achieve high accuracy with supporting quantitative evidence, this could advance interpretable, compact EMG-based diagnostics for muscle recovery assessment. Strengths include the focus on single DTs for transparency, use of cross-validation, and grid search; these align with needs for explainable AI in biomedical signal processing. However, without performance metrics the practical significance remains difficult to evaluate.

major comments (2)
  1. [Abstract] Abstract: The claims that the method 'achieves a reliable classification of rest intervals' and that 'a limited subset of highly informative features provides sufficient accuracy' are not accompanied by any reported accuracy values, confusion matrices, ablation results (full vs. reduced features), or baseline comparisons. This absence makes it impossible to verify the central claim that limited features suffice.
  2. [Methods] Methods/Results: Feature importances are obtained from single Decision Trees without aggregation across cross-validation folds, stability testing (e.g., via bootstrapping or multiple runs), or comparison to ensemble methods such as Random Forests. Single-tree importance scores are known to exhibit high variance with small data perturbations, which undermines the reliability of the frequency/channel rankings in the presence of EMG subject variability and class imbalance.
minor comments (1)
  1. [Methods] The description of PSD feature extraction (frequency bands, window length, overlap) could be expanded with explicit parameters and equations for reproducibility.
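To illustrate the kind of explicit parameters the minor comment asks for, here is a sketch of PSD feature extraction with Welch's method via `scipy.signal.welch`. The sampling rate, window type, segment length, and overlap are all assumed values chosen for the example, and the two-channel signal is synthetic; none of them come from the paper.

```python
import numpy as np
from scipy.signal import welch

fs = 1000.0                       # assumed sampling rate in Hz
t = np.arange(0, 2.0, 1.0 / fs)   # 2 s of signal
rng = np.random.default_rng(0)

# Two synthetic channels with dominant components at 80 Hz and 150 Hz.
emg = np.stack([np.sin(2 * np.pi * 80 * t), np.sin(2 * np.pi * 150 * t)])
emg = emg + 0.1 * rng.normal(size=emg.shape)

# Welch PSD per channel: Hann window, 256-sample segments, 50% overlap (assumed).
freqs, psd = welch(emg, fs=fs, window="hann", nperseg=256, noverlap=128, axis=-1)

# Flatten to one feature per (channel, frequency bin), channel-major.
features = psd.reshape(-1)

# Frequency of the strongest bin in each channel, as a sanity check.
peak_hz = freqs[np.argmax(psd, axis=1)]
```

With these settings the frequency resolution is `fs / nperseg`, about 3.9 Hz, so reporting `nperseg` and `fs` together is what makes the bin centers reproducible.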

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We address each major point below and describe the revisions we will implement to strengthen the presentation of results and the reliability of the reported feature rankings.

point-by-point responses
  1. Referee: [Abstract] Abstract: The claims that the method 'achieves a reliable classification of rest intervals' and that 'a limited subset of highly informative features provides sufficient accuracy' are not accompanied by any reported accuracy values, confusion matrices, ablation results (full vs. reduced features), or baseline comparisons. This absence makes it impossible to verify the central claim that limited features suffice.

    Authors: We agree that the abstract would be strengthened by including quantitative support for these claims. In the revised manuscript we will update the abstract to report the specific classification accuracy obtained with the reduced feature set (and the corresponding full-feature baseline), while directing readers to the Results section for the full confusion matrices, ablation study, and cross-validation details. This change will make the central claims directly verifiable without altering the manuscript's length or focus. revision: yes

  2. Referee: [Methods] Methods/Results: Feature importances are obtained from single Decision Trees without aggregation across cross-validation folds, stability testing (e.g., via bootstrapping or multiple runs), or comparison to ensemble methods such as Random Forests. Single-tree importance scores are known to exhibit high variance with small data perturbations, which undermines the reliability of the frequency/channel rankings in the presence of EMG subject variability and class imbalance.

    Authors: We deliberately selected single Decision Trees to preserve full interpretability of individual frequency and channel contributions, which is essential for the intended medical and sports applications. The existing cross-validation and grid-search procedure already provides some robustness against class imbalance. To directly address the concern about ranking stability, we will add to the revised manuscript an explicit stability analysis: feature importances will be extracted from every cross-validation fold, and we will report the mean importance and standard deviation for the top-ranked features. This will quantify the consistency of the reported rankings while retaining the transparency advantage of single trees over ensembles. revision: partial
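The per-fold stability analysis proposed in the second response could be sketched as follows. The stratified 5-fold split, the tree depth, and the synthetic imbalanced data are assumptions made for the example; the rebuttal specifies only that importances are extracted per fold and summarized by mean and standard deviation.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)

# Synthetic imbalanced labels (90 / 60 / 30) and 12 features.
y = np.repeat([0, 1, 2], [90, 60, 30])
X = rng.normal(size=(y.size, 12))
X[:, 7] += 3.0 * y  # single informative feature

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Fit one tree per training fold and keep its importance vector.
fold_importances = []
for train_idx, _ in cv.split(X, y):
    tree = DecisionTreeClassifier(max_depth=4, random_state=0)
    tree.fit(X[train_idx], y[train_idx])
    fold_importances.append(tree.feature_importances_)
fold_importances = np.array(fold_importances)  # shape (n_folds, n_features)

# Mean and sample standard deviation of each feature's importance across folds.
mean_imp = fold_importances.mean(axis=0)
std_imp = fold_importances.std(axis=0, ddof=1)
top = np.argsort(mean_imp)[::-1][:3]
```

A feature whose standard deviation is comparable to its mean importance is ranked unstably, which is the situation the referee's second major comment warns about.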

Circularity Check

0 steps flagged

No circularity: standard empirical ML pipeline on EMG PSD features

full rationale

The paper applies off-the-shelf Decision Tree classifiers to power spectral density features extracted from EMG recordings, tunes hyperparameters via grid search, and uses cross-validation to handle class imbalance. Feature importances are extracted directly from the fitted trees on the observed data splits; no equations, ansatzes, or uniqueness theorems are invoked that would reduce the reported rankings or accuracy claims back to the same fitted quantities by construction. The methodology is self-contained against external ML benchmarks and does not rely on self-citations or prior author results for its central steps.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The methodology rests on the domain assumption that decision-tree feature importances on PSD features will highlight physiologically meaningful frequencies and channels for recovery classification, plus the practical choice of grid-search-tuned hyperparameters for the specific dataset.

free parameters (1)
  • decision tree hyperparameters
    Tuned via grid search to optimize classification of rest intervals on the collected EMG data.
axioms (1)
  • domain assumption Single decision trees trained on PSD features can produce reliable and interpretable rankings of informative frequencies and channels in EMG signals.
    Invoked to justify the use of decision trees for both classification and feature importance extraction.
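The grid-search tuning listed as the free parameter above might look like the following sketch. The parameter grid, the stratified folds, the `class_weight` option, and the weighted-F1 scoring are assumptions chosen as plausible ways to handle class imbalance; the paper's actual grid is not given here.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)

# Synthetic imbalanced data: 120 / 60 / 30 samples per class, 16 features.
y = np.repeat([0, 1, 2], [120, 60, 30])
X = rng.normal(size=(y.size, 16))
X[:, 3] += 2.5 * y  # single informative feature

# Hypothetical search space for the tree hyperparameters.
param_grid = {
    "max_depth": [2, 4, 8],
    "min_samples_leaf": [1, 5, 10],
    "class_weight": [None, "balanced"],
}

# Stratified folds keep the class proportions similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    scoring="f1_weighted",
    cv=cv,
)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_  # mean weighted F1 across folds
```

Because `best_score_` is computed on the same folds used for selection, an outer held-out split would be needed for an unbiased performance estimate.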


