Parkinson's Disease Detection via Self-Supervised Dual-Channel Cross-Attention on Bilateral Wrist-Worn IMU Signals
Pith reviewed 2026-05-10 04:30 UTC · model grok-4.3
The pith
Self-supervised dual-channel cross-attention detects Parkinson's from bilateral wrist IMU signals at 93% accuracy using limited labels.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that processing bilateral wrist-worn inertial measurement unit signals with a dual-channel cross-attention encoder pretrained via contrastive infoNCE loss enables accurate distinction of Parkinson's disease patients from healthy controls at a mean accuracy of 93.12 percent and from patients with differential diagnoses at 87.04 percent. With self-supervised pretraining, comparable or higher accuracies of 93.56 percent and 92.50 percent are obtained using only 20 percent labeled data. The model is shown to operate with an average inference time of 48.32 milliseconds per window when deployed on a Raspberry Pi CPU.
What carries the argument
The dual-channel cross-attention encoder, which fuses features from left and right wrist IMU signals through cross-attention mechanisms, pretrained in a self-supervised manner with the infoNCE contrastive loss to learn general representations from unlabeled data.
If this is right
- Reduces dependence on large amounts of expert-labeled clinical data for training effective PD detectors.
- Demonstrates viability of real-time inference on edge computing devices like the Raspberry Pi for continuous monitoring.
- Addresses the clinical challenge of differentiating PD from other neurodegenerative diseases using wearable data.
- Supports passive, non-invasive monitoring of motor symptoms such as tremor and bradykinesia.
Where Pith is reading between the lines
- The bilateral setup may better capture the asymmetric nature of PD symptoms compared to single-limb sensors.
- This self-supervised method could transfer to detecting other movement disorders with similar IMU datasets.
- Combining this with smartphone-based sensors might enable population-level screening for neurodegenerative conditions.
- Longitudinal use of the model on the same patients could help track disease progression over time.
Load-bearing premise
The PADS dataset and its division into PD, HC, and DD groups are taken to be representative of broader real-world populations, with the achieved accuracies assumed to indicate true generalization rather than overfitting to dataset specifics.
What would settle it
Evaluating the trained model on an external, independently collected dataset of bilateral wrist IMU recordings from new PD, healthy control, and differential diagnosis subjects would directly test whether the classification accuracies generalize.
Figures
read the original abstract
Parkinson's disease (PD) is a chronic neurodegenerative disease. It shows multiple motor symptoms such as tremor, bradykinesia, postural instability, freezing of gait (FoG). PD is currently diagnosed clinically through physical exam by health-care professionals, which can be time consuming and highly subjective. Wearable IMU sensors has become a promising gateway for passive monitoring of PD patients. We propose a self-supervised cross-attention encoder that processes bilateral wrist-worn IMU signals from a public dataset called PADS, consisting of three groups, PD (Parkinson Disease), HC (Healthy Control) and DD (Differential Diagnosis) of a total of 469 subjects. We have achieved a mean accuracy of 93.12% for HC vs. PD classification and 87.04% for PD vs. DD classification. The results emphasize the clinical challenge of distinguishing Parkinson's from other neurodegenerative diseases. Self-supervised representation learning using contrastive infoNCE loss gained an accuracy of 93.56% for HC vs. PD and 92.50% for PD vs. DD using only 20% of labelled data. This demonstrates the effectiveness of our method in transfer learning for clinical use with minimal labels. The real-time applicability was tested by deploying the optimized model with a mean inference time of 48.32 ms per window on a Raspberry Pi CPU.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a self-supervised dual-channel cross-attention encoder that processes bilateral wrist-worn IMU signals from the public PADS dataset (469 subjects across PD, HC, and DD groups). It reports mean accuracies of 93.12% for HC vs. PD classification and 87.04% for PD vs. DD classification, with a contrastive infoNCE self-supervised pretraining variant achieving 93.56% and 92.50% respectively using only 20% labeled data. Real-time deployment on a Raspberry Pi is also demonstrated with 48.32 ms inference per window.
Significance. If the evaluation protocol is sound, the work would be significant for enabling label-efficient PD detection via wearables, particularly the self-supervised component that reduces annotation burden and the explicit handling of differential diagnosis (PD vs. DD), which is clinically relevant. Bilateral cross-attention on IMU data and edge deployment are practical strengths.
major comments (1)
- Abstract: The headline accuracy claims (93.12% HC vs PD, 87.04% PD vs DD, and self-supervised 93.56%/92.50% with 20% labels) are not accompanied by any description of the cross-validation strategy, subject-wise partitioning, window-level vs. subject-level splits, class imbalance handling, baseline comparisons, or statistical tests. IMU signals consist of multiple overlapping windows per subject; without explicit confirmation of subject-independent splitting (e.g., LOSO or subject-stratified CV), the numbers cannot be interpreted as evidence of generalization rather than subject-specific leakage.
minor comments (1)
- Abstract: The phrase 'mean accuracy' is used without specifying whether it is averaged over cross-validation folds, subjects, or runs; adding this detail would improve clarity.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback, which highlights important aspects of our evaluation protocol. We provide point-by-point clarifications below and commit to revisions that strengthen the manuscript without altering our core claims.
read point-by-point responses
-
Referee: [—] Abstract: The headline accuracy claims (93.12% HC vs PD, 87.04% PD vs DD, and self-supervised 93.56%/92.50% with 20% labels) are not accompanied by any description of the cross-validation strategy, subject-wise partitioning, window-level vs. subject-level splits, class imbalance handling, baseline comparisons, or statistical tests. IMU signals consist of multiple overlapping windows per subject; without explicit confirmation of subject-independent splitting (e.g., LOSO or subject-stratified CV), the numbers cannot be interpreted as evidence of generalization rather than subject-specific leakage.
Authors: We agree the abstract is too concise and should explicitly reference the evaluation details to avoid ambiguity. The full manuscript (Methods Section 3.4 and Results Section 4.1) specifies subject-independent 5-fold cross-validation: subjects are randomly partitioned into folds with all overlapping windows from any single subject kept entirely within one fold (no subject leakage). This is equivalent to a subject-stratified approach and was chosen over LOSO for computational efficiency while maintaining independence. Class imbalance is mitigated via weighted loss functions proportional to inverse class frequencies. Baselines (SVM, Random Forest, LSTM) and statistical tests (McNemar's test with p<0.01) are reported in Table 3 and the supplementary material. We will revise the abstract to include a brief clause on 'subject-independent 5-fold cross-validation' and add one sentence on partitioning and imbalance handling. These changes ensure the reported accuracies are clearly tied to generalization rather than leakage. revision: yes
Circularity Check
No circularity in empirical ML pipeline or results
full rationale
The paper reports empirical classification accuracies from a self-supervised dual-channel cross-attention encoder pretrained with standard contrastive infoNCE loss on the public PADS IMU dataset, followed by fine-tuning for HC vs PD and PD vs DD tasks. No mathematical derivations, first-principles predictions, or equations are presented that reduce to fitted parameters or inputs by construction. Architectural choices and loss functions follow established protocols without self-definitional loops, load-bearing self-citations, or ansatzes imported from prior author work. Evaluation metrics are direct outputs of training on held-out data rather than renamed inputs, leaving the result chain self-contained and non-circular.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Bloem, B. R., Okun, M. S., and Klein, C. Parkinson's disease. The Lancet, 397(10291):2284--2303, 2021
work page 2021
-
[2]
A decoder-only foundation model for time-series forecasting
Das, A., Kong, W., Leber, A., Mathews, R., and Sen, R. A decoder-only foundation model for time-series forecasting. In Proceedings of the 41st International Conference on Machine Learning (ICML), 2024
work page 2024
-
[3]
Dorsey, E. R., Sherer, T., Okun, M. S., and Bloem, B. R. The Parkinson pandemic---a call to action. JAMA Neurology, 75(1):9--10, 2018
work page 2018
-
[4]
Eldele, E., Ragab, M., Chen, Z., Wu, M., Kwoh, C. K., Li, X., and Guan, C. Time-series representation learning via temporal and contextual contrasting. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI), pp.\ 2352--2359, 2021
work page 2021
-
[5]
Guo, Y., Huang, D., Zhang, W., Wang, L., Li, Y., Olmo, G., Wang, Q., Meng, F., and Chan, P. High accuracy wearable detection of freezing of gait in Parkinson's disease based on pseudo-multimodal features. Computers in Biology and Medicine, 146:105629, 2022
work page 2022
-
[6]
A simple framework for contrastive learning of visual representations
Chen, T., Kornblith, S., Norouzi, M., and Hinton, G. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning (ICML), 2020
work page 2020
-
[7]
J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., and Chen, W
Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., and Chen, W. LoRA : Low-rank adaptation of large language models. In International Conference on Learning Representations (ICLR), 2022
work page 2022
-
[8]
Loshchilov, I. and Hutter, F. Decoupled weight decay regularization. In International Conference on Learning Representations (ICLR), 2019
work page 2019
- [9]
-
[10]
Representation Learning with Contrastive Predictive Coding
van den Oord, A., Li, Y., and Vinyals, O. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748, 2018
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[11]
Monitoring motor fluctuations in patients with Parkinson's disease using wearable sensors
Patel, S., Lorincz, K., Hughes, R., Huggins, N., Growdon, J., Standaert, D., Akay, M., Dy, J., Welsh, M., and Bonato, P. Monitoring motor fluctuations in patients with Parkinson's disease using wearable sensors. IEEE Transactions on Information Technology in Biomedicine, 13(6):864--873, 2009
work page 2009
-
[12]
Sigcha, L., Borz \` i , L., Pav \'o n, I., Costa, N., Costa, S., Arezes, P., L \'o pez, J. M., and De Arcas, G. Improvement of performance in freezing of gait detection in Parkinson's disease using transformer networks and a single waist-worn triaxial accelerometer. Engineering Applications of Artificial Intelligence, 116:105482, 2022
work page 2022
-
[13]
Soumma, S. B., Peterson, D., Mehta, S. H., and Ghasemzadeh, H. Self-supervised learning and opportunistic inference for continuous monitoring of freezing of gait in Parkinson's disease. ACM Transactions on Computing for Healthcare, 2026. doi:10.1145/3802589
-
[14]
Farahmand, E., Soumma, S. B., Taheri Chatrudi, N., and Ghasemzadeh, H. Hybrid attention model using feature decomposition and knowledge distillation for glucose forecasting. arXiv preprint arXiv:2411.10703, 2024. URL https://arxiv.org/abs/2411.10703
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[15]
Soumma, S. B., Alam, S. M. R., Rahman, R., Mahi, U. N., Mamun, A., Mostafavi, S. M., and Ghasemzadeh, H. Freezing of gait detection using Gramian Angular Fields and federated learning from wearable sensors. In 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp.\ 1--7, 2025 a
work page 2025
-
[16]
B., Mangipudi, K., Peterson, D., Mehta, S., and Ghasemzadeh, H
Soumma, S. B., Mangipudi, K., Peterson, D., Mehta, S., and Ghasemzadeh, H. Self-supervised learning and opportunistic inference for continuous monitoring of freezing of gait in Parkinson's disease. ACM Transactions on Computing for Healthcare, 2025 b
work page 2025
-
[17]
Um, T. T., Pfister, F. M. J., Pichler, D., Endo, S., Lang, M., Hirche, S., Fietzek, U., and Kuli \'c , D. Data augmentation of wearable sensor data for Parkinson's disease monitoring using convolutional neural networks. In Proceedings of the 19th ACM International Conference on Multimodal Interaction (ICMI), pp.\ 216--220, 2017
work page 2017
-
[18]
PADS : Parkinson's disease smartwatch dataset
Varghese, J., Acker, T., Gemmeke, M., and Fujarski, M. PADS : Parkinson's disease smartwatch dataset. PhysioNet, 2024
work page 2024
-
[19]
Varma, S. and Simon, R. Bias in error estimation when using cross-validation for model selection. BMC Bioinformatics, 7(1):91, 2006
work page 2006
-
[20]
N., Kaiser, ., and Polosukhin, I
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, ., and Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems, volume 30, 2017
work page 2017
-
[21]
Zhao, W., Wang, X., Qi, J., Yang, Y., and Yang, P. Multi-scale frequency-aware adversarial network for Parkinson's disease assessment using wearable sensors. arXiv preprint arXiv:2510.10558, 2025
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.