BasketHAR: A Multimodal Dataset for Human Activity Recognition and Sport Analysis in Basketball Training Scenarios
Pith reviewed 2026-05-10 06:33 UTC · model grok-4.3
The pith
BasketHAR supplies a multimodal dataset of professional basketball actions for human activity recognition and sports analysis.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
BasketHAR is a novel multimodal HAR dataset tailored for basketball training that covers a diverse set of professional-level actions. It combines motion data from inertial measurement units (accelerometers and gyroscopes), angular velocity, magnetic field, heart rate, skin temperature, and synchronized video recordings, together with a baseline multimodal alignment method whose results underscore the dataset's complexity and its suitability for advanced HAR tasks and sports analytics.
What carries the argument
The BasketHAR dataset itself, which combines multiple sensor streams from IMUs, physiological monitors, and video for basketball-specific activities, supported by a baseline alignment method for multimodal data.
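The paper does not detail its synchronization pipeline, but the core problem it implies — matching sensor streams sampled at different rates onto a common timeline — can be sketched minimally. The field names and sampling rates below are illustrative assumptions, not taken from BasketHAR:

```python
from bisect import bisect_left

def align_nearest(base_ts, other_ts, other_vals):
    """For each timestamp in base_ts, pick the sample from the other
    stream whose timestamp is closest (nearest-neighbour alignment).
    Both timestamp lists must be sorted ascending."""
    aligned = []
    for t in base_ts:
        i = bisect_left(other_ts, t)
        # Candidates: the sample just before and just after t.
        cands = [j for j in (i - 1, i) if 0 <= j < len(other_ts)]
        j = min(cands, key=lambda k: abs(other_ts[k] - t))
        aligned.append(other_vals[j])
    return aligned

# Hypothetical streams: a fast IMU timeline vs. sparse heart-rate samples.
imu_ts = [0.00, 0.01, 0.02, 0.03, 0.04]
hr_ts = [0.000, 0.025]
hr_vals = [82, 84]
print(align_nearest(imu_ts, hr_ts, hr_vals))  # [82, 82, 84, 84, 84]
```

A real pipeline would also have to handle clock drift between devices and the much coarser frame rate of the video stream, which is presumably where the paper's dedicated alignment method comes in.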
If this is right
- Researchers can develop and test HAR models specifically for sports performance analysis using the provided data.
- Training sessions can be analyzed to produce specialized performance reports based on recognized actions.
- The baseline method offers a reproducible starting point for comparing multimodal fusion approaches in HAR.
- The dataset's professional-level actions make it more complex than standard basic-activity datasets.
Where Pith is reading between the lines
- Models trained on this data could enable real-time coaching tools that detect technique errors during practice.
- Similar datasets might be created for other sports to expand specialized HAR applications.
- The public availability allows for community-driven improvements in alignment techniques and activity classification.
Load-bearing premise
The recordings from the sensors and video accurately capture and represent professional-level basketball activities in a way that is generalizable to real training scenarios.
What would settle it
Demonstrating that standard HAR models achieve similar performance on BasketHAR as on basic activity datasets without needing specialized multimodal methods would challenge the claim of its suitability for advanced tasks.
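Such a test reduces to comparing a standard metric, such as macro-averaged F1, for the same model family across BasketHAR and a basic-activity dataset. As a sketch (pure Python, with illustrative labels that are not taken from the dataset's actual taxonomy), macro F1 weights rare action classes equally with frequent ones:

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: per-class F1 scores averaged with equal weight,
    so rare action classes count as much as frequent ones."""
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Hypothetical predictions on four windows of basketball actions.
y_true = ["shot", "pass", "shot", "dribble"]
y_pred = ["shot", "shot", "shot", "dribble"]
print(round(macro_f1(y_true, y_pred), 3))  # 0.6
```

If a plain unimodal classifier scored comparably on BasketHAR under this metric as on, say, a walking/standing dataset, the "advanced tasks" framing would be weakened; a large gap would support it.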
Original abstract
Human Activity Recognition (HAR) involves the automatic identification of user activities and has gained significant research interest due to its broad applicability. Most HAR systems rely on supervised learning, which necessitates large, diverse, and well-annotated datasets. However, existing datasets predominantly focus on basic activities such as walking, standing, and stair navigation, limiting their utility in specialized contexts like sports performance analysis. To address this gap, we present BasketHAR, a novel multimodal HAR dataset tailored for basketball training, encompassing a diverse set of professional-level actions. BasketHAR includes comprehensive motion data from inertial measurement units (accelerometers and gyroscopes), angular velocity, magnetic field, heart rate, skin temperature, and synchronized video recordings. We also provide a baseline multimodal alignment method to benchmark performance. Experimental results underscore the dataset's complexity and suitability for advanced HAR tasks. Furthermore, we highlight its potential applications in the analysis of basketball training sessions and in the generation of specialized performance reports, representing a valuable resource for future research in HAR and sports analytics. The dataset are publicly accessible at https://huggingface.co/datasets/Xian-Gao/BasketHAR licensed under Apache License 2.0.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces BasketHAR, a novel multimodal dataset for human activity recognition (HAR) in basketball training scenarios. It collects synchronized streams from inertial measurement units (accelerometers, gyroscopes, angular velocity, magnetic field), physiological sensors (heart rate, skin temperature), and video recordings covering a diverse set of professional-level basketball actions. The authors also provide a baseline multimodal alignment method and report experimental results intended to demonstrate the dataset's complexity and suitability for advanced HAR tasks and sports analytics. The dataset is released publicly on Hugging Face under the Apache 2.0 license.
Significance. If the data collection protocols, synchronization accuracy, and annotation quality are as described, BasketHAR would address a clear gap in specialized sports-focused HAR datasets, enabling new work on performance analysis, training optimization, and multimodal modeling in dynamic, high-variance environments. The public release with an open license directly supports reproducibility and community extension.
Major comments (1)
- §4 (Baseline Experiments): The multimodal alignment method is positioned as a benchmark, yet the manuscript provides no quantitative metrics, ablation studies, or comparisons against unimodal baselines or prior alignment techniques; without these, the experimental results cannot fully substantiate the claim that they 'underscore the dataset's complexity and suitability for advanced HAR tasks.'
Minor comments (2)
- Abstract: The sentence 'The dataset are publicly accessible' contains a subject-verb agreement error and should read 'The dataset is publicly accessible.'
- §2 (Related Work): The discussion of existing HAR datasets would benefit from a table summarizing key attributes (modalities, activity types, scale) to more clearly position BasketHAR's novelty.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript and the positive overall assessment of BasketHAR. We address the major comment below and will revise the paper accordingly to strengthen the experimental section.
Point-by-point responses
Referee: §4 (Baseline Experiments): The multimodal alignment method is positioned as a benchmark, yet the manuscript provides no quantitative metrics, ablation studies, or comparisons against unimodal baselines or prior alignment techniques; without these, the experimental results cannot fully substantiate the claim that they 'underscore the dataset's complexity and suitability for advanced HAR tasks.'
Authors: We agree that the current description of the baseline multimodal alignment method in Section 4 would benefit from additional quantitative support to fully substantiate the claims regarding dataset complexity and utility. In the revised manuscript, we will expand this section to include: (1) quantitative metrics such as alignment error rates, synchronization accuracy (e.g., temporal offset statistics), and downstream HAR performance (precision, recall, F1-score) using the aligned multimodal streams; (2) ablation studies isolating the contribution of each modality (IMU, physiological, video); and (3) comparisons against standard unimodal baselines and prior alignment techniques (e.g., dynamic time warping and cross-modal attention). These additions will be supported by new tables and figures while keeping the focus on the dataset itself.
Revision: yes
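The rebuttal names dynamic time warping as one candidate alignment baseline. As an illustrative sketch of that comparison point (not the authors' method), the classic DTW dynamic program over two 1-D sensor traces is:

```python
def dtw_distance(a, b):
    """Dynamic-time-warping cost between two 1-D sequences using the
    standard O(len(a) * len(b)) dynamic program with |x - y| local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]

# A time-shifted copy of a signal aligns at zero warping cost,
# which is why DTW is a natural baseline for cross-sensor lag.
sig = [0, 1, 2, 1, 0]
shifted = [0, 0, 1, 2, 1, 0]
print(dtw_distance(sig, shifted))  # 0.0
```

A per-window DTW cost against a reference modality would give exactly the kind of alignment-error statistic the referee asks for, alongside the temporal-offset and F1 numbers promised in the rebuttal.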
Circularity Check
No significant circularity in dataset release paper
Full rationale
This is a dataset presentation paper with no mathematical derivations, fitted parameters, predictions, or load-bearing self-citations. The central contribution is the public release of BasketHAR (with IMU, physiological, and video streams) plus a baseline alignment method; the claim of utility is directly supported by the Apache 2.0 Hugging Face release and does not reduce to any internal fit or self-referential definition. No steps match the enumerated circularity patterns.
Reference graph
Works this paper leans on
- [1] Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra, and Jorge L. Reyes-Ortiz. 2013. A Public Domain Dataset for Human Activity Recognition Using Smartphones. Computational Intelligence (2013).
- [2] Sara Ashry, Tetsuji Ogawa, and Walid Gomaa. 2020. CHARM-Deep: Continuous Human Activity Recognition Model Based on Deep Neural Network Using IMU Sensors of Smartwatch. IEEE Sensors Journal 20, 15 (Aug. 2020), 8757–8770. doi:10.1109/JSEN.2020.2985374
- [3] Eduardo Casilari, Jose A. Santoyo-Ramón, and Jose M. Cano-García. 2017. UMAFall: A Multisensor Dataset for the Research on Automatic Fall Detection. Procedia Computer Science 110 (2017), 32–39. doi:10.1016/j.procs.2017.06.110
- [4] Ricardo Chavarriaga, Hesam Sagha, Alberto Calatroni, Sundara Tejaswi Digumarti, Gerhard Tröster, José Del R. Millán, and Daniel Roggen. 2013. The Opportunity Challenge: A Benchmark Database for on-Body Sensor-Based Activity Recognition. Pattern Recognition Letters 34, 15 (Nov. 2013), 2033–2042. doi:10.1016/j.patrec.2012.12.014
- [5]
- [6] Daniel Garcia-Gonzalez, Daniel Rivero, Enrique Fernandez-Blanco, and Miguel R. Luaces. 2020. A Public Domain Dataset for Real-Life Human Activity Recognition Using Smartphone Sensors. Sensors 20, 8 (April 2020), 2200. doi:10.3390/s20082200
- [7] Rohit Girdhar, Alaaeldin El-Nouby, Zhuang Liu, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, and Ishan Misra. 2023. ImageBind: One Embedding Space to Bind Them All. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, Vancouver, BC, Canada, 15180–15190. doi:10.1109/CVPR52729.2023.01457
- [8] Yu Guan and Thomas Plötz. 2017. Ensembles of Deep LSTM Learners for Activity Recognition Using Wearables. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 1, 2 (June 2017), 1–28. doi:10.1145/3090076
- [9] Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2021. LoRA: Low-Rank Adaptation of Large Language Models. doi:10.48550/arXiv.2106.09685 arXiv:2106.09685 [cs]
- [10] Wenbo Huang, Lei Zhang, Wenbin Gao, Fuhong Min, and Jun He. 2021. Shallow Convolutional Neural Networks for Human Activity Recognition Using Wearable Sensors. IEEE Transactions on Instrumentation and Measurement 70 (2021), 1–11. doi:10.1109/TIM.2021.3091990
- [11] Masaya Inoue, Sozo Inoue, and Takeshi Nishida. 2018. Deep Recurrent Neural Network for Mobile Human Activity Recognition with High Throughput. Artificial Life and Robotics 23, 2 (June 2018), 173–185. doi:10.1007/s10015-017-0422-x
- [12] Nobuo Kawaguchi, Nobuhiro Ogawa, Yohei Iwasaki, Katsuhiko Kaji, Tsutomu Terada, Kazuya Murao, Sozo Inoue, Yoshihiro Kawahara, Yasuyuki Sumi, and Nobuhiko Nishio. 2011. HASC Challenge: Gathering Large Scale Human Activity Corpus for the Real-World Activity Understandings. In Proceedings of the 2nd Augmented Human International Conference. ACM, Tokyo Japan, ...
- [13] Diederik P. Kingma and Jimmy Ba. 2017. Adam: A Method for Stochastic Optimization. doi:10.48550/arXiv.1412.6980 arXiv:1412.6980 [cs]
- [14] Jennifer R. Kwapisz, Gary M. Weiss, and Samuel A. Moore. 2011. Activity Recognition Using Cell Phone Accelerometers. ACM SIGKDD Explorations Newsletter 12, 2 (March 2011), 74–82. doi:10.1145/1964897.1964918
- [15] Seungwhan Moon, Andrea Madotto, Zhaojiang Lin, Alireza Dirafzoon, Aparajita Saraf, Amy Bearman, and Babak Damavandi. 2022. IMU2CLIP: Multimodal Contrastive Learning for IMU Motion Sensors from Egocentric Videos and Text. doi:10.48550/arXiv.2210.14395 arXiv:2210.14395 [cs]
- [16] Abdulmajid Murad and Jae-Young Pyun. 2017. Deep Recurrent Neural Networks for Human Activity Recognition. Sensors 17, 11 (Nov. 2017), 2556. doi:10.3390/s17112556
- [17] Jorge-Luis Reyes-Ortiz, Luca Oneto, Alessandro Ghio, Albert Samá, Davide Anguita, and Xavier Parra. 2014. Human Activity Recognition on Smartphones with Awareness of Basic Activities and Postural Transitions. In Artificial Neural Networks and Machine Learning – ICANN 2014, Stefan Wermter, Cornelius Weber, Włodzisław Duch, Timo Honkela, Petia Koprinkova-H...
- [18] Charissa Ann Ronao and Sung-Bae Cho. 2016. Human Activity Recognition with Smartphone Sensors Using Deep Learning Neural Networks. Expert Systems with Applications 59 (Oct. 2016), 235–244. doi:10.1016/j.eswa.2016.04.032
- [19] Swapnil Sayan Saha, Shafizur Rahman, Miftahul Jannat Rasna, A.K.M. Mahfuzul Islam, and Md. Atiqur Rahman Ahad. 2018. DU-MD: An Open-Source Human Action Dataset for Ubiquitous Wearable Sensors. In 2018 Joint 7th International Conference on Informatics, Electronics & Vision (ICIEV) and 2018 2nd International Conference on Imaging, Vision & Pattern Recogn...
- [20] doi:10.1109/ICIEV.2018.8641051
- [21] Niloy Sikder and Abdullah-Al Nahid. 2021. KU-HAR: An Open Dataset for Heterogeneous Human Activity Recognition. Pattern Recognition Letters 146 (June 2021), 46–54. doi:10.1016/j.patrec.2021.02.024
- [22] Allen Y. Yang, Roozbeh Jafari, S. Shankar Sastry, and Ruzena Bajcsy. 2009. Distributed Recognition of Human Actions Using Wearable Motion Sensor Networks. Journal of Ambient Intelligence and Smart Environments 1, 2 (2009), 103–115. doi:10.3233/AIS-2009-0016
- [23] Ming Zeng, Le T. Nguyen, Bo Yu, Ole J. Mengshoel, Jiang Zhu, Pang Wu, and Joy Zhang. 2014. Convolutional Neural Networks for Human Activity Recognition Using Mobile Sensors. In 6th International Conference on Mobile Computing, Applications and Services. 197–205. doi:10.4108/icst.mobicase.2014.257786
- [24] Mi Zhang and Alexander A. Sawchuk. 2012. USC-HAD: A Daily Activity Dataset for Ubiquitous Activity Recognition Using Wearable Sensors. In Proceedings of the 2012 ACM Conference on Ubiquitous Computing. ACM, Pittsburgh Pennsylvania, 1036–1043. doi:10.1145/2370216.2370438