FLEX: A Largescale Multimodal, Multiview Dataset for Learning Structured Representations for Fitness Action Quality Assessment
Pith reviewed 2026-05-19 11:36 UTC · model grok-4.3
The pith
Multimodal multiview data with sEMG and knowledge graphs improves fitness action quality assessment
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FLEX is the first large-scale multimodal multiview dataset for fitness AQA incorporating sEMG, with expert annotations organized into a Fitness Knowledge Graph supporting compositional scoring. It enables multimodal fusion, cross-modal prediction like Video to EMG, and the FLEX-VideoQA benchmark for hierarchical queries. Baseline experiments demonstrate that multimodal inputs, multiview video, and fine-grained annotations significantly enhance AQA performance.
What carries the argument
The Fitness Knowledge Graph (FKG) that links actions, key steps, error types, and feedback to enable structured, interpretable quality assessment and compositional scoring.
If this is right
- Multimodal inputs significantly enhance AQA performance.
- Multiview video contributes to improved assessment accuracy.
- Fine-grained annotations from the knowledge graph boost results.
- New tasks such as predicting EMG signals from video are supported.
- The VideoQA benchmark promotes cross-modal reasoning in vision-language models.
Where Pith is reading between the lines
- AI-powered fitness coaching systems could provide real-time form corrections using similar multimodal setups.
- The structured annotations may facilitate transfer of quality assessment models to other physical training domains.
- Integration with wearable sensors could extend this approach to everyday exercise monitoring.
- Cross-modal learning from this data might uncover new biomechanical relationships between observed form and muscle activity.
Load-bearing premise
The data collected from 38 subjects and their expert Fitness Knowledge Graph annotations represent the diversity of skill levels, error patterns, and conditions necessary for models to generalize to real-world use.
What would settle it
A test showing that AQA models trained with only single-view RGB video perform as well as or better than those using the full multimodal and multiview FLEX data on new fitness recordings.
Figures
read the original abstract
Action Quality Assessment (AQA) -- the task of quantifying how well an action is performed -- has great potential for detecting errors in gym weight training, where accurate feedback is critical to prevent injuries and maximize gains. Existing AQA datasets, however, are limited to single-view competitive sports and RGB video, lacking multimodal signals and professional assessment of fitness actions. We introduce FLEX, the first large-scale, multimodal, multiview dataset for fitness AQA that incorporates surface electromyography (sEMG). FLEX contains over 7,500 multiview recordings of 20 weight-loaded exercises performed by 38 subjects of diverse skill levels, with synchronized RGB video, 3D pose, sEMG, and physiological signals. Expert annotations are organized into a Fitness Knowledge Graph (FKG) linking actions, key steps, error types, and feedback, supporting a compositional scoring function for interpretable quality assessment. FLEX enables multimodal fusion, cross-modal prediction -- including the novel Video$\rightarrow$EMG task -- and biomechanically oriented representation learning. Building on the FKG, we further introduce FLEX-VideoQA, a structured question-answering benchmark with hierarchical queries that drive cross-modal reasoning in vision-language models. Baseline experiments demonstrate that multimodal inputs, multiview video, and fine-grained annotations significantly enhance AQA performance. FLEX thus advances AQA toward richer multimodal settings and provides a foundation for AI-powered fitness assessment and coaching. Dataset and code are available at \href{https://github.com/HaoYin116/FLEX}{https://github.com/HaoYin116/FLEX}. Link to Project \href{https://haoyin116.github.io/FLEX_Dataset}{page}.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces FLEX, a large-scale multimodal multiview dataset for fitness action quality assessment (AQA) consisting of over 7,500 synchronized recordings (RGB video, 3D pose, sEMG, physiological signals) of 20 weight-loaded exercises performed by 38 subjects of varying skill levels. Expert annotations are structured via a Fitness Knowledge Graph (FKG) supporting compositional scoring; the work also releases the FLEX-VideoQA benchmark and reports baseline results claiming that multimodal fusion, multiview inputs, and fine-grained FKG annotations yield significant AQA performance gains over unimodal or single-view alternatives.
Significance. If the reported baseline improvements are shown to hold under subject-disjoint evaluation protocols and the 38-subject cohort adequately samples error patterns and skill variation, FLEX would constitute a valuable addition to the AQA literature by moving beyond single-view RGB sports datasets and enabling cross-modal tasks such as Video-to-EMG prediction. The provision of the FKG and the associated VideoQA benchmark further supports interpretable, biomechanically grounded modeling.
major comments (2)
- [Baseline experiments] Baseline experiments section: the manuscript does not state whether train/test splits are subject-disjoint. With only 38 subjects, any subject overlap would allow models to exploit person-specific sEMG signatures, movement idiosyncrasies, or annotation biases rather than learning transferable skill representations, directly undermining the central claim that multimodal and multiview inputs produce generalizable AQA improvements.
- [Dataset construction] Dataset description and Table 1 (or equivalent subject statistics): no breakdown is provided of how the 38 subjects are distributed across skill levels, nor are inter-annotator agreement statistics or error-type coverage reported for the FKG. These omissions make it impossible to assess whether the weakest assumption—that the recordings represent the range of real-world fitness errors—holds.
minor comments (2)
- [Abstract] Abstract: 'Largescale' should be hyphenated as 'Large-scale'.
- [Abstract] The abstract states that baselines 'significantly enhance AQA performance' yet supplies no numerical deltas, error bars, or statistical tests; these quantitative details should appear in the abstract or be clearly cross-referenced to the results tables.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments. These have helped us identify important clarifications needed to strengthen the presentation of our work. We respond to each major comment below and commit to revisions that directly address the concerns raised.
read point-by-point responses
-
Referee: [Baseline experiments] Baseline experiments section: the manuscript does not state whether train/test splits are subject-disjoint. With only 38 subjects, any subject overlap would allow models to exploit person-specific sEMG signatures, movement idiosyncrasies, or annotation biases rather than learning transferable skill representations, directly undermining the central claim that multimodal and multiview inputs produce generalizable AQA improvements.
Authors: We agree that subject-disjoint splits are essential for validating generalizable AQA improvements, especially with a modest cohort size. Our baseline experiments were conducted using subject-disjoint train/test splits to prevent leakage of person-specific patterns. This protocol was followed but not explicitly documented in the section. We will revise the baseline experiments section to clearly state that all reported results use subject-disjoint splits, provide the exact split ratios, and describe the subject partitioning procedure. revision: yes
-
Referee: [Dataset construction] Dataset description and Table 1 (or equivalent subject statistics): no breakdown is provided of how the 38 subjects are distributed across skill levels, nor are inter-annotator agreement statistics or error-type coverage reported for the FKG. These omissions make it impossible to assess whether the weakest assumption—that the recordings represent the range of real-world fitness errors—holds.
Authors: We thank the referee for pointing out these omissions. We will expand the dataset description and update Table 1 to include a breakdown of the 38 subjects by skill level (beginner, intermediate, advanced) as assessed by experts. We will also add inter-annotator agreement statistics (e.g., Cohen's kappa) for the FKG annotations. For error-type coverage, we will include a summary of the error categories and their frequencies in the dataset to better demonstrate representation of real-world fitness errors. revision: yes
Circularity Check
No circularity: empirical dataset and benchmarking paper
full rationale
This paper introduces a new multimodal fitness AQA dataset (FLEX) with recordings, sEMG, 3D pose, and Fitness Knowledge Graph annotations from 38 subjects, then reports baseline experiments on multimodal fusion and VideoQA. No mathematical derivations, equations, or predictions are present that could reduce to fitted parameters or self-defined quantities by construction. The central claims rest on data collection and empirical performance lifts, which are independent of any internal definitions or self-citation chains. No load-bearing steps match the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
invented entities (1)
-
Fitness Knowledge Graph (FKG)
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Baseline experiments demonstrate that multimodal inputs, multiview video, and fine-grained annotations significantly enhance AQA performance.
-
IndisputableMonolith/Foundation/DimensionForcing.leanalexander_duality_circle_linking unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
FLEX contains over 7,500 multiview recordings of 20 weight-loaded exercises performed by 38 subjects
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
ExpertEdit: Learning Skill-Aware Motion Editing from Expert Videos
ExpertEdit edits novice motions to expert skill levels by learning a motion prior from unpaired videos and infilling masked skill-critical spans.
Reference graph
Works this paper leans on
-
[1]
Assessing the quality of actions
Hamed Pirsiavash, Carl V ondrick, and Antonio Torralba. Assessing the quality of actions. In European Conference on Computer Vision, pages 556–571. Springer, 2014. 3, 4, 5, 19
work page 2014
-
[2]
Learning to score olympic events
Paritosh Parmar and Brendan Tran Morris. Learning to score olympic events. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 20–28,
-
[3]
Yixin Gao, S Swaroop Vedula, Carol E Reiley, Narges Ahmidi, Balakrishnan Varadarajan, Henry C Lin, Lingling Tao, Luca Zappella, Benjamın Béjar, and David D Yuh. Jhu-isi gesture and skill assessment working set (jigsaws): A surgical activity dataset for human motion model- ing. In Medical Image Computing and Computer Assisted Intervention Workshop, volume ...
work page 2014
-
[4]
A data set of human body movements for physical rehabilitation exercises
Aleksandar Vakanski, Hyung-pil Jun, David Paul, and Russell Baker. A data set of human body movements for physical rehabilitation exercises. Data, 3(1):2, 2018. 3
work page 2018
-
[5]
Marianna Capecci, Maria Gabriella Ceravolo, Francesco Ferracuti, Sabrina Iarlori, Andrea Monteriu, Luca Romeo, and Federica Verdini. The kimore dataset: Kinematic assessment of movement and clinical scores for remote monitoring of physical rehabilitation. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 27(7):1436–1448, 2019. 3, 5
work page 2019
-
[6]
Domain knowledge-informed self-supervised representations for workout form assessment
Paritosh Parmar, Amol Gharat, and Helge Rhodin. Domain knowledge-informed self-supervised representations for workout form assessment. In European Conference on Computer Vision, pages 105–123. Springer, 2022. 3, 4, 15
work page 2022
-
[7]
Egoexo-fitness: Towards egocentric and exocentric full-body action understanding
Yuan-Ming Li, Wei-Jin Huang, An-Lan Wang, Ling-An Zeng, Jing-Ke Meng, and Wei-Shi Zheng. Egoexo-fitness: Towards egocentric and exocentric full-body action understanding. In European Conference on Computer Vision, 2024. 3, 4, 5, 15, 16
work page 2024
-
[8]
Temporal distance matrices for squat classification
Ryoji Ogata, Edgar Simo-Serra, Satoshi Iizuka, and Hiroshi Ishikawa. Temporal distance matrices for squat classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 0–0, 2019. 3, 4, 15
work page 2019
-
[9]
Assembly101: A large-scale multi-view video dataset for understanding procedural activities
Fadime Sener, Dibyadip Chatterjee, Daniel Shelepov, Kun He, Dipika Singhania, Robert Wang, and Angela Yao. Assembly101: A large-scale multi-view video dataset for understanding procedural activities. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21096–21106, 2022. 3
work page 2022
-
[10]
Gaia: Rethinking action quality assessment for ai-generated videos
Zijian Chen, Wei Sun, Yuan Tian, Jun Jia, Zicheng Zhang, Jiarui Wang, Ru Huang, Xiongkuo Min, Guangtao Zhai, and Wenjun Zhang. Gaia: Rethinking action quality assessment for ai-generated videos. In Advances in Neural Information Processing Systems, 2024. 3
work page 2024
-
[11]
Action quality assessment across multiple actions
Paritosh Parmar and Brendan Morris. Action quality assessment across multiple actions. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 1468–1476. IEEE, 2019. 3
work page 2019
-
[12]
What and how well you performed? a multitask learning approach to action quality assessment
Paritosh Parmar and Brendan Tran Morris. What and how well you performed? a multitask learning approach to action quality assessment. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 304–313, 2019. 3, 5
work page 2019
-
[13]
Learning to score figure skating sport videos
Chengming Xu, Yanwei Fu, Bing Zhang, Zitian Chen, Yu-Gang Jiang, and Xiangyang Xue. Learning to score figure skating sport videos. IEEE Transactions on Circuits and Systems for Video Technology, 30(12):4578–4590, 2019. 3
work page 2019
-
[14]
An asymmetric modeling for action assessment
Jibin Gao, Wei-Shi Zheng, Jia-Hui Pan, Chengying Gao, Yaowei Wang, Wei Zeng, and Jian- huang Lai. An asymmetric modeling for action assessment. In European Conference on Computer Vision, pages 222–238. Springer, 2020. 3
work page 2020
-
[15]
Hybrid dynamic-static context-aware attention network for action assessment in long videos
Ling-An Zeng, Fa-Ting Hong, Wei-Shi Zheng, Qi-Zhi Yu, Wei Zeng, Yao-Wei Wang, and Jian-Huang Lai. Hybrid dynamic-static context-aware attention network for action assessment in long videos. In Proceedings of the ACM International Conference on Multimedia, pages 2526–2534, 2020. 3 10
work page 2020
-
[16]
Tsa-net: Tube self-attention network for action quality assessment
Shunli Wang, Dingkang Yang, Peng Zhai, Chixiao Chen, and Lihua Zhang. Tsa-net: Tube self-attention network for action quality assessment. In Proceedings of the ACM International Conference on Multimedia, pages 4902–4910, 2021. 3
work page 2021
-
[17]
Finediving: A fine-grained dataset for procedure-aware action quality assessment
Jinglin Xu, Yongming Rao, Xumin Yu, Guangyi Chen, Jie Zhou, and Jiwen Lu. Finediving: A fine-grained dataset for procedure-aware action quality assessment. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2949–2958, 2022. 3, 4
work page 2022
-
[18]
Logo: A long-form video dataset for group action quality assessment
Shiyi Zhang, Wenxun Dai, Sujia Wang, Xiangwei Shen, Jiwen Lu, Jie Zhou, and Yansong Tang. Logo: A long-form video dataset for group action quality assessment. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2405–2414, 2023. 3
work page 2023
-
[19]
Localization- assisted uncertainty score disentanglement network for action quality assessment
Yanli Ji, Lingfeng Ye, Huili Huang, Lijing Mao, Yang Zhou, and Lingling Gao. Localization- assisted uncertainty score disentanglement network for action quality assessment. In Proceed- ings of the ACM International Conference on Multimedia, pages 8590–8597, 2023. 3
work page 2023
-
[20]
Automatic modelling for interactive action assessment
Jibin Gao, Jia-Hui Pan, Shao-Jie Zhang, and Wei-Shi Zheng. Automatic modelling for interactive action assessment. International Journal of Computer Vision, 131(3):659–679, 2023. 3
work page 2023
-
[21]
Lucidaction: A hierarchical and multi- model dataset for comprehensive action quality assessment
Linfeng Dong, Wei Wang, Yu Qiao, and Xiao Sun. Lucidaction: A hierarchical and multi- model dataset for comprehensive action quality assessment. In Advances in Neural Information Processing Systems, 2024. 3, 4
work page 2024
-
[22]
Current developments in surface electromyography
Veysel ALCAN and Murat Z˙INNURO ˘GLU. Current developments in surface electromyography. Turkish Journal of Medical Sciences, 53(5):1019–1031, 2023. 3
work page 2023
-
[23]
Yi Zhang, Peiyang Li, Xuyang Zhu, Steven W Su, Qing Guo, Peng Xu, and Dezhong Yao. Extracting time-frequency feature of single-channel vastus medialis emg signals for knee exercise pattern recognition. PloS one, 12(7):e0180526, 2017. 3
work page 2017
-
[24]
Individuals have unique muscle activation signatures as revealed during gait and pedaling
François Hug, Clément V ogel, Kylie Tucker, Sylvain Dorel, Thibault Deschamps, Éric Le Car- pentier, and Lilian Lacourpaille. Individuals have unique muscle activation signatures as revealed during gait and pedaling. Journal of Applied Physiology, 127(4):1165–1174, 2019. 3
work page 2019
-
[25]
Ilaria Mileti, Aurora Serra, Nerses Wolf, Victor Munoz-Martel, Antonis Ekizos, Eduardo Palermo, Adamantios Arampatzis, and Alessandro Santuz. Muscle activation patterns are more constrained and regular in treadmill than in overground human locomotion. Frontiers in Bioengineering and Biotechnology, 8:581619, 2020. 3
work page 2020
-
[26]
A large calibrated database of hand movements and grasps kinematics
Néstor J Jarque-Bou, Manfredo Atzori, and Henning Müller. A large calibrated database of hand movements and grasps kinematics. Scientific data, 7(1):12, 2020. 3
work page 2020
-
[27]
Sex-specific tuning of modular muscle activation patterns for locomotion in young and older adults
Alessandro Santuz, Lars Janshen, Leon Brüll, Victor Munoz-Martel, Juri Taborri, Stefano Rossi, and Adamantios Arampatzis. Sex-specific tuning of modular muscle activation patterns for locomotion in young and older adults. PLoS One, 17(6):e0269417, 2022. 3
work page 2022
-
[28]
semg dataset of routine activities
Asad Mansoor Khan, Sajid Gul Khawaja, Muhammad Usman Akram, and Ali Saeed Khan. semg dataset of routine activities. Data in brief, 33:106543, 2020. 3
work page 2020
-
[29]
Hristo Dimitrov, Anthony M. J. Bull, and Dario Farina. High-density EMG, IMU, kinetic, and kinematic open-source data for comprehensive locomotion activities. Scientific Data, 10(1):1–10, 2023. 3
work page 2023
-
[30]
Raphaël Hamard, Jeroen Aeles, Simon Avrillon, Taylor JM Dick, and François Hug. A comparison of neural control of the biarticular gastrocnemius muscles between knee flexion and ankle plantar flexion. Journal of Applied Physiology, 135(2):394–404, 2023. 3
work page 2023
-
[31]
A wearable real-time kinetic measurement sensor setup for human locomotion
Huawei Wang, Akash Basu, Guillaume Durandau, and Massimo Sartori. A wearable real-time kinetic measurement sensor setup for human locomotion. Wearable technologies, 4:e11, 2023. 3 11
work page 2023
-
[32]
Electromyo- graphy data for non-invasive naturally-controlled robotic hand prostheses
Manfredo Atzori, Arjan Gijsberts, Claudio Castellini, Barbara Caputo, Anne-Gabrielle Mittaz Hager, Simone Elsig, Giorgio Giatsidis, Franco Bassetto, and Henning Müller. Electromyo- graphy data for non-invasive naturally-controlled robotic hand prostheses. Scientific data, 1(1):1–13, 2014. 3
work page 2014
-
[33]
Neuropose: 3d hand pose tracking using emg wearables
Yilin Liu, Shijia Zhang, and Mahanth Gowda. Neuropose: 3d hand pose tracking using emg wearables. In Proceedings of the Web Conference, pages 1471–1482, 2021. 3
work page 2021
-
[34]
Sensing the full dynamics of the human hand with a neural interface and deep learning
Raul C Sîmpetru, Andreas Arkudas, Dominik I Braun, Marius Osswald, Daniela Souza de Oliveira, Bjoern Eskofier, Thomas M Kinfe, and Alessandro Del Vecchio. Sensing the full dynamics of the human hand with a neural interface and deep learning. BioRxiv, pages 2022–07, 2022. 3
work page 2022
-
[35]
Dataset for multi- channel surface electromyography (semg) signals of hand gestures
Mehmet Akif Ozdemir, Deniz Hande Kisa, Onan Guren, and Aydin Akan. Dataset for multi- channel surface electromyography (semg) signals of hand gestures. Data in brief, 41:107921,
-
[36]
emg2pose: A large and diverse benchmark for surface electromyographic hand pose estimation
Sasha Salter, Richard Warren, Collin Schlager, Adrian Spurr, Shangchen Han, Rohin Bhasin, Yujun Cai, Peter Walkington, Anuoluwapo Bolarinwa, Robert J Wang, et al. emg2pose: A large and diverse benchmark for surface electromyographic hand pose estimation. Advances in Neural Information Processing Systems, 37:55703–55728, 2024. 3, 9
work page 2024
- [37]
-
[38]
Fastmove 3d motion for realtime, 2024
FASTMOVE. Fastmove 3d motion for realtime, 2024. 4
work page 2024
- [39]
-
[40]
M.zuiko digital ed 14-150mm f4.0-5.6, 2022
OLYMPUS. M.zuiko digital ed 14-150mm f4.0-5.6, 2022. 5
work page 2022
- [41]
-
[42]
Paritosh Parmar, Jaiden Reddy, and Brendan Morris. Piano skills assessment. In IEEE Interna- tional Workshop on Multimedia Signal Processing, pages 1–5. IEEE, 2021. 5
work page 2021
-
[43]
Flag3d: A 3d fitness activity dataset with language instruction
Yansong Tang, Jinpeng Liu, Aoyang Liu, Bin Yang, Wenxun Dai, Yongming Rao, Jiwen Lu, Jie Zhou, and Xiu Li. Flag3d: A 3d fitness activity dataset with language instruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 22106–22117, 2023. 5
work page 2023
-
[44]
National occupational skill standard — social sports instructor (occupational code: 4-13-04-01)
Ministry of Human Resources and Social Security of the People’s Republic of China and General Administration of Sport of China. National occupational skill standard — social sports instructor (occupational code: 4-13-04-01). Standard, Ministry of Human Resources and Social Security of the People’s Republic of China and General Administration of Sport of C...
work page 2020
-
[45]
Human Resources Development Center of the General Administration of Sport of China. Occupational Competency Training Textbook for Social Sports Instructors—Fitness Coaches (with Technical Action Videos). Higher Education Press, 2023. 6
work page 2023
-
[46]
Fitness and Bodybuilding Tutorial
Beijing Sport University. Fitness and Bodybuilding Tutorial. Beijing Sport University Press,
-
[47]
Joe Weider’s Bodybuilding System
Joe Weider. Joe Weider’s Bodybuilding System. Weider Pubns, 1998. 6
work page 1998
-
[48]
Group-aware contrastive regression for action quality assessment
Xumin Yu, Yongming Rao, Wenliang Zhao, Jiwen Lu, and Jie Zhou. Group-aware contrastive regression for action quality assessment. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7919–7928, 2021. 7, 8, 19, 20
work page 2021
-
[49]
Spatial temporal graph convolutional networks for skeleton-based action recognition
Sijie Yan, Yuanjun Xiong, and Dahua Lin. Spatial temporal graph convolutional networks for skeleton-based action recognition. In Proceedings of the AAAI conference on artificial intelligence, volume 32, 2018. 8 12
work page 2018
-
[50]
Ntu rgb+ d: A large scale dataset for 3d human activity analysis
Amir Shahroudy, Jun Liu, Tian-Tsong Ng, and Gang Wang. Ntu rgb+ d: A large scale dataset for 3d human activity analysis. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1010–1019, 2016. 8
work page 2016
-
[51]
Shuai Bai, Keqin Chen, Xuejing Liu, Jialin Wang, Wenbin Ge, Sibo Song, Kai Dang, Peng Wang, Shijie Wang, Jun Tang, et al. Qwen2. 5-vl technical report. arXiv preprint arXiv:2502.13923,
work page internal anchor Pith review Pith/arXiv arXiv
-
[52]
Hao Yin, Paritosh Parmar, Daoliang Xu, Yang Zhang, Tianyou Zheng, and Weiwei Fu. A decade of action quality assessment: Largest systematic survey of trends, challenges, and future directions. arXiv, 2025. 15
work page 2025
-
[53]
Quo vadis, action recognition? a new model and the kinetics dataset
Joao Carreira and Andrew Zisserman. Quo vadis, action recognition? a new model and the kinetics dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6299–6308, 2017. 19
work page 2017
-
[54]
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Yaowei Zheng, Richong Zhang, Junhao Zhang, Yanhan Ye, Zheyan Luo, Zhangchi Feng, and Yongqiang Ma. Llamafactory: Unified efficient fine-tuning of 100+ language models. arXiv preprint arXiv:2403.13372, 2024. 20 13 Checklist
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[55]
For all authors... (a) Do the main claims made in the abstract and introduction accurately reflect the paper’s contributions and scope? [Yes] Please refer to section 1. (b) Did you describe the limitations of your work? [Yes] Please refer to section 5. (c) Did you discuss any potential negative societal impacts of your work? [Yes] Please refer to the Appe...
-
[56]
If you are including theoretical results... (a) Did you state the full set of assumptions of all theoretical results? [NA] (b) Did you include complete proofs of all theoretical results? [NA]
-
[57]
If you ran experiments (e.g. for benchmarks)... (a) Did you include the code, data, and instructions needed to reproduce the main experi- mental results (either in the supplemental material or as a URL)? [Yes] Please refer to the Appendix. (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] Please...
-
[58]
(a) If your work uses existing assets, did you cite the creators? [Yes] Please refer to section 4
If you are using existing assets (e.g., code, data, models) or curating/releasing new assets... (a) If your work uses existing assets, did you cite the creators? [Yes] Please refer to section 4. (b) Did you mention the license of the assets? [Yes] Please refer to the Appendix. (c) Did you include any new assets either in the supplemental material or as a ...
-
[59]
If you used crowdsourcing or conducted research with human subjects... (a) Did you include the full text of instructions given to participants and screenshots, if applicable? [Yes] Please refer to the Appendix. (b) Did you describe any potential participant risks, with links to Institutional Review Board (IRB) approvals, if applicable? [Yes] Please refer ...
-
[60]
Kneeling Push-ups 2. Push-ups 1. Pectoralis major 2. Anterior deltoid3. Kneeling Torso Twist 4. Knee Raise + Abs Contract3. Triceps brachii 4. External obliques5. Shoulder Bridge 6. Sit-ups 5. Internal obliques 6. Rectus abdominis7. Leg Reverse Lunge 8. Leg Lunge with Knee Lift7. Iliopsoas 8. Gluteus maximus9. Sumo Squat 10. Jumping Jacks 9. Hamstrings 10...
-
[61]
Standing Barbell Overhead Press2. Barbell Bicep Curl 1. Pectoralis major 2. Anterior deltoid3. Barbell Upright Row 4. Dumbbell Front Raise 3. Middle deltoid 4. Posterior deltoid5. Dumbbell Bicep Curl 6. Dumbbell Lateral Raise 5. Triceps brachii 6. Biceps brachii7. Bent-Over Dumbbell Reverse Fly8. Flat Barbell Bench Press7. Brachialis 8. Supraspinatus9. In...
-
[62]
action recognition → action standards → action evaluation → action scoring,
We also report the top 20 most frequent error types. Additionally, we provide the weight loads each subject uses for each action Figure 8. A01A02A03A04A05A06A07A08A09A10A11A12A13A14A15A16A17A18A19A20 350 355 360 365 370 375 380Sample Number Action 150 200 250 300 350 400 A01A02A03A04A05A06A07A08A09A10A11A12A13A14A15A16A17A18A19A20 40 50 60 70 80 DurationS...
work page 2000
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.