A Comprehensive Survey of Action Quality Assessment: Method and Benchmark
Pith reviewed 2026-05-23 06:52 UTC · model grok-4.3
The pith
A modality-driven hierarchical taxonomy organizes AQA methods by input type while a unified benchmark standardizes comparisons for video-based approaches.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Existing AQA studies rely on heterogeneous datasets and evaluation settings that make systematic comparisons across methods difficult. The survey proposes a modality-driven hierarchical taxonomy that organizes methods into video-based, skeleton-based, and multi-modal approaches and analyzes the evolution of representative models. It further establishes a unified benchmark that integrates diverse datasets and applies standardized evaluation protocols to representative video-based AQA methods, allowing consistent comparison on accuracy and computational efficiency. The paper then examines emerging trends, identifies key challenges, and outlines future directions ranging from near-term methodological advances to longer-term opportunities enabled by emerging AI paradigms.
What carries the argument
The modality-driven hierarchical taxonomy that classifies AQA methods according to input modality, together with the unified benchmark that combines datasets and protocols for video-based methods.
If this is right
- Video-based AQA methods can be compared directly on both accuracy and efficiency using the shared protocols.
- Methodological changes across video, skeleton, and multi-modal categories become easier to track over time.
- Key challenges in current AQA work are listed for focused attention in follow-on studies.
- Future research directions are separated into near-term modeling advances and longer-term uses of new AI paradigms.
Where Pith is reading between the lines
- The taxonomy could be updated to include emerging input types such as depth or wearable sensor streams.
- The benchmark protocols might be reused or adapted to create similar standardized tests for skeleton-based or multi-modal methods.
- Efficiency results from the benchmark could inform deployment choices in real-time applications like coaching tools.
- The organization of methods by modality may reveal under-explored combinations that future work could test.
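As a rough illustration of how efficiency results could inform deployment choices, the sketch below filters a set of methods down to those not dominated on both accuracy and cost. It is a minimal sketch under stated assumptions: the function name `pareto_front`, the tuple layout, and the idea of a single scalar "cost" (for example latency or FLOPs) are hypothetical and are not taken from the paper's benchmark.

```python
def pareto_front(methods):
    """Return names of methods that no other method dominates.

    methods: list of (name, accuracy, cost) tuples (hypothetical layout).
    Higher accuracy is better, lower cost is better.
    A method is dominated if another one is at least as good on both
    axes and strictly better on at least one.
    """
    front = []
    for name, acc, cost in methods:
        dominated = any(
            (a >= acc and c <= cost) and (a > acc or c < cost)
            for _, a, c in methods
        )
        if not dominated:
            front.append(name)
    return front


if __name__ == "__main__":
    # Placeholder values for illustration only, not benchmark results.
    candidates = [
        ("method_a", 0.9, 10.0),
        ("method_b", 0.8, 5.0),
        ("method_c", 0.7, 8.0),
    ]
    print(pareto_front(candidates))
```

A real-time coaching tool would then choose among the surviving methods according to its own latency budget; the filter only removes options that are worse on both axes.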
Load-bearing premise
The chosen representative methods and datasets, when placed under standardized protocols, still support valid cross-method comparisons despite differences among the original datasets.
What would settle it
Re-running the benchmark protocols on a fresh collection of datasets or with alternate evaluation metrics produces substantially reordered accuracy or efficiency rankings among the same methods.
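One way to quantify "substantially reordered" rankings is a rank correlation between the method scores obtained under two evaluation settings. The sketch below computes Spearman's rank correlation in plain Python; the helper names `ranks` and `spearman` and the idea of comparing exactly two settings are assumptions made for illustration, and the paper does not state which stability measure, if any, it uses.

```python
def ranks(values):
    """Return 1-based ranks (highest value gets rank 1); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one ranking is constant
    return cov / (vx * vy) ** 0.5


if __name__ == "__main__":
    # Placeholder per-method scores under two settings, for illustration only.
    original_setting = [0.91, 0.88, 0.85, 0.80]
    fresh_setting = [0.70, 0.74, 0.69, 0.60]
    print(spearman(original_setting, fresh_setting))
```

A value near 1 would indicate that the method ordering survives the change of datasets or metrics, while a value near 0 or below would indicate the kind of reordering described above.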
Original abstract
Action Quality Assessment (AQA) aims to automatically evaluate how well human actions are performed and has been widely applied in sports analysis, skill assessment, and healthcare. However, AQA studies are often developed under heterogeneous datasets and evaluation settings, making systematic comparison across methods difficult. To address these challenges, we present a comprehensive survey of recent advances in AQA. In particular, we propose a modality-driven hierarchical taxonomy that organizes existing methods into video-based, skeleton-based, and multi-modal approaches, and analyze the methodological evolution of representative models. We further establish a unified benchmark for representative video-based AQA methods by integrating diverse datasets and standardized evaluation protocols, enabling consistent comparison in terms of both accuracy and computational efficiency. Finally, we analyze emerging research trends, identify key challenges in current AQA research, and outline future directions ranging from near-term methodological advances to longer-term opportunities enabled by emerging AI paradigms. The project web page can be found at https://ZhouKanglei.github.io/AQA-Survey.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper surveys recent advances in Action Quality Assessment (AQA), proposing a modality-driven hierarchical taxonomy that classifies methods into video-based, skeleton-based, and multi-modal categories while analyzing their methodological evolution. It further establishes a unified benchmark for representative video-based AQA methods through integration of diverse datasets under standardized evaluation protocols, enabling comparisons on accuracy and efficiency, and concludes with analysis of trends, challenges, and future directions.
Significance. If the taxonomy provides a clear organizing framework and the benchmark delivers reproducible, valid cross-method comparisons, the work would offer a useful reference point for a fragmented research area, potentially reducing redundant experimentation and highlighting efficiency-accuracy trade-offs in AQA.
major comments (2)
- [Abstract] Abstract and benchmark description: the claim that standardized protocols enable 'consistent comparison' across heterogeneous datasets is load-bearing for the central benchmark contribution, yet the manuscript provides no explicit description of normalization for differing scoring scales (absolute vs. relative) or domain shifts (sports vs. healthcare), leaving open the possibility that reported rankings reflect unification artifacts rather than intrinsic method properties.
- [Benchmark section] Benchmark integration section: without reported per-dataset score rescaling, subset selection criteria, or domain-adaptation checks, the unified evaluation protocol risks invalidating cross-dataset accuracy and efficiency comparisons; this directly affects the validity of the 'representative' method rankings presented.
minor comments (2)
- [Abstract] The project webpage URL is given but no details on whether benchmark code or dataset splits are released, which would strengthen reproducibility claims.
- [Taxonomy section] Taxonomy figure or table would benefit from explicit inclusion criteria for methods to avoid selection bias in the hierarchical organization.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which correctly identify areas where the benchmark contribution requires greater transparency. We will revise the manuscript to supply the missing methodological details on normalization and protocol standardization.
Point-by-point responses
- Referee: [Abstract] Abstract and benchmark description: the claim that standardized protocols enable 'consistent comparison' across heterogeneous datasets is load-bearing for the central benchmark contribution, yet the manuscript provides no explicit description of normalization for differing scoring scales (absolute vs. relative) or domain shifts (sports vs. healthcare), leaving open the possibility that reported rankings reflect unification artifacts rather than intrinsic method properties.
Authors: We agree that the abstract and benchmark description would be strengthened by explicit statements on these points. In revision we will add a concise paragraph to the abstract and a dedicated methods subsection that describes: (i) the score normalization applied to each dataset (min-max to [0,1] for absolute scores and rank-based conversion for relative scores), (ii) the criteria used to select comparable action subsets across sports and healthcare domains, and (iii) the absence of explicit domain-adaptation modules together with the rationale that cross-dataset comparison is performed only after per-dataset standardization. These additions will make the unification process reproducible and will allow readers to assess whether rankings reflect method properties.
revision: yes
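The two normalizations named in this response can be sketched as follows. This is only one plausible reading: the function names, the use of the observed minimum and maximum (rather than a dataset's nominal score range), and the tie handling in the rank-based conversion are assumptions, since the rebuttal does not give formulas.

```python
def min_max_normalize(scores):
    """Rescale absolute scores to [0, 1] using the observed min and max.

    Assumption: a constant score list maps to all zeros.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def rank_normalize(scores):
    """Map relative scores to [0, 1] by rank: lowest -> 0, highest -> 1.

    Tied scores share the average of their 0-based ranks (an assumption).
    """
    n = len(scores)
    if n == 1:
        return [0.0]
    order = sorted(range(n), key=lambda i: scores[i])
    avg_rank = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied scores starting at position i.
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared = (i + j) / 2
        for k in range(i, j + 1):
            avg_rank[order[k]] = shared
        i = j + 1
    return [r / (n - 1) for r in avg_rank]


if __name__ == "__main__":
    # Placeholder scores for illustration only.
    print(min_max_normalize([0, 50, 100]))
    print(rank_normalize([10, 30, 20]))
```

In practice the normalization statistics would presumably be fitted on each dataset's training split and reused at test time, so that the rescaling does not leak test information; that choice is also an assumption here.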
- Referee: [Benchmark section] Benchmark integration section: without reported per-dataset score rescaling, subset selection criteria, or domain-adaptation checks, the unified evaluation protocol risks invalidating cross-dataset accuracy and efficiency comparisons; this directly affects the validity of the 'representative' method rankings presented.
Authors: The referee correctly notes that the current text does not report these implementation details. We will expand the benchmark integration section with: (1) the exact rescaling formulas and code-level implementation for each dataset, (2) the subset selection rules (e.g., action classes present in at least three datasets, minimum sample size thresholds), and (3) a short discussion of domain shift mitigation (or its absence) together with any post-hoc checks performed. If certain normalizations prove infeasible for particular datasets, we will state the limitation and qualify the corresponding rankings accordingly.
revision: yes
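The subset selection rule given as an example in this response (action classes present in at least three datasets, subject to a minimum sample size) could look roughly like the sketch below. The function name `select_shared_classes`, the nested-dictionary input layout, and the default `min_samples=50` are hypothetical; only the "at least three datasets" figure comes from the response itself.

```python
from collections import defaultdict


def select_shared_classes(class_counts, min_datasets=3, min_samples=50):
    """Pick action classes with enough samples in enough datasets.

    class_counts: {dataset_name: {action_class: n_samples}} (hypothetical layout).
    min_samples is an illustrative threshold, not a value from the paper.
    Returns {action_class: sorted list of datasets that support it}.
    """
    support = defaultdict(list)
    for dataset, counts in class_counts.items():
        for action, n in counts.items():
            # A dataset supports a class only if it has enough samples of it.
            if n >= min_samples:
                support[action].append(dataset)
    return {
        action: sorted(datasets)
        for action, datasets in support.items()
        if len(datasets) >= min_datasets
    }


if __name__ == "__main__":
    # Placeholder dataset names and counts, for illustration only.
    counts = {
        "dataset_a": {"diving": 300, "vault": 40},
        "dataset_b": {"diving": 120, "vault": 200},
        "dataset_c": {"diving": 80},
    }
    print(select_shared_classes(counts))
```

Reporting the resulting class-to-dataset mapping alongside the rankings would let readers see exactly which subsets each cross-dataset comparison rests on.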
Circularity Check
Survey paper with no derivation chain exhibits no circularity
full rationale
This is a literature survey that organizes existing AQA methods into a modality-driven taxonomy and re-evaluates representative video-based methods on integrated datasets under standardized protocols. No equations, fitted parameters, predictions, or uniqueness theorems appear in the manuscript. All claims rest on citations to external prior work rather than any self-contained derivation that reduces to the paper's own inputs by construction.
Forward citations
Cited by 1 Pith paper
- Parameter-Efficient Multi-View Proficiency Estimation: From Discriminative Classification to Generative Feedback
SkillFormer, PATS, and ProfVLM deliver state-of-the-art multi-view proficiency estimation on Ego-Exo4D with up to 20x fewer parameters by combining selective fusion, dense sampling, and generative feedback.