A Machine Learning Framework for Real-Time Personalized Ergonomic Pose Analysis

Bruno Simoes; Julen Balzategui; Manex Atxa

arxiv: 2606.12988 · v1 · pith:PDBYPRZ4new · submitted 2026-06-11 · 💻 cs.CV · cs.AI

Pith reviewed 2026-06-27 07:18 UTC · model grok-4.3

keywords: ergonomic pose analysis · real-time inference · 3D point clouds · RGB-D cameras · personalized deep learning · volumetric video · pose estimation · workplace monitoring

The pith

A framework trains a personalized classifier exclusively on user-selected 3D poses to enable real-time ergonomic inference from RGB-D camera streams.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes a methodology for predicting ergonomic and non-ergonomic poses in real time from three-dimensional point cloud data. The method addresses the restriction of fixed camera viewpoints by processing volumetric video that supports multiple angles and handles occlusions. Training occurs only on poses that the user manually chooses and labels, while the resulting deep learning model automatically analyzes new streaming data. Such a system supports practical ergonomic evaluations in work settings and can extend to other real-time posture analysis needs. The case study with load-lifting tasks illustrates the full pipeline from capture to inference.

Core claim

The paper claims that combining state-of-the-art 3D data processing with a deep learning classifier, trained solely on user-manually-selected poses from RGB-D captured data, allows continuous automatic pose inference on live streaming inputs for ergonomic assessment, overcoming the data limitations of traditional fixed-view cameras.

What carries the argument

The personalized deep learning classifier, trained exclusively on manually selected and labeled poses from 3D volumetric video, which then infers poses automatically on real-time streams.
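
The two phases described above can be sketched in a few lines. This is only an illustration under stated assumptions: the paper reports no architecture, so a nearest-centroid classifier stands in for the personalized deep learning model, and each pose is assumed to be a flat tuple of joint-derived features. The names `train_personalized` and `infer_stream`, and the toy angle features, are hypothetical.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def train_personalized(selected_poses):
    """Fit one centroid per label from user-selected (features, label) pairs.

    Stand-in for the paper's deep learning classifier, whose architecture
    is not reported. Only manually selected poses are used for training.
    """
    sums, counts = {}, {}
    for features, label in selected_poses:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def infer_stream(centroids, frames):
    """Yield the label of the nearest centroid for each incoming frame."""
    for features in frames:
        yield min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical toy features: (back angle, knee angle) in degrees.
labeled = [
    ((10.0, 90.0), "ergonomic"),
    ((15.0, 100.0), "ergonomic"),
    ((70.0, 170.0), "non-ergonomic"),
    ((80.0, 160.0), "non-ergonomic"),
]
model = train_personalized(labeled)
predictions = list(infer_stream(model, [(12.0, 95.0), (75.0, 165.0)]))
```

Because `infer_stream` is a generator, it can consume frames one at a time as they arrive, which mirrors the separation between an offline training phase and continuous inference.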

If this is right

The system performs real-time skeletal labeling on subjects during load-lifting tasks.
Multi-angle analysis from 3D point clouds mitigates issues with occlusions and fixed viewpoints.
Traditional 2D pose estimation algorithms integrate with 3D technologies for scalable workplace monitoring.
The method adapts to other applications needing real-time human posture analysis.
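
How a point cloud supports analysis from more than one angle can be illustrated with a small projection sketch. It assumes a pinhole camera model with a made-up focal length and a virtual camera that orbits the subject about the vertical (y) axis. The paper does not specify its projection method, so `project_from_view` and its parameters are illustrative assumptions only.

```python
import math


def project_from_view(points, yaw_deg, camera_distance=3.0, focal=500.0):
    """Project 3D points (x, y, z) to 2D image coordinates from a virtual viewpoint.

    The cloud is rotated by yaw_deg about the y axis, shifted camera_distance
    along +z in front of a pinhole camera at the origin, then perspective
    divided. Points at or behind the camera plane are skipped.
    """
    yaw = math.radians(yaw_deg)
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    pixels = []
    for x, y, z in points:
        xr = cos_y * x + sin_y * z
        zr = -sin_y * x + cos_y * z + camera_distance
        if zr <= 0:
            continue
        pixels.append((focal * xr / zr, focal * y / zr))
    return pixels


# Two hypothetical points: one above the origin, one offset sideways.
cloud = [(0.0, 1.0, 0.0), (0.5, 0.0, 0.0)]
front = project_from_view(cloud, yaw_deg=0)
side = project_from_view(cloud, yaw_deg=90)
```

In the front view the sideways offset is visible as a horizontal displacement; in the 90-degree view the same point collapses onto the image center. This is the kind of viewpoint-dependent information loss that choosing the perspective after capture is meant to avoid.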

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Minimal manual labeling focused on representative poses may suffice for effective personalization across users.
Such frameworks could support proactive health interventions by flagging non-ergonomic poses during actual work activities.
Testing on diverse body types or task variations would reveal the limits of the training approach.

Load-bearing premise

Poses that users manually select and label in a training phase will enable the classifier to make accurate real-time ergonomic inferences on new unlabeled 3D streaming data.

What would settle it

An experiment comparing the classifier's real-time predictions on fresh RGB-D streams against ground-truth labels obtained independently, where low agreement would indicate the training method does not generalize.
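
A minimal sketch of how such an agreement check might be scored, assuming per-frame binary labels and treating "non-ergonomic" as the positive class. The function name `binary_agreement` and the example labels are hypothetical; the paper reports no such evaluation.

```python
def binary_agreement(predicted, ground_truth, positive="non-ergonomic"):
    """Return confusion counts plus accuracy, precision, recall and F1."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("label sequences must be non-empty and equally long")
    tp = fp = fn = tn = 0
    for pred, truth in zip(predicted, ground_truth):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "accuracy": (tp + tn) / len(predicted),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels for six frames of a fresh stream (N = non-ergonomic).
N, E = "non-ergonomic", "ergonomic"
truth = [N, N, E, E, N, E]
predicted = [N, E, E, E, N, N]
report = binary_agreement(predicted, truth)
```

Reporting the confusion counts alongside the summary scores matters here because a missed non-ergonomic pose and a false alarm carry different costs in a workplace setting.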

Figures

Figures reproduced from arXiv: 2606.12988 by Bruno Simoes, Julen Balzategui, Manex Atxa.

Original abstract

This paper introduces a new methodology for real-time prediction of ergonomic and non-ergonomic human poses using volumetric video data in three dimensions. Although the methodology was designed for ergonomic assessments, it can be adapted to other applications requiring real-time analysis of human posture. One aspect that makes this system stand out is its ability to analyze 3D point clouds during the assessment, enabling computation from multiple angles. This overcomes a critical limitation of cameras, which often provide a fixed viewpoint, thereby restricting the data available for a thorough postural evaluation, especially when occlusions occur. The system continuously and automatically performs pose inference using the chosen perspective on the real-time streaming data; however, only the poses manually selected and labeled by the user are used to train the personalized deep learning classifier. The methodology has been refined through a case study in which RGB-D cameras captured subjects performing load-lifting tasks, enabling real-time skeletal labeling. The model was trained on this data and, following the training phase, performs inference on new streaming data in real time. This research offers a scalable and pragmatic approach for real-time ergonomic evaluation by combining state-of-the-art 3D data technologies and traditional 2D pose estimation algorithms. It addresses the increasing need for safety and health monitoring in workplace environments, marking a notable contribution to the domain.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper sketches a 3D RGB-D ergonomic monitoring pipeline with user-labeled personalization but reports no metrics, validation, or test results at all.

This paper describes a system for real-time ergonomic pose classification from volumetric RGB-D data. The core idea is a two-stage process: a user manually picks and labels some poses to train a personalized deep learning classifier, after which the system runs inference on new streaming point clouds.

What is actually new is the application focus on load-lifting tasks with explicit handling of multiple viewpoints to reduce occlusion problems that plague single-camera setups. The case study framing around workplace safety monitoring is a reasonable practical angle.

The paper does a clear job laying out why 3D point clouds help and how the training and inference phases are separated.

The soft spot is the total lack of evidence. The text gives no architecture details, no count of labeled examples, no accuracy or F1 numbers, no held-out streaming test sequences, and no comparison to baselines. The central claim that the manually trained classifier will label new data accurately therefore sits as an untested assumption.

This is for applied ergonomics or industrial safety people who want a high-level system sketch to adapt. Readers looking for validated methods or reproducible results will get almost nothing from it.

I would not send this to peer review. It needs a results section with concrete numbers before any referee should spend time on it.

Referee Report

2 major / 1 minor

Summary. The paper introduces a methodology for real-time prediction of ergonomic and non-ergonomic human poses using volumetric 3D point cloud data from RGB-D cameras. It describes a personalized deep learning classifier trained exclusively on user-manually selected and labeled poses from a load-lifting case study; after training, the system performs automatic inference on new streaming 3D data from multiple viewpoints to overcome fixed-camera occlusions. The approach combines 3D technologies with traditional 2D pose estimation for workplace safety monitoring.

Significance. If the central generalization claim were supported by quantitative evidence, the framework could provide a pragmatic, scalable tool for personalized real-time ergonomic assessment that handles viewpoint limitations better than 2D methods. This would address a practical need in occupational health monitoring.

major comments (2)

[Abstract] Abstract and case-study description: no quantitative results, error metrics (accuracy, F1, confusion matrix), validation procedure, dataset size, held-out streaming sequences, architecture details, feature extraction method from volumetric data, or loss function are reported. This leaves the claim that the classifier produces accurate ergonomic labels on new unlabeled streaming point clouds as an untested assumption.
[Case study] Case study: the text states that 'only the poses manually selected and labeled by the user are used to train' and that 'following the training phase, performs inference on new streaming data in real time,' yet supplies zero performance numbers or cross-validation results on continuous sequences. This is load-bearing for the central claim of real-time personalized inference.
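
One way the held-out evaluation requested in these comments might be set up is to split by recording sequence rather than by frame, so that near-identical neighbouring frames from one stream never land on both sides of the split. The sketch below assumes each frame record carries a hypothetical `sequence_id` field; the paper describes no such split.

```python
import random


def split_by_sequence(frames, test_fraction=0.25, seed=0):
    """Split frame records into train/test so no sequence spans both sets."""
    sequence_ids = sorted({frame["sequence_id"] for frame in frames})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(sequence_ids)
    n_test = max(1, round(len(sequence_ids) * test_fraction))
    test_ids = set(sequence_ids[:n_test])
    train = [f for f in frames if f["sequence_id"] not in test_ids]
    test = [f for f in frames if f["sequence_id"] in test_ids]
    return train, test


# Made-up data: 4 recorded sequences of 5 labeled frames each.
frames = [
    {"sequence_id": s, "frame": i, "label": "ergonomic" if i % 2 == 0 else "non-ergonomic"}
    for s in range(4)
    for i in range(5)
]
train, test = split_by_sequence(frames)
```

With only a single recorded sequence this split leaves the training set empty, which is itself a signal that more than one recording session is needed before generalization can be assessed.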

minor comments (1)

[Abstract] Abstract sentence 'The system continuously and automatically performs pose inference using the chosen perspective on the real-time streaming data; however, only the poses manually selected...' is awkwardly phrased and could be clarified.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and constructive feedback. We agree that the current manuscript is primarily a methodological description and lacks the quantitative validation needed to support claims of accurate real-time inference. We will revise the paper to address these gaps.

Referee: [Abstract] Abstract and case-study description: no quantitative results, error metrics (accuracy, F1, confusion matrix), validation procedure, dataset size, held-out streaming sequences, architecture details, feature extraction method from volumetric data, or loss function are reported. This leaves the claim that the classifier produces accurate ergonomic labels on new unlabeled streaming point clouds as an untested assumption.

Authors: We agree that the abstract and case-study description omit all quantitative results, metrics, validation details, dataset sizes, architecture specifications, feature extraction methods, and loss functions. This is a substantive omission that leaves performance claims unverified. In the revised manuscript we will add these elements, including accuracy, F1 scores, confusion matrices, cross-validation procedures on held-out streaming sequences, model architecture, volumetric feature extraction approach, and training loss, drawn from the load-lifting case study.

revision: yes

Referee: [Case study] Case study: the text states that 'only the poses manually selected and labeled by the user are used to train' and that 'following the training phase, performs inference on new streaming data in real time,' yet supplies zero performance numbers or cross-validation results on continuous sequences. This is load-bearing for the central claim of real-time personalized inference.

Authors: We concur that the case-study section provides no performance numbers or cross-validation results on continuous sequences, which is essential to substantiate the real-time personalized inference claim. The revised version will incorporate these quantitative results, including metrics on held-out streaming data, to demonstrate the classifier's behavior after training on user-labeled poses.

revision: yes

Circularity Check

0 steps flagged

No circularity: high-level system description with no derivations or fitted parameters

The paper is a conceptual methodology overview for a real-time ergonomic pose analysis system using RGB-D data and a personalized deep learning classifier. It describes a training phase on user-selected labeled poses followed by inference on streaming data, but supplies no equations, derivations, model architectures, loss functions, or quantitative results. No load-bearing steps reduce to self-definition, fitted inputs renamed as predictions, or self-citation chains. The central claim rests on an unverified generalization assumption, which is a validation gap rather than circularity. The derivation chain is self-contained as a high-level architecture sketch with no mathematical content that could be circular.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no information on free parameters, background axioms, or new postulated entities.

Reference graph

Works this paper leans on

33 extracted references · 21 canonical work pages · 3 internal anchors

[1]

INTRODUCTION. The application of artificial intelligence (AI) technologies in ergonomic assessments has substantially improved the ability to assess and manage workplace safety concerns. AI technology and sophisticated ergonomic assessment techniques have yet to become established in most industry sectors and organizational structures. Current techniques ...
[2]

LITERATURE. Several studies have explored the applicability of artificial intelligence within the field of ergonomics, with some reviews covering its application for safety within the workplace, human behavior analytics, and injury forecasting [5]. AI methodologies own capabilities such as handling big data datasets [6], observing human postures and move...
[3]

that apply Transformer network architecture [18] and GCN (Graph Convolutional Network) based models [19]. All these models achieved encouraging results for human pose estimation. Nevertheless, today’s algorithms for 2D detection do not handle point clouds or 3D information as input. Models using three-dimensional inputs operate on depth-enhanced data in t...
[4]

are well-optimized and trained on large, diverse datasets such as COCO [22]. Consequently, the presented application addresses the previously mentioned drawbacks. Firstly, it allows 3D data as input, enabling the handling of an interactive point cloud with infinite view perspectives. Additionally, it projects the 3D data onto a 2D space, enabling the appl...
[5]

SYSTEM ARCHITECTURE. This section describes the two-part system which begins with inputting point cloud data and culminates with obtaining classified skeletal structures. The first component involves training a deep learning model for pose classification, while the second component details the data processing steps for model inference. Figure 1: Two-Part...
[6]

USE CASE. A specific use case was created to assess the functioning and effectiveness of the system. The aim was to categorize the poses of workers who were load lifting as either ergonomic or non-ergonomic. In achieving this aim, the previously described architecture was adopted for the given scenario devised for the experiment. First, a custom datase...
[7]

RESULTS. As a result, we achieved real-time processing of new point clouds, displaying the input point cloud projected in two dimensions, the detected skeleton in both 2D and 3D (reconstructed using the capabilities of MMPose), and the predicted label, as shown in Figure 7. Figure 7: Ergonomic and non-ergonomic pose inferences. Within this study, time to...
[8]

CONCLUSION. This paper describes an advanced machine learning system which combines volumetric video data, ergonomic pose analysis, and two-dimensional pose detection in real time. The system employs point clouds to capture and encode 3D spatial representations of objects within an environment. When integrated with robust computer vision methods such as ...
[9]

Diego-Mas, J. A., Alcaide-Marzal, J., & Poveda-Bautista, R. (2017). Errors using observational methods for ergonomics assessment in real practice. Human Factors, 59(8), 1173–1187. https://doi.org/10.1177/0018720817723496
[10]

Priyanka, M., & Subashini, R. (2024). Does artificial intelligence mediate between ergonomics and the drivers of ergonomics innovations – an empirical evidence. International Research Journal of Multidisciplinary Scope, 5(2), 162–174. https://doi.org/10.47857/irjms.2024.v05i02.0398
[11]

Wang, Q. (2019). Automatic checks from 3D point cloud data for safety regulation compliance for scaffold work platforms. Automation in Construction, 104, 38–51. https://doi.org/10.1016/j.autcon.2019.04.008
[12]

Rodrigues, P. B., Xiao, Y., Fukumura, Y. E., Awada, M., Aryal, A., Becerik-Gerber, B., Lucas, G., & Roll, S. C. (2022). Ergonomic assessment of office worker postures using 3D automated joint angle assessment. Advanced Engineering Informatics, 52, 101596. https://doi.org/10.1016/j.aei.2022.101596
[13]

Petrat, D. (2021). Artificial intelligence in human factors and ergonomics: An overview of the current state of research. Discover Artificial Intelligence, 1(3). https://doi.org/10.1007/s44163-021-00001-5
[14]

Rahmani, A. M., Azhir, E., Ali, S., Mohammadi, M., Ahmed, O. H., Ghafour, M. Y., Ahmed, S. H., & Hosseinzadeh, M. (2021). Artificial intelligence approaches and mechanisms for big data analytics: A systematic study. PeerJ Computer Science, 7, e488. https://doi.org/10.7717/peerj-cs.488
[15]

Hamilton, B. C., Dairywala, M. I., Highet, A., Nguyen, T. C., O'Sullivan, P., Chern, H., & Soriano, I. S. (2023). Artificial intelligence based real-time video ergonomic assessment and training improves resident ergonomics. American Journal of Surgery, 226(5), 741–746. https://doi.org/10.1016/j.amjsurg.2023.07.028
[16]

Mudiyanselage, S. E., Nguyen, P. H. D., Rajabi, M. S., & Akhavian, R. (2021). Automated Workers’ Ergonomic Risk Assessment in Manual Material Handling Using sEMG Wearable Sensors and Machine Learning. Electronics, 10(20), 2558. https://doi.org/10.3390/electronics10202558
[17]

Karypidis, E., Mouslech, S. G., Skoulariki, K., & Gazis, A. (2022). Comparison analysis of traditional machine learning and deep learning techniques for data and image classification. WSEAS Transactions on Mathematics, 21, 122–130. https://doi.org/10.37394/23206.2022.21.19
[18]

Ioannidou, A., Chatzilari, E., Nikolopoulos, S., & Kompatsiaris, I. (2018). Deep learning advances in computer vision with 3D data: A survey. ACM Computing Surveys, 50(2), 20. https://doi.org/10.1145/3042064
[19]

OpenMMLab Team. (2020). MMPose: OpenMMLab Pose Estimation Toolbox and Benchmark. https://github.com/open-mmlab/mmpose
[20]

OpenMMLab Team. (2025). Benchmark - MMPose 1.3.2 documentation. Available at: https://mmpose.readthedocs.io/en/latest/notes/benchmark.html
[21]

Jhuang, H., Gall, J., Zuffi, S., Schmid, C., & Black, M. J. (2013). Towards understanding action recognition. In Proceedings of the IEEE International Conference on Computer Vision (pp. 3192–3199). IEEE. https://doi.org/10.1109/ICCV.2013.396
[22]

Li, J., Wang, C., Zhu, H., Mao, Y., Fang, H.-S., & Lu, C. (2019). CrowdPose: Efficient crowded scenes pose estimation and a new benchmark. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10863–10872). IEEE. https://doi.org/10.1109/CVPR.2019.01113
[23]

Toshev, A., & Szegedy, C. (2014). DeepPose: Human pose estimation via deep neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1653–1660. https://doi.org/10.1109/CVPR.2014.214
[24]

Cao, Z., Hidalgo, G., Simon, T., Wei, S.-E., & Sheikh, Y. (2018). OpenPose: Realtime multi-person 2D pose estimation using part affinity fields. arXiv. https://arxiv.org/abs/1812.08008
[25]

Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., & Zagoruyko, S. (2020). End-to-end object detection with transformers. arXiv. https://arxiv.org/abs/2005.12872
[26]

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. arXiv. https://arxiv.org/abs/1706.03762
[27]

Zou, Z., & Tang, W. (2021). Modulated graph convolutional network for 3D human pose estimation. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 11477–11487. https://doi.org/10.1109/ICCV48922.2021.01128
[28]

Ballester, I., Peterka, O., & Kampel, M. (2024). SPiKE: 3D human pose from point cloud sequences. In A. Antonacopoulos, S. Chaudhuri, R. Chellappa, C. Liu, S. Bhattacharya, & U. Pal (Eds.), Pattern recognition (pp. 470–486). Springer. https://doi.org/10.1007/978-3-031-78456-9_30
[29]

Zhou, Y., Dong, H., & El Saddik, A. (2022). Learning to estimate 3D human pose from point cloud. arXiv. https://arxiv.org/abs/2212.12910
[30]

Lin, T.-Y., Maire, M., Belongie, S., Bourdev, L., Girshick, R., Hays, J., Perona, P., Ramanan, D., Zitnick, C. L., & Dollár, P. (2014). Microsoft COCO: Common objects in context. arXiv. https://arxiv.org/abs/1405.0312
[31]

Cabrero Barros, S., Elosegi, A., Tamayo, I., Domínguez Fanlo, A., & Zorrilla, M. J. (2024). Volumetric video on the web: A platform prototype and empirical study. In Proceedings of the 29th International ACM Conference on 3D Web Technology (pp. 1–10). https://doi.org/10.1145/3665318.3677170
[32]

National Library of Medicine (US). (2023, August 12). Lifting and bending the right way. MedlinePlus. Retrieved May 21, 2025, from https://medlineplus.gov/ency/patientinstructions/000414.htm
[33]

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830. https://www.jmlr.org/papers/vo...
