arxiv: 2605.21421 · v1 · pith:WTN73SHKnew · submitted 2026-05-20 · 💻 cs.CV

AIGaitor: Privacy-preserving and cloud-free motion analysis for everyone, using edge computing

Pith reviewed 2026-05-21 05:12 UTC · model grok-4.3

classification 💻 cs.CV
keywords gait analysis · motion capture · on-device processing · privacy-preserving AI · smartphone · edge computing · rehabilitation · monocular pose estimation

The pith

AIGaitor runs complete monocular motion capture and gait analysis entirely on a smartphone with no cloud upload.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents AIGaitor, a system that runs markerless monocular motion-capture pipelines and downstream deep-learning analysis entirely on consumer smartphones using on-device neural accelerators. It targets barriers to clinical adoption of motion analysis by removing the cost, technical complexity, and privacy risk tied to external servers. A survey of 74 rehabilitation clinicians found that 92 percent would adopt an accurate, low-cost tool, with operating cost, insufficient training, and privacy concerns as the leading barriers. The authors optimized and timed iOS versions of pose estimation, skeleton-based models, and related components, showing that a 10-second 4K 60 fps clip is processed in 77 seconds on an iPhone 14, competitive with a cloud server once network transfer time is included. The paper presents this as evidence that private, accessible movement analysis is feasible on everyday devices.

Core claim

AIGaitor is the first monocular system to demonstrate end-to-end on-device motion capture and downstream deep-learning analysis, supporting clinically applicable movement analysis that is low-cost, private, and accessible to smartphone users.

What carries the argument

The Time-Priority end-to-end on-device pipeline integrating optimized 2D and 3D pose estimation, pose optimization, skeleton-based deep-learning analysis, and vision-language models on mobile neural accelerators.

If this is right

  • Rehabilitation clinicians can adopt motion analysis tools at low operating cost without specialized hardware or extensive training.
  • Patient data remains on the device throughout capture and analysis, eliminating upload-related privacy risks.
  • Analysis becomes available to any user with a modern smartphone, extending reach beyond equipped clinics.
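  • Short video clips can be processed in times comparable to cloud pipelines once network transfer is counted: faster than the reported 94 s at global mobile-average uplink, though slower than the reported 66 s at developed-world Wi-Fi.
  • Keypoint extraction with lightweight models such as ViTPose-s runs in real time, and skeleton-based gait classification finishes in under a millisecond on the same hardware.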
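
The real-time claim above applies to the keypoint-extraction stage, not to the full pipeline. A minimal sketch of the arithmetic, using only the figures quoted in the abstract (10 s clip, 60 fps, 77 s end-to-end), makes the gap explicit. It assumes every frame of the clip is processed, which the abstract does not state, and the function names are illustrative only.

```python
# Figures quoted in the abstract: a 10 s clip at 60 fps, processed in 77 s.
CLIP_SECONDS = 10.0
FRAMES_PER_SECOND = 60.0
PIPELINE_SECONDS = 77.0


def frame_count(clip_seconds, fps):
    """Number of frames in the clip (assumes no frame subsampling)."""
    return clip_seconds * fps


def real_time_factor(processing_seconds, clip_seconds):
    """Processing time divided by clip duration; 1.0 or less means real time."""
    return processing_seconds / clip_seconds


def per_frame_ms(processing_seconds, frames):
    """Average wall-clock milliseconds spent per frame across all stages."""
    return 1000.0 * processing_seconds / frames


frames = frame_count(CLIP_SECONDS, FRAMES_PER_SECOND)
rtf = real_time_factor(PIPELINE_SECONDS, CLIP_SECONDS)
ms_per_frame = per_frame_ms(PIPELINE_SECONDS, frames)
budget_ms = 1000.0 / FRAMES_PER_SECOND  # per-frame budget for 60 fps real time

print(f"frames: {frames:.0f}")
print(f"real-time factor: {rtf:.1f}x")
print(f"average per frame: {ms_per_frame:.1f} ms (budget {budget_ms:.1f} ms)")
```

Under these assumptions the end-to-end pipeline runs several times slower than real time, so the practical use case is record-then-analyze rather than live feedback.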

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Home-based continuous monitoring of gait changes becomes practical for long-term rehab follow-up without clinic visits.
  • The approach could combine with other phone sensors to track broader movement and balance patterns over time.
  • Lower hardware barriers may speed adoption in regions with limited access to motion labs.
  • Accuracy validation against lab systems on varied populations and movement disorders would be a direct next test.

Load-bearing premise

The mobile-optimized pose estimation and skeleton-based models retain enough accuracy for clinical gait analysis despite the paper reporting only processing times and no quantitative accuracy metrics or gold-standard comparisons.

What would settle it

A direct comparison of joint angles, stride parameters, or gait classifications produced by AIGaitor against simultaneous laboratory optical motion capture on the same walking subjects, quantifying any systematic differences.
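
A comparison of this kind is usually summarized with a few agreement statistics. The sketch below computes bias, RMSE, and Bland-Altman 95% limits of agreement for two time-aligned joint-angle series. The sample values are hypothetical knee-flexion angles invented for illustration, not data from the paper, and the choice of statistics is an assumption about what such a validation would report.

```python
import math
import statistics


def agreement(system_deg, reference_deg):
    """Bias, RMSE, and 95% limits of agreement between paired angle series."""
    if len(system_deg) != len(reference_deg) or len(system_deg) < 2:
        raise ValueError("need two equally long series with at least 2 samples")
    # Signed differences: positive means the system over-reads the reference.
    diffs = [s - r for s, r in zip(system_deg, reference_deg)]
    bias = statistics.fmean(diffs)
    rmse = math.sqrt(statistics.fmean([d * d for d in diffs]))
    sd = statistics.stdev(diffs)  # sample standard deviation of differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return bias, rmse, limits


# Hypothetical, time-aligned knee flexion angles in degrees.
system = [5.0, 20.0, 40.0, 60.0, 35.0, 10.0]
reference = [4.0, 18.0, 41.0, 57.0, 33.0, 9.0]

bias, rmse, (lower, upper) = agreement(system, reference)
print(f"bias {bias:.2f} deg, RMSE {rmse:.2f} deg, LoA [{lower:.2f}, {upper:.2f}] deg")
```

A nonzero bias points to a systematic offset, while wide limits of agreement point to per-sample scatter; the two call for different fixes.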


Motion capture is the gold standard for measuring human movement, but clinical use remains limited by cost, technical complexity, and privacy concerns. AIGaitor is a privacy-preserving, cloud-free motion analysis system that runs markerless monocular motion-capture pipelines and downstream deep-learning analysis entirely on a consumer smartphone using on-device neural accelerators. To motivate its design, we surveyed 74 rehabilitation clinicians: 92 percent said they would adopt an accurate, cost-effective, easy-to-use AI gait analysis tool, while 79.7 percent cited operating cost, 68.9 percent insufficient training, and 64.9 percent privacy concerns as leading barriers. We then optimized and benchmarked mobile iOS implementations of current monocular pipeline components, including 2D and 3D pose estimation, pose optimization, skeleton-based deep-learning analysis, and a vision-language model. A Time-Priority end-to-end on-device pipeline processes a 10 s 4K 60 fps video clip in 77 s on an iPhone 14, matching or beating the same pipeline on a high-end NVIDIA H200 cloud server when network transfer is included: 94 s at global mobile-average uplink and 66 s at developed-world Wi-Fi. Lightweight models such as ViTPose-s achieve real-time keypoint extraction, and skeleton-based action-recognition models provide sub-millisecond gait classification on the same clip. To our knowledge, AIGaitor is the first monocular system to demonstrate end-to-end on-device motion capture and downstream deep-learning analysis, supporting clinically applicable movement analysis that is low-cost, private, and accessible to smartphone users.
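
The cloud comparison in the abstract can be read as a simple model: cloud total equals upload time plus server compute time. The sketch below compares the reported totals with the on-device figure and computes the break-even uplink speed under that model. The 77 s, 94 s, and 66 s values come from the abstract; the file size, uplink speed, and cloud compute time used in the example calls are hypothetical placeholders, since the abstract does not give them.

```python
ON_DEVICE_SECONDS = 77.0  # reported: iPhone 14, 10 s 4K 60 fps clip

# Reported cloud totals (H200 compute plus network transfer).
REPORTED_CLOUD_SECONDS = {
    "global mobile-average uplink": 94.0,
    "developed-world Wi-Fi": 66.0,
}


def cloud_total_seconds(file_megabytes, uplink_mbps, cloud_compute_seconds):
    """Upload time (megabytes to megabits over uplink) plus server compute."""
    upload_seconds = file_megabytes * 8.0 / uplink_mbps
    return upload_seconds + cloud_compute_seconds


def break_even_uplink_mbps(file_megabytes, on_device_seconds, cloud_compute_seconds):
    """Uplink speed above which the cloud path finishes first.

    Returns None when server compute alone is already slower than the device.
    """
    slack = on_device_seconds - cloud_compute_seconds
    if slack <= 0:
        return None
    return file_megabytes * 8.0 / slack


for link, seconds in REPORTED_CLOUD_SECONDS.items():
    delta = seconds - ON_DEVICE_SECONDS
    winner = "on-device" if delta > 0 else "cloud"
    print(f"{link}: cloud {seconds:.0f} s, {winner} faster by {abs(delta):.0f} s")

# Hypothetical inputs: 200 MB clip, 17 s of server compute.
print(break_even_uplink_mbps(200.0, ON_DEVICE_SECONDS, 17.0))
```

On the reported figures, the on-device path is faster at global mobile-average uplink and slower at developed-world Wi-Fi, so the latency advantage depends on the network the clinic actually has.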

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces AIGaitor, a privacy-preserving on-device system for monocular motion capture and downstream deep-learning gait analysis that runs entirely on consumer smartphones. Motivated by a survey of 74 rehabilitation clinicians (92% would adopt an accurate low-cost tool; cost, training, and privacy as top barriers), it optimizes and benchmarks iOS implementations of 2D/3D pose estimation, pose optimization, skeleton-based DL models, and a vision-language model. End-to-end timing results show a Time-Priority pipeline processes a 10 s 4K60 clip in 77 s on an iPhone 14, competitive with or faster than the same pipeline on an NVIDIA H200 cloud server once network transfer is included (94 s at global mobile uplink, 66 s at developed-world Wi-Fi). Lightweight models achieve real-time keypoint extraction and sub-millisecond classification. The central claim is that this is the first monocular end-to-end on-device system supporting clinically applicable movement analysis that is low-cost, private, and accessible.

Significance. If the accuracy of the mobile-optimized pipeline is shown to be sufficient, the work would address documented clinician barriers and enable private, low-cost gait analysis on ubiquitous hardware. The empirical demonstration of competitive or superior end-to-end latency versus cloud-plus-network is a concrete strength, as is the focus on on-device neural accelerators and the survey grounding.

major comments (2)
  1. [Abstract] Abstract: the assertion that the system supports 'clinically applicable movement analysis' is not supported by any quantitative accuracy results. Only latency numbers (77 s on iPhone 14, real-time keypoint extraction, sub-millisecond classification) are reported; no MPJPE, PCK, gait-parameter errors, comparison to optical motion-capture ground truth, or validation on patient cohorts appear.
  2. [Abstract] Abstract and Results: the central claim that mobile-optimized models 'retain enough accuracy for clinical gait analysis' rests on an untested assumption. The manuscript supplies no error metrics or clinical validation after quantization and on-device execution, leaving the clinical-utility conclusion unsupported by the presented evidence.
minor comments (2)
  1. [Abstract] Abstract: the survey is cited with specific percentages, but the sampling method, response rate, and clinician demographics are not described.
  2. Consider adding a table or figure that reports both latency and accuracy metrics side-by-side for the on-device versus cloud pipelines.
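
For readers unfamiliar with the metrics named in the major comments, the sketch below shows how MPJPE (mean per-joint position error) and PCK (percentage of correct keypoints) are commonly computed for one pose. The joint coordinates are hypothetical values in millimetres, and the threshold is passed in as an absolute distance; real evaluations typically normalize the PCK threshold by a body-size measure, which is omitted here.

```python
import math


def mpjpe(predicted, ground_truth):
    """Mean Euclidean distance between matching joints, in input units."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two equally long, non-empty joint lists")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


def pck(predicted, ground_truth, threshold):
    """Fraction of joints whose error is within the given threshold."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two equally long, non-empty joint lists")
    hits = sum(
        1 for p, g in zip(predicted, ground_truth) if math.dist(p, g) <= threshold
    )
    return hits / len(predicted)


# Hypothetical 3D joint positions in millimetres (three joints).
truth = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 0.0)]
pred = [(0.0, 0.0, 0.0), (103.0, 4.0, 0.0), (0.0, 200.0, 10.0)]

error_mm = mpjpe(pred, truth)
correct = pck(pred, truth, threshold=5.0)
print(f"MPJPE {error_mm:.1f} mm, PCK@5mm {correct:.2f}")
```

Reporting these numbers for the same model before and after mobile conversion would show directly how much accuracy the on-device optimizations cost.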

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive feedback emphasizing the need for quantitative accuracy evidence to support claims of clinical applicability. We address each major comment point by point below, with revisions planned where the manuscript text can be clarified or qualified without introducing unsupported data.

  1. Referee: [Abstract] Abstract: the assertion that the system supports 'clinically applicable movement analysis' is not supported by any quantitative accuracy results. Only latency numbers (77 s on iPhone 14, real-time keypoint extraction, sub-millisecond classification) are reported; no MPJPE, PCK, gait-parameter errors, comparison to optical motion-capture ground truth, or validation on patient cohorts appear.

    Authors: We agree that the abstract and presented results focus exclusively on latency, on-device execution, and the clinician survey without reporting accuracy metrics such as MPJPE, PCK, or gait-parameter errors. The manuscript positions AIGaitor as an end-to-end on-device implementation of established monocular pipelines (e.g., ViTPose and skeleton-based models) whose accuracy is documented in prior literature; our contribution is the mobile optimization and latency benchmarking rather than new accuracy validation. To address this, we will revise the abstract to qualify the 'clinically applicable' phrasing as referring to the potential enabled by accurate base models running privately on-device, and we will add a brief discussion of expected accuracy retention based on the cited model papers. revision: yes

  2. Referee: [Abstract] Abstract and Results: the central claim that mobile-optimized models 'retain enough accuracy for clinical gait analysis' rests on an untested assumption. The manuscript supplies no error metrics or clinical validation after quantization and on-device execution, leaving the clinical-utility conclusion unsupported by the presented evidence.

    Authors: This observation is correct: the manuscript does not include post-quantization or on-device accuracy measurements, nor any new clinical validation. The central claims rest on the assumption that the accuracy of the source models is largely preserved under the optimizations we describe. We will revise the abstract and results sections to remove or soften language implying that the mobile pipeline has been shown to retain clinical-grade accuracy, instead framing the work as demonstrating feasible on-device execution of existing accurate models. Any available model-specific accuracy figures from the literature will be referenced more explicitly. revision: yes

standing simulated objections not resolved
  • Direct quantitative validation against optical motion-capture ground truth or on patient cohorts, as the current manuscript does not contain such experiments and adding them would require new data collection beyond the scope of this work.

Circularity Check

0 steps flagged

No circularity: empirical benchmarks of existing models with no derivations or self-referential predictions


The manuscript describes a clinician survey and reports runtime benchmarks for off-the-shelf mobile-optimized pose estimation and skeleton-based models running on iOS devices. No equations, parameter fitting, or derivation chain appear in the provided text. Claims of being the first end-to-end on-device system are presented as an empirical observation rather than a mathematical result derived from prior self-citations or fitted inputs. The central assertions rest on direct latency measurements (e.g., 77 s for a 10 s clip) and qualitative statements about model performance, which do not reduce to the inputs by construction. This is a standard engineering demonstration paper whose content is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an applied systems and benchmarking paper. No new mathematical axioms, free parameters, or invented entities are introduced; the work relies on standard computer vision pipelines and mobile optimization techniques.

pith-pipeline@v0.9.0 · 5840 in / 1166 out tokens · 31786 ms · 2026-05-21T05:12:30.379626+00:00 · methodology


Reference graph

Works this paper leans on

100 extracted references · 100 canonical work pages · 6 internal anchors

  1. [1]

    Gait analysis methods in rehabilitation

    Baker R. Gait analysis methods in rehabilitation. Journal of NeuroEngineering and Rehabilitation. 2006;3(1):4. doi:10.1186/1743-0003-3-4

  2. [2]

    The Effect of Preoperative Gait Analysis on Orthopaedic Decision Making

    Kay RM, Dennis S, Rethlefsen S, Reynolds RAK, Skaggs DL, Tolo VT. The Effect of Preoperative Gait Analysis on Orthopaedic Decision Making. Clinical Orthopaedics and Related Research. 2000;372:217-22. doi:10.1097/00003086-200003000-00023

  3. [3]

    Efficacy of clinical gait analysis: A systematic review

    Wren TAL, Gorton GE, ˜Ounpuu S, Tucker CA. Efficacy of clinical gait analysis: A systematic review. Gait & Posture. 2011;34(2):149-53. doi:10.1016/j.gaitpost.2011.03.027

  4. [4]

    Clinical gait analysis 1973–2023: Evaluating progress to guide the future

    Stebbins J, Harrington M, Stewart C. Clinical gait analysis 1973–2023: Evaluating progress to guide the future. Journal of Biomechanics. 2023;160:111827. doi:10.1016/j.jbiomech.2023.111827

  5. [5]

    A Review of the Evolution of Vision-Based Motion Analysis and the Integration of Advanced Computer Vision Methods Towards Developing a Markerless System

    Colyer SL, Evans M, Cosker DP, Salo AIT. A Review of the Evolution of Vision-Based Motion Analysis and the Integration of Advanced Computer Vision Methods Towards Developing a Markerless System. Sports Medicine - Open. 2018;4(1):24. doi:10.1186/s40798-018-0139-y

  6. [6]

    Reliability of Observational Kinematic Gait Analysis

    Krebs DE, Edelstein JE, Fishman S. Reliability of Observational Kinematic Gait Analysis. Physical Therapy. 1985;65(7):1027-33. doi:10.1093/ptj/65.7.1027

  7. [7]

    Reliability of videotaped observational gait analysis in patients with orthopedic impairments

    Brunnekreef JJ, van Uden CJ, van Moorsel S, Kooloos JG. Reliability of videotaped observational gait analysis in patients with orthopedic impairments. BMC Musculoskeletal Disorders. 2005;6(1):17. doi:10.1186/1471-2474-6-17

  8. [8]

    A review of observational gait assessment in clinical practice

    Toro B, Nester C, Farren P. A review of observational gait assessment in clinical practice. Physiotherapy Theory and Practice. 2003;19(3):137-49. doi:10.1080/09593980307964

  9. [9]

    Interrater reliability of videotaped observational gait-analysis assessments

    Eastlack ME, Arvidson J, Snyder-Mackler L, Danoff JV, McGarvey CL. Interrater Reliability of Videotaped Observational Gait-Analysis Assessments. Physical Therapy. 1991;71(6):465-72. doi:10.1093/ptj/71.6.465

  10. [10]

    Reliability of visual classification of sagittal gait patterns in patients with bilateral spastic cerebral palsy

    Kim DJ, Park ES, Sim EG, Kim KJ, Kim YU, Rha Dw. Reliability of Visual Classification of Sagittal Gait Patterns in Patients with Bilateral Spastic Cerebral Palsy. Annals of Rehabilitation Medicine. 2011;35(3):354. doi:10.5535/arm.2011.35.3.354

  11. [11]

    OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields

    Cao Z, Hidalgo G, Simon T, Wei SE, Sheikh Y. OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2021;43(1):172-86. doi:10.1109/tpami.2019.2929257

  12. [12]

    Deep High-Resolution Representation Learning for Human Pose Estimation

    Sun K, Xiao B, Liu D, Wang J. Deep High-Resolution Representation Learning for Human Pose Estimation. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE; 2019. p. 5686-96. doi:10.1109/cvpr.2019.00584

  13. [13]

    DeepLabCut: markerless pose estimation of user-defined body parts with deep learning

    Mathis A, Mamidanna P, Cury KM, Abe T, Murthy VN, Mathis MW, et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nature Neuroscience. 2018;21(9):1281-9. doi:10.1038/s41593-018-0209-y

  14. [14]

    3D Human Pose Estimation in Video With Temporal Convolutions and Semi-Supervised Training

    Pavllo D, Feichtenhofer C, Grangier D, Auli M. 3D Human Pose Estimation in Video With Temporal Convolutions and Semi-Supervised Training. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE; 2019. p. 7745-54. doi:10.1109/cvpr.2019.00794

  15. [15]

    Smpl: a skinned multi-person linear model,

    Loper M, Mahmood N, Romero J, Pons-Moll G, Black MJ. SMPL: a skinned multi-person linear model. ACM Transactions on Graphics. 2015;34(6):1-16. doi:10.1145/2816795.2818013

  16. [16]

    End-to-End Recovery of Human Shape and Pose

    Kanazawa A, Black MJ, Jacobs DW, Malik J. End-to-End Recovery of Human Shape and Pose. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE; 2018. p. 7122-31. doi:10.1109/cvpr.2018.00744. May 21, 2026 12/18

  17. [17]

    Concurrent assessment of gait kinematics using marker-based and markerless motion capture

    Kanko RM, Laende EK, Davis EM, Selbie WS, Deluzio KJ. Concurrent assessment of gait kinematics using marker-based and markerless motion capture. Journal of Biomechanics. 2021;127:110665. doi:10.1016/j.jbiomech.2021.110665

  18. [18]

    Evaluation of 3D Markerless Motion Capture Accuracy Using OpenPose With Multiple Video Cameras

    Nakano N, Sakura T, Ueda K, Omura L, Kimura A, Iino Y, et al. Evaluation of 3D Markerless Motion Capture Accuracy Using OpenPose With Multiple Video Cameras. Frontiers in Sports and Active Living. 2020;2:50. doi:10.3389/fspor.2020.00050

  19. [19]

    Anipose: A toolkit for robust markerless 3D pose estimation

    Karashchuk P, Rupp KL, Dickinson ES, Walling-Bell S, Sanders E, Azim E, et al. Anipose: A toolkit for robust markerless 3D pose estimation. Cell Reports. 2021;36(13):109730. doi:10.1016/j.celrep.2021.109730

  20. [20]

    The accuracy of several pose estimation methods for 3D joint centre localisation

    Needham L, Evans M, Cosker DP, Wade L, McGuigan PM, Bilzon JL, et al. The accuracy of several pose estimation methods for 3D joint centre localisation. Scientific Reports. 2021;11(1):20673. doi:10.1038/s41598-021-00212-x

  21. [21]

    Comparison of kinematics between Theia markerless and conventional marker-based gait analysis in clinical patients

    Wren TAL, Isakov P, Rethlefsen SA. Comparison of kinematics between Theia markerless and conventional marker-based gait analysis in clinical patients. Gait & Posture. 2023;104:9-14. doi:10.1016/j.gaitpost.2023.05.029

  22. [22]

    Pose2Sim: An open-source Python package for multiview markerless kinematics

    Pagnon D, Domalain M, Reveret L. Pose2Sim: An open-source Python package for multiview markerless kinematics. Journal of Open Source Software. 2022;7(77):4362. doi:10.21105/joss.04362

  23. [23]

    OpenCap: Human movement dynamics from smartphone videos

    Uhlrich SD, Falisse A, Kidzi´ nski L, Muccini J, Ko M, Chaudhari AS, et al. OpenCap: Human movement dynamics from smartphone videos. PLOS Computational Biology. 2023;19(10):e1011462. doi:10.1371/journal.pcbi.1011462

  24. [24]

    OpenCap Monocular: 3D Human Kinematics and Musculoskeletal Dynamics from a Single Smartphone Video

    Gilon S, Miller EY, Uhlrich SD. OpenCap Monocular: 3D Human Kinematics and Musculoskeletal Dynamics from a Single Smartphone Video. arXiv preprint arXiv:260324733. 2026. Available from: https://arxiv.org/abs/2603.24733. arXiv:2603.24733

  25. [25]

    Two-dimensional video-based analysis of human gait using pose estimation

    Stenum J, Rossi C, Roemmich RT. Two-dimensional video-based analysis of human gait using pose estimation. PLOS Computational Biology. 2021;17(4):e1008935. doi:10.1371/journal.pcbi.1008935

  26. [26]

    Moving outside the lab: Markerless motion capture accurately quantifies sagittal plane kinematics during the vertical jump

    Drazan JF, Phillips WT, Seethapathi N, Hullfish TJ, Baxter JR. Moving outside the lab: Markerless motion capture accurately quantifies sagittal plane kinematics during the vertical jump. Journal of Biomechanics. 2021;125:110547. doi:10.1016/j.jbiomech.2021.110547

  27. [27]

    Clinical gait analysis using video-based pose estimation: Multiple perspectives, clinical populations, and measuring change

    Stenum J, Hsu MM, Pantelyat AY, Roemmich RT. Clinical gait analysis using video-based pose estimation: Multiple perspectives, clinical populations, and measuring change. PLOS Digital Health. 2024;3(3):e0000467. doi:10.1371/journal.pdig.0000467

  28. [28]

    BioPose: Biomechanically-accurate 3D Pose Estimation from Monocular Videos

    Koleini F, Saleem MU, Wang P, Xue H, Helmy A, Fenwick A. BioPose: Biomechanically-accurate 3D Pose Estimation from Monocular Videos. In: WACV; 2025. ArXiv:2501.07800

  29. [29]

    Portable Biomechanics Laboratory: Clinically Accessible Movement Analysis from a Handheld Smartphone

    Peiffer JD, Shah K, Djuraskovic I, Anarwala S, Abdou K, Patel R, et al. Portable Biomechanics Laboratory: Clinically Accessible Movement Analysis from a Handheld Smartphone. arXiv preprint arXiv:250708268. 2025. Available from:https://arxiv.org/abs/2507.08268. arXiv:2507.08268

  30. [30]

    Sapiens2

    Khirodkar R, Wen H, Martinez J, Dong Y, Zhaoen S, Saito S. Sapiens2: Pretraining 1K Resolution Vision Transformers on 1B Human Images. In: International Conference on Learning Representations (ICLR); 2026. Available from:https://arxiv.org/abs/2604.21681. arXiv:2604.21681

  31. [31]

    The Great GPU Shortage – Rental Capacity – Launching our H100 1 Year Rental Price Index; 2026

    Nishball D, Nanos J, Wen CK, et al.. The Great GPU Shortage – Rental Capacity – Launching our H100 1 Year Rental Price Index; 2026. Published 2026-04-02. Accessed 2026-05-11. SemiAnalysis Newsletter. Available from: https://newsletter.semianalysis.com/p/the-great-gpu-shortage-rental-capacity. May 21, 2026 13/18

  32. [32]

    Cost of a Data Breach Report 2024

    IBM Security, Ponemon Institute. Cost of a Data Breach Report 2024. IBM Corporation; 2024. Available from:https://www.ibm.com/reports/data-breach

  33. [33]

    Breach Portal: Notice to the Secretary of HHS Breach of Unsecured Protected Health Information; 2024

    U S Department of Health and Human Services, Office for Civil Rights. Breach Portal: Notice to the Secretary of HHS Breach of Unsecured Protected Health Information; 2024. Accessed 2025. https://ocrportal.hhs.gov/ocr/breach/breach_report.jsf

  34. [34]

    Measuring Digital Development: Facts and Figures 2024

    International Telecommunication Union. Measuring Digital Development: Facts and Figures 2024. Geneva: ITU Publications; 2024. Available from: https://www.itu.int/itu-d/reports/statistics/facts-figures-2024/

  35. [35]

    Deploying medical AI in low-resource settings: a scoping review of challenges and strategies

    Al-Ganad A, Al-Shahdhi A, Al-Dhaifi O, Hajeb E, Hajeb H, Al-Motarreb A. Deploying medical AI in low-resource settings: a scoping review of challenges and strategies. Frontiers in Digital Health. 2026;8:1743634. doi:10.3389/fdgth.2026.1743634

  36. [36]

    Incidental data: observation of privacy compromising data on social media platforms

    Kutschera S. Incidental data: observation of privacy compromising data on social media platforms. International Cybersecurity Law Review. 2023;4(1):91-114. doi:10.1365/s43439-022-00071-w

  37. [37]

    Data Privacy: Are We Accidentally Sharing Too Much Information? In: Proceedings of the Conference for Information Systems Applied Research (CONISAR)

    Chawdhry AA, Paullet K, Douglas DM. Data Privacy: Are We Accidentally Sharing Too Much Information? In: Proceedings of the Conference for Information Systems Applied Research (CONISAR). San Antonio, TX, USA; 2013. ISSN 2167-1508, v6 n2818. Available from: https://iscap.us/proceedings/conisar/2013/pdf/2818.pdf

  38. [38]

    What to do when you have accidentally shared something online

    The Cyber Helpline. What to do when you have accidentally shared something online;. Accessed: 2026-04-29.https://www.thecyberhelpline.com/guides/accidental-info-share-usa

  39. [39]

    BlazePose: On-device Real-time Body Pose Tracking

    Bazarevsky V, Grishchenko I, Raveendran K, Zhu T, Zhang F, Grundmann M. BlazePose: On-device Real-time Body Pose Tracking. In: CVPR Workshop on Computer Vision for Augmented and Virtual Reality; 2020. arXiv:2006.10204

  40. [40]

    BlazePose GHUM Holistic: Real-time 3D Human Landmarks and Pose Estimation

    Grishchenko I, Bazarevsky V, Zanfir A, Bazavan EG, Zanfir M, Yee R, et al. BlazePose GHUM Holistic: Real-time 3D Human Landmarks and Pose Estimation. In: CVPR Workshop on Computer Vision for Augmented and Virtual Reality; 2022. arXiv:2206.11678

  41. [41]

    Rtmpose: Real-time multi-person pose estimation based on mmpose,

    Jiang T, Lu P, Zhang L, Ma N, Han R, Lyu C, et al. RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose. arXiv preprint arXiv:230307399. 2023. arXiv:2303.07399

  42. [42]

    Apple Neural Engine documentation; CoreML performance guide; Apple MLX framework

    Apple Inc . Apple Neural Engine documentation; CoreML performance guide; Apple MLX framework

  43. [43]

    Available from: https://machinelearning.apple.com/research/neural-engine-transformers

  44. [44]

    Snapdragon 8 Elite Hexagon NPU brief; Google LiteRT; ONNX Runtime Mobile

    Qualcomm Inc . Snapdragon 8 Elite Hexagon NPU brief; Google LiteRT; ONNX Runtime Mobile

  45. [45]

    Available from:https://www.qualcomm.com/products/mobile/snapdragon/smartphones/ snapdragon-8-series-mobile-platforms/snapdragon-8-elite-mobile-platform

  46. [46]

    Core ML: Integrate machine learning models into your app; 2023

    Apple Inc . Core ML: Integrate machine learning models into your app; 2023. https://developer.apple.com/documentation/coreml

  47. [47]

    TensorFlow Lite Micro: Embedded Machine Learning on TinyML Systems

    David R, Duke J, Jain A, Reddi VJ, Jeffries N, Li J, et al. TensorFlow Lite Micro: Embedded Machine Learning on TinyML Systems. In: Proceedings of Machine Learning and Systems (MLSys)

  48. [48]

    arXiv:2010.08678

    Available from:https://arxiv.org/abs/2010.08678. arXiv:2010.08678

  49. [49]

    Skeleton-based abnormal gait recognition with spatio-temporal attention enhanced gait-structural graph convolutional networks

    Tian H, Ma X, Wu H, Li Y. Skeleton-based abnormal gait recognition with spatio-temporal attention enhanced gait-structural graph convolutional networks. Neurocomputing. 2022;473:116-26. doi:10.1016/j.neucom.2021.12.004

  50. [50]

    WM–STGCN: A Novel Spatiotemporal Modeling Method for Parkinsonian Gait Recognition

    Zhang J, Lim J, Kim MH, Hur S, Chung TM. WM–STGCN: A Novel Spatiotemporal Modeling Method for Parkinsonian Gait Recognition. Sensors. 2023;23(10):4980. doi:10.3390/s23104980. May 21, 2026 14/18

  51. [51]

    Hybrid Deep Neural Network Framework Combining Skeleton and Gait Features for Pathological Gait Recognition

    Jun K, Lee K, Lee S, Lee H, Kim MS. Hybrid Deep Neural Network Framework Combining Skeleton and Gait Features for Pathological Gait Recognition. Bioengineering. 2023;10(10):1133. doi:10.3390/bioengineering10101133

  52. [52]

    ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation

    Tao D, Xu Y, Zhang J, Zhang Q. ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation. In: Advances in Neural Information Processing Systems 35. Neural Information Processing Systems Foundation, Inc. (NeurIPS); 2022. p. 38571-84. doi:10.52202/068431-2795

  53. [53]

    MeTRAbs: Metric-Scale Truncation-Robust Heatmaps for Absolute 3D Human Pose Estimation

    Sarandi I, Linder T, Arras KO, Leibe B. MeTRAbs: Metric-Scale Truncation-Robust Heatmaps for Absolute 3D Human Pose Estimation. IEEE Transactions on Biometrics, Behavior, and Identity Science. 2021;3(1):16-30. doi:10.1109/tbiom.2020.3037257

  54. [54]

    360mvsnet: Deep multi-view stereo network with 360° images for indoor scene reconstruction,

    Sarandi I, Hermans A, Leibe B. Learning 3D Human Pose Estimation from Dozens of Datasets using a Geometry-Aware Autoencoder to Bridge Between Skeleton Formats. In: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). IEEE; 2023. p. 2955-65. doi:10.1109/wacv56688.2023.00297

  55. [55]

    Humans in 4D: Reconstructing and Tracking Humans with Transformers

    Goel S, Pavlakos G, Rajasegaran J, Kanazawa A, Malik J. Humans in 4D: Reconstructing and Tracking Humans with Transformers. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV); 2023. Available from:https://arxiv.org/abs/2305.20091. arXiv:2305.20091

  56. [56]

    CameraHMR: Aligning People with Perspective

    Patel P, Black MJ. CameraHMR: Aligning People with Perspective. In: 2025 International Conference on 3D Vision (3DV). IEEE; 2025. p. 1562-71. doi:10.1109/3dv66043.2025.00146

  57. [57]

    Deep Convolutional and LSTM Recurrent Neural Networks for Multimodal Wearable Activity Recognition

    Ord´ o˜ nez F, Roggen D. Deep Convolutional and LSTM Recurrent Neural Networks for Multimodal Wearable Activity Recognition. Sensors. 2016;16(1):115. doi:10.3390/s16010115

  58. [58]

    Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition

    Shi L, Zhang Y, Cheng J, Lu H. Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE; 2019. p. 12018-27. doi:10.1109/cvpr.2019.01230

  59. [59]

    An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In: International Conference on Learning Representations (ICLR); 2020. Available from:https://arxiv.org/abs/2010.11929. arXiv:2010.11929

  60. [60]

    BiomechGPT: Towards a Biomechanically Fluent Multimodal Foundation Model for Clinically Relevant Motion Tasks

    Yang R, Kennedy A, Cotton RJ. BiomechGPT: Towards a Biomechanically Fluent Multimodal Foundation Model for Clinically Relevant Motion Tasks. arXiv preprint arXiv:250518465. 2025. Available from:https://arxiv.org/abs/2505.18465. arXiv:2505.18465

  61. [61]

    Gemma 4 Model Card; 2026

    Google DeepMind. Gemma 4 Model Card; 2026. Accessed 2026. Google AI for Developers Documentation. Available from:https://ai.google.dev/gemma/docs/core/model_card_4

  62. [62]

    MobilePoser: Real-Time Full-Body Pose Estimation and 3D Human Translation from IMUs in Mobile Consumer Devices

    Xu V, Gao C, Hoffmann H, Ahuja K. MobilePoser: Real-Time Full-Body Pose Estimation and 3D Human Translation from IMUs in Mobile Consumer Devices. In: Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology. ACM; 2024. p. 1-11. doi:10.1145/3654777.3676461

  63. [63]

    The Mobile Economy 2024; 2024

    GSMA. The Mobile Economy 2024; 2024. Available from: https://www.gsma.com/r/mobileeconomy/

  64. [64]

    The Swift Programming Language; 2015

    Apple Inc., The Swift Project Authors. The Swift Programming Language; 2015. Open-sourced December 2015 under the Apache 2.0 license with a Runtime Library Exception. Accessed 2026-05-14. Available from: https://swift.org

  65. [65]

    Xcode; 2026

    Apple Inc. Xcode; 2026. Integrated development environment for Apple platforms. Accessed 2026-05-14. Apple Developer. Available from: https://developer.apple.com/xcode/

  66. [66]

    TestFlight — Beta Testing Made Simple; 2024

    Apple Inc. TestFlight — Beta Testing Made Simple; 2024. Apple’s official beta-distribution service for iOS apps. Accessed 2026-05-09. Apple Developer. Available from: https://developer.apple.com/testflight/

  67. [67]

    WHAM: Reconstructing World-Grounded Humans with Accurate 3D Motion

    Shin S, Kim J, Halilaj E, Black MJ. WHAM: Reconstructing World-Grounded Humans with Accurate 3D Motion. In: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE; 2024. p. 2070-80. doi:10.1109/cvpr52733.2024.00202

  68. [68]

    FastHMR: Accelerating Human Mesh Recovery via Token and Layer Merging with Diffusion Decoding

    Mehraban S, Iaboni A, Taati B. FastHMR: Accelerating Human Mesh Recovery via Token and Layer Merging with Diffusion Decoding. In: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV); 2026. Available from: https://arxiv.org/abs/2510.10868. arXiv:2510.10868

  69. [69]

    An Explainable Spatial-Temporal Graphical Convolutional Network to Score Freezing of Gait in Parkinsonian Patients

    Kwon H, Clifford GD, Genias I, Bernhard D, Esper CD, Factor SA, et al. An Explainable Spatial-Temporal Graphical Convolutional Network to Score Freezing of Gait in Parkinsonian Patients. Sensors. 2023;23(4):1766. doi:10.3390/s23041766

  70. [70]

    Vision Framework Documentation; 2024

    Apple Inc. Vision Framework Documentation; 2024. On-device computer vision APIs including VNDetectHumanRectanglesRequest. Accessed 2026-05-09. Apple Developer Documentation. Available from: https://developer.apple.com/documentation/vision

  71. [71]

    Accelerate Framework; 2024

    Apple Inc. Accelerate Framework; 2024. High-performance vectorized CPU primitives spanning BLAS, LAPACK, vDSP, vImage, and BNNS. Accessed 2026-05-14. Apple Developer Documentation. Available from: https://developer.apple.com/accelerate/

  72. [72]

    Apple introduces iPhone 13 and iPhone 13 mini; 2021

    Apple Inc. Apple introduces iPhone 13 and iPhone 13 mini; 2021. Announcement of the A15 Bionic system-on-chip; six-core CPU, five-core GPU, and 16-core Neural Engine. Accessed 2026-05-09. Apple Newsroom. Available from: https://www.apple.com/newsroom/2021/09/apple-introduces-iphone-13-and-iphone-13-mini/

  73. [73]

    iPhone 14 Technical Specifications; 2023

    Apple Inc. iPhone 14 Technical Specifications; 2023. iPhone 14 specifications including A15 Bionic chip configuration. Accessed 2026-05-09. Apple Support. Available from: https://support.apple.com/en-us/111872

  74. [74]

    NVIDIA H200 Tensor Core GPU Datasheet; 2024

    NVIDIA Corporation. NVIDIA H200 Tensor Core GPU Datasheet; 2024. Specifications for the H200 NVL (141 GB HBM3e, 4.8 TB/s memory bandwidth, dual-slot PCIe, 600 W TGP). Accessed 2026-05-10. NVIDIA Product Documentation. Available from: https://resources.nvidia.com/en-us-data-center-overview-mc/en-us-data-center-overview/hpc-datasheet-sc23-h200

  75. [75]

    Intel Xeon 6731P Processor (144 M Cache, 2.50 GHz); 2025

    Intel Corporation. Intel Xeon 6731P Processor (144 M Cache, 2.50 GHz); 2025. 32 cores, 64 threads, 2.5 GHz base / 4.1 GHz max turbo, 144 MB cache, 245 W TDP, DDR5 6400 MT/s, PCIe 5.0. Accessed 2026-05-10. Intel Product Specifications (ARK). Available from: https://www.intel.com/content/www/us/en/products/sku/242635/intel-xeon-6731p-processor-144m-cache-2...

  76. [76]

    PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation

    Ansel J, Yang E, He H, Gimelshein N, Jain A, Voznesensky M, et al. PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation. In: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). Association for Computing Machinery; 2024. p....

  77. [77]

    Classifying simulated gait impairments using privacy-preserving explainable artificial intelligence and mobile phone videos

    Reddy L, Anand K, Kaushik S, Rodrigo C, McKay JL, Kesar TM, et al. Classifying simulated gait impairments using privacy-preserving explainable artificial intelligence and mobile phone videos. PLOS Digital Health. 2025;4(9):e0001004. doi:10.1371/journal.pdig.0001004

  78. [78]

    Length and Redundancy of Outpatient Progress Notes Across a Decade at an Academic Medical Center

    Rule A, Bedrick S, Chiang MF, Hribar MR. Length and Redundancy of Outpatient Progress Notes Across a Decade at an Academic Medical Center. JAMA Network Open. 2021;4(7):e2115334. doi:10.1001/jamanetworkopen.2021.15334

  79. [79]

    Physician Sentiments Around the Use of AI in Health Care; 2024

    American Medical Association. Physician Sentiments Around the Use of AI in Health Care; 2024. 66% of US physicians reported using AI in 2024, up from 38% in 2023; 47% ranked increased oversight as the top regulatory action. Accessed 2026-05-10. AMA Augmented Intelligence Research, 2024 Physician Survey (N≈ 1,200). Available from: https://www.ama-assn.org/...

  80. [80]

    Impact of AI on radiology: a EuroAIM/EuSoMII 2024 survey among members of the European Society of Radiology

    Zanardo M, Visser JJ, Colarieti A, Cuocolo R, Klontzas ME, Pinto dos Santos D, et al. Impact of AI on radiology: a EuroAIM/EuSoMII 2024 survey among members of the European Society of Radiology. Insights into Imaging. 2024;15(1):240. doi:10.1186/s13244-024-01801-w

Showing first 80 references.