AIGaitor: Privacy-preserving and cloud-free motion analysis for everyone, using edge computing
Pith reviewed 2026-05-21 05:12 UTC · model grok-4.3
The pith
AIGaitor runs complete monocular motion capture and gait analysis entirely on a smartphone with no cloud upload.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AIGaitor is the first monocular system to demonstrate end-to-end on-device motion capture and downstream deep-learning analysis, supporting clinically applicable movement analysis that is low-cost, private, and accessible to smartphone users.
What carries the argument
The Time-Priority end-to-end on-device pipeline integrating optimized 2D and 3D pose estimation, pose optimization, skeleton-based deep-learning analysis, and vision-language models on mobile neural accelerators.
If this is right
- Rehabilitation clinicians can adopt motion analysis tools at low operating cost without specialized hardware or extensive training.
- Patient data remains on the device throughout capture and analysis, eliminating upload-related privacy risks.
- Analysis becomes available to any user with a modern smartphone, extending reach beyond equipped clinics.
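- Short video clips can be processed on the phone in times comparable to cloud pipelines once network transfer is counted: 77 s on an iPhone 14 for a 10 s 4K 60 fps clip, against 94 s (global mobile-average uplink) and 66 s (developed-world Wi-Fi) for the cloud.
- Lightweight keypoint extraction runs in real time, and gait classification takes less than a millisecond on the same hardware.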
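The real-time and end-to-end figures can be put on a common per-frame footing. The sketch below is only arithmetic on numbers quoted in the abstract (a 10 s clip at 60 fps, 77 s end-to-end on an iPhone 14). The function names are illustrative, and it assumes every frame of the clip is processed, which this page does not confirm.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame if processing must keep up with capture."""
    return 1000.0 / fps


def per_frame_cost_ms(clip_seconds: float, fps: float, processing_seconds: float) -> float:
    """Average wall-clock cost per frame, ASSUMING every frame is processed."""
    frames = clip_seconds * fps
    return 1000.0 * processing_seconds / frames


def slowdown_vs_real_time(clip_seconds: float, processing_seconds: float) -> float:
    """How many seconds of processing are spent per second of video."""
    return processing_seconds / clip_seconds


if __name__ == "__main__":
    # Figures from the abstract: 10 s clip, 60 fps, 77 s end-to-end.
    print(round(frame_budget_ms(60), 2))            # 16.67
    print(round(per_frame_cost_ms(10, 60, 77), 1))  # 128.3
    print(slowdown_vs_real_time(10, 77))            # 7.7
```

Under that assumption the full pipeline averages roughly 128 ms per frame against a 16.67 ms real-time budget, which suggests the real-time claim applies to keypoint extraction alone and not to the end-to-end pipeline.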
Where Pith is reading between the lines
- Home-based continuous monitoring of gait changes becomes practical for long-term rehab follow-up without clinic visits.
- The approach could combine with other phone sensors to track broader movement and balance patterns over time.
- Lower hardware barriers may speed adoption in regions with limited access to motion labs.
- Accuracy validation against lab systems on varied populations and movement disorders would be a direct next test.
Load-bearing premise
The mobile-optimized pose estimation and skeleton-based models retain enough accuracy for clinical gait analysis despite the paper reporting only processing times and no quantitative accuracy metrics or gold-standard comparisons.
What would settle it
A direct comparison of joint angles, stride parameters, or gait classifications produced by AIGaitor against simultaneous laboratory optical motion capture on the same walking subjects, quantifying any systematic differences.
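A comparison of this kind usually reduces to computing the same kinematic quantity from both systems and summarising the paired differences. The following is a minimal sketch, not the paper's protocol: it computes the angle at a joint from three 3D points and reports bias and RMSE between two series. The joint choice, the angle convention, and all sample values are hypothetical.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle at point b, in degrees, between segments b->a and b->c."""
    v1 = [ai - bi for ai, bi in zip(a, b)]
    v2 = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0 or n2 == 0:
        raise ValueError("degenerate segment: two joints coincide")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_theta))


def agreement(test_values, reference_values):
    """Return (bias, rmse) for paired measurements: test minus reference."""
    if len(test_values) != len(reference_values) or not test_values:
        raise ValueError("need two non-empty series of equal length")
    diffs = [t - r for t, r in zip(test_values, reference_values)]
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse


if __name__ == "__main__":
    # Hypothetical hip, knee, ankle positions forming a right angle at the knee.
    print(joint_angle_deg((0, 1, 0), (0, 0, 0), (1, 0, 0)))
    # Hypothetical per-frame knee angles: smartphone vs. optical reference.
    print(agreement([10.0, 20.0, 30.0], [8.0, 20.0, 34.0]))
```

A non-zero bias would indicate the systematic difference the test is meant to expose, while RMSE captures overall disagreement. A full validation would also need time synchronisation between the two systems, which this sketch leaves out.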
Original abstract
Motion capture is the gold standard for measuring human movement, but clinical use remains limited by cost, technical complexity, and privacy concerns. AIGaitor is a privacy-preserving, cloud-free motion analysis system that runs markerless monocular motion-capture pipelines and downstream deep-learning analysis entirely on a consumer smartphone using on-device neural accelerators. To motivate its design, we surveyed 74 rehabilitation clinicians: 92 percent said they would adopt an accurate, cost-effective, easy-to-use AI gait analysis tool, while 79.7 percent cited operating cost, 68.9 percent insufficient training, and 64.9 percent privacy concerns as leading barriers. We then optimized and benchmarked mobile iOS implementations of current monocular pipeline components, including 2D and 3D pose estimation, pose optimization, skeleton-based deep-learning analysis, and a vision-language model. A Time-Priority end-to-end on-device pipeline processes a 10 s 4K 60 fps video clip in 77 s on an iPhone 14, matching or beating the same pipeline on a high-end NVIDIA H200 cloud server when network transfer is included: 94 s at global mobile-average uplink and 66 s at developed-world Wi-Fi. Lightweight models such as ViTPose-s achieve real-time keypoint extraction, and skeleton-based action-recognition models provide sub-millisecond gait classification on the same clip. To our knowledge, AIGaitor is the first monocular system to demonstrate end-to-end on-device motion capture and downstream deep-learning analysis, supporting clinically applicable movement analysis that is low-cost, private, and accessible to smartphone users.
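The cloud comparison in the abstract rests on adding network transfer time to server compute time. The abstract does not report the clip size, the uplink rates, or the server compute time, so every value passed in the example below is a hypothetical placeholder; only the 77 s on-device figure comes from the abstract. The model also ignores protocol overhead and the time to download results.

```python
def upload_seconds(file_megabytes: float, uplink_mbps: float) -> float:
    """Idealised transfer time: megabytes * 8 bits per byte / megabits per second."""
    if uplink_mbps <= 0:
        raise ValueError("uplink_mbps must be positive")
    return file_megabytes * 8.0 / uplink_mbps


def cloud_total_seconds(file_megabytes: float, uplink_mbps: float,
                        server_compute_seconds: float) -> float:
    """Cloud end-to-end time = upload time + server compute time."""
    return upload_seconds(file_megabytes, uplink_mbps) + server_compute_seconds


def break_even_uplink_mbps(file_megabytes: float, server_compute_seconds: float,
                           device_seconds: float):
    """Uplink rate at which cloud and on-device totals are equal.

    Returns None when server compute alone already takes at least as long
    as the device, so no uplink rate can make the cloud faster.
    """
    slack = device_seconds - server_compute_seconds
    if slack <= 0:
        return None
    return file_megabytes * 8.0 / slack


if __name__ == "__main__":
    DEVICE_SECONDS = 77.0  # from the abstract (iPhone 14)
    CLIP_MB = 100.0        # HYPOTHETICAL clip size
    SERVER_SECONDS = 27.0  # HYPOTHETICAL server compute time
    for uplink in (10.0, 50.0):  # HYPOTHETICAL uplink rates in Mbit/s
        total = cloud_total_seconds(CLIP_MB, uplink, SERVER_SECONDS)
        winner = "device" if DEVICE_SECONDS < total else "cloud"
        print(f"{uplink:.0f} Mbit/s: cloud total {total:.0f} s, faster: {winner}")
    print(break_even_uplink_mbps(CLIP_MB, SERVER_SECONDS, DEVICE_SECONDS))
```

The break-even function makes the dependence explicit: below that uplink rate the phone finishes first, above it the cloud does, which is consistent with the abstract reporting a win on mobile uplink and a loss on Wi-Fi.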
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces AIGaitor, a privacy-preserving on-device system for monocular motion capture and downstream deep-learning gait analysis that runs entirely on consumer smartphones. Motivated by a survey of 74 rehabilitation clinicians (92% would adopt an accurate low-cost tool; cost, training, and privacy as top barriers), it optimizes and benchmarks iOS implementations of 2D/3D pose estimation, pose optimization, skeleton-based DL models, and a vision-language model. End-to-end timing results show a Time-Priority pipeline processes a 10 s 4K60 clip in 77 s on an iPhone 14, competitive with or faster than the same pipeline on an NVIDIA H200 cloud server once network transfer is included (94 s at global mobile uplink, 66 s at developed-world Wi-Fi). Lightweight models achieve real-time keypoint extraction and sub-millisecond classification. The central claim is that this is the first monocular end-to-end on-device system supporting clinically applicable movement analysis that is low-cost, private, and accessible.
Significance. If the accuracy of the mobile-optimized pipeline is shown to be sufficient, the work would address documented clinician barriers and enable private, low-cost gait analysis on ubiquitous hardware. The empirical demonstration of competitive or superior end-to-end latency versus cloud-plus-network is a concrete strength, as is the focus on on-device neural accelerators and the survey grounding.
major comments (2)
- [Abstract] Abstract: the assertion that the system supports 'clinically applicable movement analysis' is not supported by any quantitative accuracy results. Only latency numbers (77 s on iPhone 14, real-time keypoint extraction, sub-millisecond classification) are reported; no MPJPE, PCK, gait-parameter errors, comparison to optical motion-capture ground truth, or validation on patient cohorts appear.
- [Abstract] Abstract and Results: the central claim that mobile-optimized models 'retain enough accuracy for clinical gait analysis' rests on an untested assumption. The manuscript supplies no error metrics or clinical validation after quantization and on-device execution, leaving the clinical-utility conclusion unsupported by the presented evidence.
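For readers unfamiliar with the metrics named in the first comment, the sketch below gives the textbook definitions: MPJPE is the mean Euclidean distance between predicted and ground-truth joints, and PCK is the fraction of keypoints that fall within a distance threshold. It is illustrative only. Published protocols often apply root or Procrustes alignment before MPJPE and normalise the PCK threshold by a body dimension; here no alignment is applied and the caller supplies an absolute threshold, both of which are simplifying assumptions.

```python
import math


def _check(pred, truth):
    if len(pred) != len(truth) or not pred:
        raise ValueError("need two non-empty joint lists of equal length")


def mpjpe(pred, truth):
    """Mean per-joint position error: average Euclidean distance over joints."""
    _check(pred, truth)
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)


def pck(pred, truth, threshold):
    """Fraction of keypoints whose error is at most `threshold` (same units)."""
    _check(pred, truth)
    hits = sum(1 for p, t in zip(pred, truth) if math.dist(p, t) <= threshold)
    return hits / len(pred)


if __name__ == "__main__":
    # Hypothetical two-joint skeleton, coordinates in millimetres.
    predicted = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
    reference = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    print(mpjpe(predicted, reference))     # 2.5
    print(pck(predicted, reference, 1.0))  # 0.5
```

Running the same functions on float and quantized model outputs against a shared ground truth would be one way to produce the post-quantization error figures the second comment asks for.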
minor comments (2)
- [Abstract] Abstract: the survey is cited with specific percentages, but the sampling method, response rate, and clinician demographics are not described.
- Consider adding a table or figure that reports both latency and accuracy metrics side-by-side for the on-device versus cloud pipelines.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback emphasizing the need for quantitative accuracy evidence to support claims of clinical applicability. We address each major comment point by point below, with revisions planned where the manuscript text can be clarified or qualified without introducing unsupported data.
Point-by-point responses
- Referee: [Abstract] Abstract: the assertion that the system supports 'clinically applicable movement analysis' is not supported by any quantitative accuracy results. Only latency numbers (77 s on iPhone 14, real-time keypoint extraction, sub-millisecond classification) are reported; no MPJPE, PCK, gait-parameter errors, comparison to optical motion-capture ground truth, or validation on patient cohorts appear.
  Authors: We agree that the abstract and presented results focus exclusively on latency, on-device execution, and the clinician survey without reporting accuracy metrics such as MPJPE, PCK, or gait-parameter errors. The manuscript positions AIGaitor as an end-to-end on-device implementation of established monocular pipelines (e.g., ViTPose and skeleton-based models) whose accuracy is documented in prior literature; our contribution is the mobile optimization and latency benchmarking rather than new accuracy validation. To address this, we will revise the abstract to qualify the 'clinically applicable' phrasing as referring to the potential enabled by accurate base models running privately on-device, and we will add a brief discussion of expected accuracy retention based on the cited model papers.
  Revision: yes
- Referee: [Abstract] Abstract and Results: the central claim that mobile-optimized models 'retain enough accuracy for clinical gait analysis' rests on an untested assumption. The manuscript supplies no error metrics or clinical validation after quantization and on-device execution, leaving the clinical-utility conclusion unsupported by the presented evidence.
  Authors: This observation is correct: the manuscript does not include post-quantization or on-device accuracy measurements, nor any new clinical validation. The central claims rest on the assumption that the accuracy of the source models is largely preserved under the optimizations we describe. We will revise the abstract and results sections to remove or soften language implying that the mobile pipeline has been shown to retain clinical-grade accuracy, instead framing the work as demonstrating feasible on-device execution of existing accurate models. Any available model-specific accuracy figures from the literature will be referenced more explicitly.
  Revision: yes
- Not addressed in this revision: direct quantitative validation against optical motion-capture ground truth or on patient cohorts. The current manuscript contains no such experiments, and adding them would require new data collection beyond the scope of this work.
Circularity Check
No circularity: empirical benchmarks of existing models with no derivations or self-referential predictions
Full rationale
The manuscript describes a clinician survey and reports runtime benchmarks for off-the-shelf mobile-optimized pose estimation and skeleton-based models running on iOS devices. No equations, parameter fitting, or derivation chain appear in the provided text. Claims of being the first end-to-end on-device system are presented as an empirical observation rather than a mathematical result derived from prior self-citations or fitted inputs. The central assertions rest on direct latency measurements (e.g., 77 s for a 10 s clip) and qualitative statements about model performance, which do not reduce to the inputs by construction. This is a standard engineering demonstration paper whose content is self-contained against external benchmarks.