Impact of Hand Impairment and Occlusions on Hand Pose Estimation Accuracy in Augmented Reality Applications
Pith reviewed 2026-06-27 02:03 UTC · model grok-4.3
The pith
Hand pose estimation accuracy does not differ between people with cervical spinal cord injury and uninjured controls.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The study found that 3D joint predictions from the HoloLens 2 and from WiLoR, HaMeR, WildHands, and MediaPipe showed no accuracy difference between the cSCI group and uninjured controls. Clear objects produced a 0.1 mm advantage over opaque objects, and WiLoR and HaMeR outperformed the HoloLens 2 by about 2 mm. Ground truth came from triangulation across a multi-camera setup while participants interacted with real objects.
What carries the argument
Direct comparison of 3D hand joint predictions from the HoloLens 2 and four pose estimation algorithms against multi-camera triangulated ground truth during real-object interactions.
If this is right
- The HoloLens 2 can support hand rehabilitation applications that involve real-object interactions.
- Existing pose estimation algorithms generalize to populations with hand impairment from cSCI.
- The collected dataset can be used to improve future algorithms for impaired-hand tracking.
- The small accuracy advantage of clear over opaque objects (0.1 mm) suggests that occlusion by real objects has limited impact.
Where Pith is reading between the lines
- Rehabilitation apps could safely incorporate physical objects without major loss of tracking reliability.
- Similar accuracy testing could be extended to other hand-impairment causes such as stroke or arthritis.
- Individual calibration may still be needed even if group-level accuracy is preserved.
Load-bearing premise
Ground truth estimates of 3D joint positions generated via triangulation from a multi-camera setup are sufficiently accurate to serve as the reference for all comparisons.
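To make the premise concrete, here is a minimal sketch of one common way to triangulate a joint from several calibrated cameras: find the 3D point closest, in the least-squares sense, to all back-projected camera rays. This page does not say which triangulation method the paper used, so the approach, the function names (`triangulate_rays`, `solve_3x3`), and the assumption that each camera already provides a ray origin and direction in a shared world frame are illustrative only.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def solve_3x3(a: List[List[float]], b: List[float]) -> Vec3:
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("rays are (nearly) parallel; the point is not determined")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return (m[0][3] / m[0][0], m[1][3] / m[1][1], m[2][3] / m[2][2])


def triangulate_rays(origins: Sequence[Vec3], directions: Sequence[Vec3]) -> Vec3:
    """Return the point minimizing the summed squared distance to all camera rays.

    Hypothetical helper: origins are camera centers and directions are the
    back-projected viewing rays of one detected joint, all in one world frame.
    """
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for o, d in zip(origins, directions):
        norm = sum(c * c for c in d) ** 0.5
        u = [c / norm for c in d]  # unit ray direction
        for i in range(3):
            for j in range(3):
                # (I - u u^T) projects onto the plane orthogonal to the ray
                p = (1.0 if i == j else 0.0) - u[i] * u[j]
                a[i][j] += p
                b[i] += p * o[j]
    return solve_3x3(a, b)
```

With noise-free rays the estimate coincides with the true point. With noisy 2D detections, the residual distance to each ray gives a rough indication of how trustworthy a given ground-truth joint is.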
What would settle it
If an independent measurement of the same hand poses with a different sensor technology, such as electromagnetic markers, showed large systematic discrepancies from the multi-camera triangulation, that would falsify the accuracy claims.
Original abstract
Mixed reality applications can be designed for hand rehabilitation. Augmented reality (AR) head mounted displays (HMDs) specifically allow for ecologically valid tasks because individuals can see their real environment and interact with real objects while receiving additional cues on the HMD. While these applications rely on accurate hand pose estimation, there is a gap in investigating the influence of hand impairment or occlusion from real-object interactions on pose estimation accuracy. Further, comparisons between AR HMD predictions and state-of-the-art pose estimation methods have not been established. The current study assessed pose estimation accuracy of the HoloLens 2 HMD and state-of-the-art pose estimation algorithms (WiLoR, HaMeR, WildHands, and MediaPipe) while individuals with cervical spinal cord injury (cSCI; n = 13, Neurological Level of Injury: C3-C6; American Spinal Injury Association Impairment Scale: A-D) and 15 uninjured controls interacted with clear and opaque objects. Ground truth estimates of 3D joint positions were generated via triangulation from a multi-camera setup. Pose estimation accuracy did not differ between the cSCI and uninjured control groups suggesting that 3D joint predictions from the HoloLens 2 and pose estimation algorithms can generalize to populations with hand impairment. Further, clear objects provided a small accuracy advantage over opaque objects (0.1 mm) and predictions from both WiLoR and HaMeR were slightly more accurate than the HoloLens 2 (2 mm). Overall, these results suggest that the HoloLens 2 may be viable for hand rehabilitation applications and the dataset generated can be used to refine pose estimation methods for hand-impaired populations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reports an empirical comparison of 3D hand pose estimation accuracy for the HoloLens 2 HMD and four algorithms (WiLoR, HaMeR, WildHands, MediaPipe) in cSCI participants (n=13, C3-C6, AIS A-D) versus uninjured controls (n=15). Participants interacted with clear and opaque objects; ground truth 3D joint positions were obtained by multi-camera triangulation. The central claims are that accuracy did not differ between groups (suggesting generalization to hand impairment), that clear objects yielded a 0.1 mm advantage over opaque objects, and that WiLoR and HaMeR outperformed the HoloLens 2 by 2 mm. The authors conclude that the HoloLens 2 may be viable for AR hand rehabilitation and that the resulting dataset can be used to refine pose estimation methods for hand-impaired populations.
Significance. If the no-difference claim is substantiated, the work would be moderately significant for AR rehabilitation applications, providing the first direct evidence that current pose estimators generalize across cSCI-related hand impairment. The dataset of impaired-hand interactions is a concrete strength. However, the reported differences are extremely small, and because the abstract offers neither statistical support nor ground-truth validation, the practical implications remain unclear.
major comments (3)
- [Abstract] The claim that 'Pose estimation accuracy did not differ between the cSCI and uninjured control groups' is presented without statistical tests, p-values, confidence intervals, or per-group variance measures. This directly undermines the generalization conclusion.
- [Abstract, implied Methods] The multi-camera triangulation ground truth is treated as an unbiased reference for all comparisons, yet no per-group reprojection error, localization uncertainty, or visibility statistics are reported. cSCI participants may exhibit atypical postures and reduced visibility that systematically increase triangulation error relative to controls, which would invalidate the null-group-difference result.
- [Abstract] The numerical differences (0.1 mm for object opacity, 2 mm for algorithm vs. HoloLens 2) are stated without effect sizes, practical significance thresholds for AR hand tracking, or participant-level error distributions, making it impossible to judge whether they are meaningful.
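The statistics requested in the major comments can be sketched as follows. This is an illustration only, assuming one mean error value (in mm) per participant. It computes Cohen's d with a pooled standard deviation and a two-sided permutation p-value for the difference in group means. The function names and the choice of a permutation test are assumptions, not the authors' analysis.

```python
import random
import statistics
from typing import Sequence


def cohens_d(a: Sequence[float], b: Sequence[float]) -> float:
    """Standardized mean difference using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_var ** 0.5


def permutation_p_value(
    a: Sequence[float], b: Sequence[float], n_perm: int = 10000, seed: int = 0
) -> float:
    """Two-sided p-value for the difference in group means under label shuffling."""
    rng = random.Random(seed)
    na = len(a)
    observed = abs(sum(a) / na - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        left, right = pooled[:na], pooled[na:]
        if abs(sum(left) / len(left) - sum(right) / len(right)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

A large p-value from such a test does not by itself establish that the groups are equivalent. Supporting a "no difference" claim would additionally require an equivalence bound, for example a practical error threshold for AR hand tracking, and a confidence interval that falls inside it.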
minor comments (1)
- [Abstract] The error metric (e.g., mean per-joint position error) and any participant exclusion criteria are not defined, reducing reproducibility.
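For reference, the metric named in the minor comment, mean per-joint position error (MPJPE), is commonly computed as the average Euclidean distance between matched predicted and reference joints. The sketch below assumes both joint lists are in millimetres, in the same coordinate frame, and in the same joint order. Whether the paper uses this exact metric, or applies any alignment step first, is not stated on this page.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def mpjpe_mm(pred: Sequence[Vec3], truth: Sequence[Vec3]) -> float:
    """Mean Euclidean distance (mm) between matched predicted and reference joints."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("need equal-length, non-empty joint lists")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)
```

Averaging this value per participant first, and only then per group, keeps participants with more recorded frames from dominating the group statistic.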
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below and indicate where revisions will be made to improve clarity and rigor.
- Referee: [Abstract] The claim that 'Pose estimation accuracy did not differ between the cSCI and uninjured control groups' is presented without statistical tests, p-values, confidence intervals, or per-group variance measures. This directly undermines the generalization conclusion.
  Authors: We agree the abstract should include statistical support for the no-difference claim. The full manuscript reports per-group means and standard deviations with overlapping distributions, and we performed group comparisons. We will revise the abstract to explicitly include p-values, confidence intervals, and variance measures.
  Revision: yes
- Referee: [Abstract, implied Methods] The multi-camera triangulation ground truth is treated as an unbiased reference for all comparisons, yet no per-group reprojection error, localization uncertainty, or visibility statistics are reported. cSCI participants may exhibit atypical postures and reduced visibility that systematically increase triangulation error relative to controls, which would invalidate the null-group-difference result.
  Authors: This raises a valid methodological concern. We will add per-group reprojection error, localization uncertainty, and visibility statistics to the Methods and Results sections. These will be computed from the existing multi-camera data to confirm comparability of ground-truth quality across groups.
  Revision: yes
- Referee: [Abstract] The numerical differences (0.1 mm for object opacity, 2 mm for algorithm vs. HoloLens 2) are stated without effect sizes, practical significance thresholds for AR hand tracking, or participant-level error distributions, making it impossible to judge whether they are meaningful.
  Authors: We agree that effect sizes and context for practical significance are needed. We will incorporate effect sizes (e.g., Cohen's d), reference established AR hand-tracking error thresholds from the literature, and include participant-level error distributions in the revised abstract and results.
  Revision: yes
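The per-group reprojection error promised in the rebuttal could be computed along these lines. This is a rough sketch under simplifying assumptions: an ideal pinhole camera without lens distortion, triangulated joints already expressed in that camera's coordinate frame, and hypothetical intrinsics `fx`, `fy`, `cx`, `cy` in pixels. The actual calibration model used in the study is not described on this page.

```python
import math
from typing import Sequence, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]


def project_pinhole(point_cam: Vec3, fx: float, fy: float, cx: float, cy: float) -> Vec2:
    """Project a 3D point in camera coordinates to pixel coordinates."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    return (fx * x / z + cx, fy * y / z + cy)


def mean_reprojection_error_px(
    points_cam: Sequence[Vec3],
    detections_px: Sequence[Vec2],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> float:
    """Mean pixel distance between reprojected 3D joints and their 2D detections."""
    if len(points_cam) != len(detections_px) or not points_cam:
        raise ValueError("need equal-length, non-empty inputs")
    errors = [
        math.dist(project_pinhole(p, fx, fy, cx, cy), d)
        for p, d in zip(points_cam, detections_px)
    ]
    return sum(errors) / len(errors)
```

Computing this value separately for the cSCI and control groups, per camera and per joint, would show whether ground-truth quality is comparable across groups.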
Circularity Check
No circularity: purely empirical comparison with external ground truth
The paper reports an empirical accuracy comparison between hand-pose estimators (HoloLens 2, WiLoR, HaMeR, etc.) on cSCI vs. control participants, using multi-camera triangulation as an independent reference. No equations, fitted parameters, derivations, uniqueness theorems, or self-citations are invoked to support any claim. The central result (no group difference) is a direct statistical observation, not a quantity that reduces to its own inputs by construction. This matches the default case of a self-contained empirical study.