pith. machine review for the scientific record.

arxiv: 2506.03103 · v2 · submitted 2025-06-03 · 💻 cs.CV

Recognition: unknown

DyTact: Capturing Dynamic Contacts in Hand-Object Manipulation

Authors on Pith: no claims yet
classification: 💻 cs.CV
keywords: dynamic, dytact, contact, hand-object, capture, capturing, complex, contacts
Original abstract

Reconstructing dynamic hand-object contacts is essential for realistic manipulation in AI character animation, XR, and robotics, yet it remains challenging due to heavy occlusions, complex surface details, and limitations in existing capture techniques. In this paper, we introduce DyTact, a markerless capture method for accurately capturing dynamic contact in hand-object manipulations in a non-intrusive manner. Our approach leverages a dynamic, articulated representation based on 2D Gaussian surfels to model complex manipulations. By binding these surfels to MANO meshes, DyTact harnesses the inductive bias of template models to stabilize and accelerate optimization. A refinement module addresses time-dependent high-frequency deformations, while a contact-guided adaptive sampling strategy selectively increases surfel density in contact regions to handle heavy occlusion. Extensive experiments demonstrate that DyTact not only achieves state-of-the-art dynamic contact estimation accuracy but also significantly improves novel view synthesis quality, all while operating with fast optimization and efficient memory usage. Project Page: https://oliver-cong02.github.io/DyTact.github.io/.
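The two core ideas in the abstract — binding surfels to a template mesh so they follow its deformation, and weighting sampling toward contact regions — can be illustrated with a toy numpy sketch. This is an assumption-laden illustration, not the paper's implementation: a single triangle stands in for a MANO mesh face, the function names are invented, and the paper's actual surfel parameterization and sampling strategy will differ in detail.

```python
import numpy as np

def bind_surfels_to_faces(verts, faces, surfel_face, bary, offsets):
    """Place each surfel on its assigned mesh face via barycentric
    coordinates plus an offset along the face normal. Re-running this
    after the mesh deforms makes the surfels track the template."""
    tri = verts[faces[surfel_face]]              # (S, 3, 3) face corners
    base = np.einsum('sij,si->sj', tri, bary)    # barycentric combination
    # unit face normals
    n = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    n /= np.linalg.norm(n, axis=1, keepdims=True)
    return base + offsets[:, None] * n

def contact_weights(hand_pts, obj_pts, tau=0.01):
    """Sampling weight per hand point that grows near the object surface,
    a stand-in for contact-guided adaptive surfel densification."""
    d = np.min(np.linalg.norm(hand_pts[:, None] - obj_pts[None], axis=-1),
               axis=1)                           # distance to closest object point
    w = np.exp(-d / tau)
    return w / w.sum()
```

Sampling more surfels where `contact_weights` is large concentrates representational capacity exactly where occlusion is heaviest, which is the motivation the abstract gives for the adaptive strategy.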

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Grasp in Gaussians: Fast Monocular Reconstruction of Dynamic Hand-Object Interactions

    cs.CV · 2026-04 · unverdicted · novelty 6.0

    GraG reconstructs dynamic 3D hand-object interactions from monocular video 6.4x faster than prior work by using compact Sum-of-Gaussians tracking initialized from large models and refined with 2D losses.