JointHOI jointly generates hand-object motion and distance-based contact maps in one diffusion stage to improve temporal stability and physical plausibility over prior multi-stage HOI methods.
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
TCA-Captioner introduces an Observer-Checker-Corrector refinement loop and TCA-Bench to address modality detachment and temporal incoherence in audiovisual video captioning.
citing papers explorer
- JointHOI: Jointly Generating Contact Maps Enhances Hand Object Interaction Generation
  JointHOI jointly generates hand-object motion and distance-based contact maps in one diffusion stage to improve temporal stability and physical plausibility over prior multi-stage HOI methods.
- Temporal and Cross-Modal Alignment for Enhanced Audiovisual Video Captioning
  TCA-Captioner introduces an Observer-Checker-Corrector refinement loop and TCA-Bench to address modality detachment and temporal incoherence in audiovisual video captioning.