Flowing from reasoning to motion: Learning 3D hand trajectory prediction from egocentric human interaction videos
3 Pith papers cite this work. Polarity classification is still indexing.
citation summary
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: background (1)
polarities: background (1)
representative citing papers
MoT-HRA learns embodiment-agnostic human-intention priors from a curated 2.2M-episode human video dataset via a three-expert hierarchical vision-language-action model to improve robotic manipulation under distribution shift.
Donk is a unified video-action denoising model that generates dexterous hand trajectories and videos under language, image, and state conditioning while also serving as a text-conditioned data engine.
citing papers explorer
- BORA: Bridging Offline Reinforcement Learning and Online Residual Adaptation for Real-World Dexterous VLA Models
BORA combines offline RL critic training with online chunk-wise residual adaptation to raise average success rates of real-world dexterous VLA policies by 33% and up to 43% on unseen objects across five tasks.
- Learning Human-Intention Priors from Large-Scale Human Demonstrations for Robotic Manipulation
MoT-HRA learns embodiment-agnostic human-intention priors from a curated 2.2M-episode human video dataset via a three-expert hierarchical vision-language-action model to improve robotic manipulation under distribution shift.
- Unified Video-Action Joint Denoising for Dexterous Action and Data Generation
Donk is a unified video-action denoising model that generates dexterous hand trajectories and videos under language, image, and state conditioning while also serving as a text-conditioned data engine.