OPEN TEACH: A versatile teleoperation system for robotic manipulation. arXiv preprint arXiv:2403.07870
13 Pith papers cite this work. Polarity classification is still indexing.
roles: background 3
polarities: background 3
representative citing papers
- Mobile UMI: Cross-View Diffusion Policy with Decoupled Kinematics for Mobile Manipulation
  A hardware-free dual-camera capture framework with ChArUco spatial unification and receding-horizon state alignment enables decoupled SE(3) manipulation and SE(2) base trajectories for diffusion policies, yielding 83.8% average success on four long-horizon household tasks.
- TouchGuide: Inference-Time Steering of Visuomotor Policies via Touch Guidance
  TouchGuide improves contact-rich robot manipulation by steering diffusion or flow-matching visuomotor policies with tactile feasibility scores from a contrastively trained Contact Physical Model.
- DexTwist: Dexterous Hand Retargeting for Twist Motion via Mixed Reality-based Teleoperation
  DexTwist detects tripod pinches, estimates the intended screw axis and twist magnitude, then applies real-time joint refinement to track turning progress while stabilizing the robot's tripod geometry.
- FingerViP: Learning Real-World Dexterous Manipulation with Fingertip Visual Perception
  FingerViP equips each finger with a miniature camera and trains a multi-view diffusion policy that achieves 80.8% success on real-world dexterous tasks previously limited by wrist-camera occlusion.
- WARPED: Wrist-Aligned Rendering for Robot Policy Learning from Egocentric Human Demonstrations
  WARPED synthesizes realistic wrist-view observations from monocular egocentric human videos via foundation models, hand-object tracking, retargeting, and Gaussian Splatting to train visuomotor policies that match teleoperation success rates on five tabletop tasks with 5-8x less collection effort.
- TeleGate: Whole-Body Humanoid Teleoperation via Gated Expert Selection with Motion Prior
  TeleGate achieves high-precision real-time whole-body teleoperation of humanoid robots by dynamically gating between expert policies and using a VAE motion prior to infer future intent from history, outperforming distillation baselines on dynamic motions with only 2.5 hours of mocap data.
- EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos
  EgoVLA pretrains VLA models on egocentric human videos, retargets predicted actions to robots via IK, and fine-tunes on few robot demos to improve bimanual manipulation performance on a new simulation benchmark.
- TeleSim: A Network-Aware Testbed and Benchmark Dataset for Telerobotic Applications
  TeleSim is a publicly shared network-aware testbed and benchmark dataset that records completion time, success rate, PSNR, SSIM, and QoS metrics across 300 trials under high, medium, and low network conditions simulated in OMNeT++.
- DexWild: Dexterous Human Interactions for In-the-Wild Robot Policies
  DexWild co-trains dexterous robot policies on in-the-wild human hand interactions recorded with a low-cost system and limited robot data, achieving 68.5% success in unseen environments and 5.8x better cross-embodiment generalization.
- Adaptor: Advancing Assistive Teleoperation with Few-Shot Learning and Cross-Operator Generalization
  Adaptor uses few-shot learning with trajectory perturbation and vision-language conditioning to achieve robust cross-operator intent recognition and higher success rates in assistive teleoperation.
- A Multi-View 3D Telepresence System for XR Robot Teleoperation
  A multi-view point cloud VR system with wrist RGB detail outperforms RGB streams and stereo views in robot teleoperation tasks per a 31-participant user study.
- HandelBot: Real-World Piano Playing via Fast Adaptation of Dexterous Robot Policies
  HandelBot refines simulation policies via physical rollouts and residual RL to achieve precise bimanual piano playing, outperforming direct sim transfer by 1.8x with only 30 minutes of real data across five songs.
- A Multimodal Data Collection Framework for Dialogue-Driven Assistive Robotics to Clarify Ambiguities: A Wizard-of-Oz Pilot Study
  A two-room Wizard-of-Oz pilot collected 53 multimodal trials from five users to capture dialogue ambiguities for training ambiguity-aware assistive robot controllers.