DexJoCo is a benchmark and toolkit with 11 functionally grounded tasks, 1.1K trajectories, and empirical benchmarks for task-oriented dexterous manipulation on MuJoCo.
Karen Liu
Canonical reference. 88% of citing Pith papers cite this work as background.
citing papers explorer
- DexJoCo: A Benchmark and Toolkit for Task-Oriented Dexterous Manipulation on MuJoCo
  DexJoCo is a benchmark and toolkit with 11 functionally grounded tasks, 1.1K trajectories, and empirical benchmarks for task-oriented dexterous manipulation on MuJoCo.
- Hand-in-the-Loop: Improving VLA Policies for Dexterous Manipulation via Seamless Hand-Arm Intervention
  HandITL enables seamless human intervention in VLA policies for bimanual dexterous manipulation, cutting jitter by 99.8% and improving refined policies by 19% over standard teleoperation.
- StereoPolicy: Improving Robotic Manipulation Policies via Stereo Perception
  StereoPolicy fuses stereo image pairs via a Stereo Transformer on pretrained 2D encoders to boost robotic manipulation policies, showing gains over monocular, RGB-D, point cloud, and multi-view methods in simulations and real-robot tests.
- FingerViP: Learning Real-World Dexterous Manipulation with Fingertip Visual Perception
  FingerViP equips each finger with a miniature camera and trains a multi-view diffusion policy that achieves 80.8% success on real-world dexterous tasks previously limited by wrist-camera occlusion.
- DEX-Mouse: A Low-cost Portable and Universal Interface with Force Feedback for Data Collection of Dexterous Robotic Hands
  DEX-Mouse is a portable, calibration-free teleoperation interface under $150 with kinesthetic force feedback that supports mounting the robot hand on the operator's forearm for aligned data collection, achieving 86.67% task completion and lower perceived workload than separated setups.
- ActiveGlasses: Learning Manipulation with Active Vision from Ego-centric Human Demonstration
  ActiveGlasses learns robot manipulation from ego-centric human demos captured with active vision via smart glasses, achieving zero-shot transfer using object-centric point-cloud policies.
- TeleGate: Whole-Body Humanoid Teleoperation via Gated Expert Selection with Motion Prior
  TeleGate achieves high-precision real-time whole-body teleoperation of humanoid robots by dynamically gating between expert policies and using a VAE motion prior to infer future intent from history, outperforming distillation baselines on dynamic motions with only 2.5 hours of mocap data.
- Uni-Hand: Universal Hand Motion Forecasting in Egocentric Views
  Uni-Hand forecasts 2D/3D hand waypoints, head motion, and contact states in egocentric views using vision-language fusion and dual-branch diffusion, with new benchmarks for downstream robotics and action tasks.
- Language Conditioned Multi-Finger Dexterous Manipulation Enabled by Physical Compliance and Switching of Controllers
  A hybrid event-driven switching system pairs VLA models with lightweight dexterous policies on a compliant anthropomorphic hand to perform language-conditioned multi-finger tasks with cross-embodiment modularity.
- 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations
  DP3 uses compact 3D representations from sparse point clouds inside diffusion policies to learn generalizable visuomotor skills from few demonstrations, reporting 24% gains in simulation and 85% success on real robots.
- FlexiTac: A Low-Cost, Open-Source, Scalable Tactile Sensing Solution for Robotic Systems
  FlexiTac is a scalable piezoresistive tactile sensing system with flexible FPC-Velostat-FPC pads and a 100 Hz multi-channel readout board that mounts on rigid or soft grippers and supports visuo-tactile learning.
- Towards Robotic Dexterous Hand Intelligence: A Survey
  A structured survey of dexterous robotic hand research that reviews hardware, control methods, data resources, and benchmarks while identifying major limitations and future directions.
- World Action Models: The Next Frontier in Embodied AI
  The paper introduces World Action Models as a new paradigm unifying predictive world modeling with action generation in embodied foundation models and provides a taxonomy of existing approaches.
- Robotic Affection -- Opportunities of AI-based haptic interactions to improve social robotic touch through a multi-deep-learning approach
  A position paper proposes decomposing affective robotic touch into multiple specialized deep learning models for better social human-robot interaction.
- Vision-Language-Action in Robotics: A Survey of Datasets, Benchmarks, and Data Engines
  A survey of VLA robotics research identifies data infrastructure as the primary bottleneck and distills four open challenges in representation alignment, multimodal supervision, reasoning assessment, and scalable data generation.