pith. machine review for the scientific record.

arxiv: 1809.10790 · v1 · submitted 2018-09-27 · 💻 cs.RO

Recognition: unknown

Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects

Authors on Pith: no claims yet
classification: 💻 cs.RO
keywords: data, synthetic, network, deep, estimation, object, objects, pose
Abstract

Using synthetic data for training deep neural networks for robotic manipulation holds the promise of an almost unlimited amount of pre-labeled training data, generated safely out of harm's way. One of the key challenges of synthetic data, to date, has been to bridge the so-called reality gap, so that networks trained on synthetic data operate correctly when exposed to real-world data. We explore the reality gap in the context of 6-DoF pose estimation of known objects from a single RGB image. We show that for this problem the reality gap can be successfully spanned by a simple combination of domain randomized and photorealistic data. Using synthetic data generated in this manner, we introduce a one-shot deep neural network that is able to perform competitively against a state-of-the-art network trained on a combination of real and synthetic data. To our knowledge, this is the first deep network trained only on synthetic data that is able to achieve state-of-the-art performance on 6-DoF object pose estimation. Our network also generalizes better to novel environments including extreme lighting conditions, for which we show qualitative results. Using this network we demonstrate a real-time system estimating object poses with sufficient accuracy for real-world semantic grasping of known household objects in clutter by a real robot.
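The pose-estimation pipeline the abstract describes (this is the DOPE system) typically works in two stages: a network predicts the 2D image locations of the object's projected 3D bounding-box corners, and a PnP-style solver then recovers the full 6-DoF pose from those 2D-3D correspondences. Below is a minimal numpy sketch of that second, geometric step, assuming noise-free keypoints for the 8 corners of a 10 cm cuboid and a known camera matrix; a Direct Linear Transform stands in for the robust PnP solver a real system would use, and all names and values here are illustrative, not from the paper.

```python
import numpy as np

def project(K, R, t, pts3d):
    """Project 3D object points into the image with intrinsics K and pose (R, t)."""
    cam = (R @ pts3d.T).T + t            # points in the camera frame, shape (N, 3)
    uv = (K @ cam.T).T                   # homogeneous pixel coordinates
    return uv[:, :2] / uv[:, 2:3]

def pose_from_keypoints(K, pts3d, pts2d):
    """Recover (R, t) from 2D-3D correspondences via DLT (a stand-in for PnP)."""
    rows = []
    for (X, Y, Z), (u, v) in zip(pts3d, pts2d):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    A = np.asarray(rows)
    P = np.linalg.svd(A)[2][-1].reshape(3, 4)   # null-space solution, P ~ K [R | t]
    M = np.linalg.inv(K) @ P                    # M ~ s [R | t] up to scale s
    s = np.mean(np.linalg.norm(M[:, :3], axis=0))
    M /= s
    if M[2, 3] < 0:                             # object must lie in front of the camera
        M = -M
    U, _, Vt = np.linalg.svd(M[:, :3])          # orthogonalize the rotation block
    R = U @ Vt
    if np.linalg.det(R) < 0:
        R = -R
    return R, M[:, 3]

# Synthetic check: 8 cuboid corners, a known pose, noise-free "predicted" keypoints.
K = np.array([[600., 0., 320.], [0., 600., 240.], [0., 0., 1.]])
corners = np.array([[x, y, z] for x in (-0.05, 0.05)
                              for y in (-0.05, 0.05)
                              for z in (-0.05, 0.05)])
a = 0.3                                          # rotation about the camera z-axis
R_true = np.array([[np.cos(a), -np.sin(a), 0.],
                   [np.sin(a),  np.cos(a), 0.],
                   [0., 0., 1.]])
t_true = np.array([0.10, -0.05, 1.00])           # metres in front of the camera
kp2d = project(K, R_true, t_true, corners)
R_est, t_est = pose_from_keypoints(K, corners, kp2d)
```

With exact correspondences the DLT recovers the pose essentially perfectly; in practice the network's keypoint belief maps are noisy, which is why production systems use a robust PnP solver (e.g. with RANSAC) instead.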

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation

    cs.RO 2024-09 conditional novelty 7.0

    ReKep encodes robotic tasks as optimizable Python functions over 3D keypoints that are generated automatically from language and RGB-D input, enabling real-time hierarchical planning on single- and dual-arm platforms ...

  2. OMNI-PoseX: A Fast Vision Model for 6D Object Pose Estimation in Embodied Tasks

    cs.RO 2026-04 unverdicted novelty 5.0

    OMNI-PoseX presents a unified vision model using open-vocabulary perception and SO(3)-aware reflected flow matching to deliver state-of-the-art 6D pose estimation with real-time performance for embodied tasks.