arxiv: 1807.05705 · v2 · pith:46O7IUZ4new · submitted 2018-07-16 · 💻 cs.CV

ENG: End-to-end Neural Geometry for Robust Depth and Pose Estimation using CNNs

classification 💻 cs.CV
keywords motion, pose, depth, prediction, structure, camera, cnns, end-to-end

Recovering structure and motion parameters given an image pair or a sequence of images is a well-studied problem in computer vision. This is often achieved by employing Structure from Motion (SfM) or Simultaneous Localization and Mapping (SLAM) algorithms, depending on the real-time requirements. Recently, with the advent of Convolutional Neural Networks (CNNs), researchers have explored the possibility of using machine learning techniques to reconstruct the 3D structure of a scene and jointly predict the camera pose. In this work, we present a framework that achieves state-of-the-art performance on single-image depth prediction for both indoor and outdoor scenes. The depth prediction system is then extended to predict optical flow and ultimately the camera pose, and is trained end-to-end. Our motion estimation framework outperforms previous motion prediction systems, and we also demonstrate that the state-of-the-art metric depths can be further improved using knowledge of the pose.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Scene Motion Decomposition for Learnable Visual Odometry

    cs.CV 2019-07 unverdicted novelty 5.0

    Decomposing scene motion into per-point 6DoF motion maps from optical flow and depth enables a neural network to estimate camera motion more accurately than stacking raw inputs.