High Quality Monocular Depth Estimation via Transfer Learning

2018 · cs.CV · arXiv 1812.11941

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

Accurate depth estimation from images is a fundamental task in many applications, including scene understanding and reconstruction. Existing solutions for depth estimation often produce blurry, low-resolution approximations. This paper presents a convolutional neural network for computing a high-resolution depth map from a single RGB image with the help of transfer learning. Following a standard encoder-decoder architecture, we initialize our encoder with features extracted by high-performing pre-trained networks and combine this with augmentation and training strategies that lead to more accurate results. We show that, even with a very simple decoder, our method achieves detailed high-resolution depth maps. Our network, with fewer parameters and training iterations, outperforms the state of the art on two datasets and also produces qualitatively better results that capture object boundaries more faithfully. Code and corresponding pre-trained weights are made publicly available.

representative citing papers

Finite Scalar Quantization: VQ-VAE Made Simple

cs.CV · 2023-09-27 · conditional · novelty 7.0

Finite scalar quantization simplifies VQ-VAE latents by independently rounding a few dimensions to fixed levels, producing an equivalent-sized implicit codebook with competitive performance and no collapse.
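
The rounding step summarized above can be illustrated with a minimal sketch. It assumes a tanh bounding function and odd level counts only; the function names (`fsq_quantize`, `codebook_size`, `code_to_index`) are hypothetical and chosen for illustration, and the cited paper's exact formulation may differ.

```python
import math


def fsq_quantize(z, levels):
    """Round each dimension of z independently to one of levels[i] integers.

    Assumption: every entry of `levels` is odd, so the codes are the
    integers from -(L // 2) to +(L // 2) for a dimension with L levels.
    """
    if len(z) != len(levels):
        raise ValueError("z and levels must have the same length")
    codes = []
    for value, num_levels in zip(z, levels):
        if num_levels % 2 == 0:
            raise ValueError("this sketch only handles odd level counts")
        half = num_levels // 2
        bounded = half * math.tanh(value)  # squash into (-half, half)
        codes.append(round(bounded))       # snap to the nearest integer level
    return codes


def codebook_size(levels):
    """Size of the implicit codebook: the product of the per-dimension levels."""
    return math.prod(levels)


def code_to_index(codes, levels):
    """Map a tuple of per-dimension codes to a single mixed-radix index."""
    index = 0
    for code, num_levels in zip(codes, levels):
        index = index * num_levels + (code + num_levels // 2)
    return index


if __name__ == "__main__":
    levels = [3, 5, 5]
    codes = fsq_quantize([0.0, 10.0, -10.0], levels)
    print(codes, code_to_index(codes, levels), codebook_size(levels))
```

No codebook is stored or learned in this sketch: the set of reachable codes is fixed by `levels`, which is why the codebook is called implicit.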

Real-time Vision-based Depth Reconstruction with NVidia Jetson

cs.CV · 2019-07-16 · unverdicted · novelty 3.0

A comparison of FCNN architectures for monocular depth estimation yields a model suitable for real-time operation on NVidia Jetson hardware with evaluation in vSLAM.

citing papers explorer

Showing 2 of 2 citing papers.

Finite Scalar Quantization: VQ-VAE Made Simple cs.CV · 2023-09-27 · conditional · none · ref 3 · internal anchor
Real-time Vision-based Depth Reconstruction with NVidia Jetson cs.CV · 2019-07-16 · unverdicted · none · ref 16 · internal anchor
