pith. machine review for the scientific record.

arxiv: 1907.06023 · v1 · pith:QZH2T32Rnew · submitted 2019-07-13 · 💻 cs.CV

Structure-Aware Residual Pyramid Network for Monocular Depth Estimation

classification 💻 cs.CV
keywords structure · depth · residual · scene · structures · estimation · multi-scale · propose

Monocular depth estimation is an essential task for scene understanding. The underlying structure of objects and stuff in a complex scene is critical to recovering accurate and visually pleasing depth maps. Global structure conveys scene layouts, while local structure reflects shape details. Recently developed approaches based on convolutional neural networks (CNNs) have significantly improved the performance of depth estimation. However, few of them take into account the multi-scale structures in complex scenes. In this paper, we propose a Structure-Aware Residual Pyramid Network (SARPN) to exploit multi-scale structures for accurate depth prediction. We propose a Residual Pyramid Decoder (RPD) that expresses global scene structure in its upper levels to represent layouts, and local structure in its lower levels to represent shape details. At each level, we propose Residual Refinement Modules (RRM) that predict residual maps to progressively add finer structures to the coarser structure predicted at the level above. To fully exploit multi-scale image features, we introduce an Adaptive Dense Feature Fusion (ADFF) module, which adaptively fuses effective features from all scales to infer the structures at each scale. Experimental results on the challenging NYU-Depth v2 dataset demonstrate that our proposed approach achieves state-of-the-art performance in both qualitative and quantitative evaluation. The code is available at https://github.com/Xt-Chen/SARPN.
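
The coarse-to-fine idea behind the residual pyramid can be sketched in a few lines of plain Python. This is only an illustration of "upsample the coarser depth map, then add a predicted residual", not the authors' implementation: the function names (`upsample2x`, `refine`, `decode_pyramid`), the factor-of-two spacing between levels, and the nearest-neighbour upsampling are all assumptions made for the example, and the residual maps are hard-coded here rather than predicted by a network.

```python
from typing import List

Grid = List[List[float]]  # a depth map or residual map as nested lists


def upsample2x(depth: Grid) -> Grid:
    """Nearest-neighbour 2x upsampling (assumed; other interpolation would also work)."""
    out: Grid = []
    for row in depth:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))  # repeat the widened row vertically
    return out


def refine(coarse: Grid, residual: Grid) -> Grid:
    """Add a residual map to the upsampled coarse depth map."""
    up = upsample2x(coarse)
    if len(up) != len(residual) or len(up[0]) != len(residual[0]):
        raise ValueError("residual must have twice the coarse resolution")
    return [[u + r for u, r in zip(urow, rrow)] for urow, rrow in zip(up, residual)]


def decode_pyramid(coarsest: Grid, residuals: List[Grid]) -> List[Grid]:
    """Return the depth map at every level, from coarsest to finest."""
    levels = [coarsest]
    for res in residuals:
        levels.append(refine(levels[-1], res))
    return levels


# Hypothetical toy input: a 1x1 coarse layout plus one 2x2 residual of finer detail.
coarsest = [[2.0]]
residuals = [[[0.5, -0.25], [0.0, 0.25]]]
levels = decode_pyramid(coarsest, residuals)
print(levels[-1])  # [[2.5, 1.75], [2.0, 2.25]]
```

In this toy run the coarsest level carries the overall layout (a single depth of 2.0), and the residual only has to encode the small deviations that appear at the finer resolution.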

This paper has not been read by Pith yet.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Efficient Test-Time Optimization for Depth Completion via Low-Rank Decoder Adaptation

    cs.CV · 2026-03 · unverdicted · novelty 7.0

    Low-rank decoder adaptation enables efficient test-time optimization for zero-shot depth completion by updating only the subspace containing depth-relevant information.