
arxiv: 1709.00930 · v1 · submitted 2017-09-04 · 💻 cs.CV

Recognition: unknown

Self-Supervised Learning for Stereo Matching with Self-Improving Ability

classification 💻 cs.CV
keywords stereo, matching, disparity, many, maps, deep-learning, dense, different

Existing deep-learning-based dense stereo matching methods often rely on ground-truth disparity maps as training signals, which are not always available. In this paper, we design a simple convolutional neural network architecture that learns to compute dense disparity maps directly from stereo inputs. Training is performed end to end without the need for ground-truth disparity maps. The idea is to use the image warping error (instead of disparity-map residuals) as the loss function that drives the learning process, aiming to find a depth map that minimizes the warping error. While this is a simple concept well known in stereo matching, making it work in a deep-learning framework requires overcoming many non-trivial challenges, and in this work we provide effective solutions. Our network is self-adaptive to unseen imagery as well as to different camera settings. Experiments on the KITTI and Middlebury stereo benchmark datasets show that our method outperforms many state-of-the-art stereo matching methods by a margin while also being significantly faster.
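To make the warping-error idea concrete, here is a minimal NumPy sketch. It is not the paper's network or its loss, only an illustration under stated assumptions: the stereo pair is rectified and grayscale, a left pixel at column x matches the right image at column x - d, sampling uses linear interpolation along each row, and the error is a mean absolute difference. The function names `warp_right_to_left` and `warping_loss` are hypothetical.

```python
import numpy as np


def warp_right_to_left(right, disparity):
    """Reconstruct the left view by sampling the right image at x - d(x, y).

    Assumes a rectified grayscale pair of shape (H, W); hypothetical helper.
    """
    h, w = disparity.shape
    # Source column in the right image for every left pixel, kept inside the image.
    xs = np.arange(w, dtype=np.float64)[None, :] - disparity
    xs = np.clip(xs, 0.0, w - 1.0)
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    # Linear interpolation between the two neighbouring columns.
    return (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]


def warping_loss(left, right, disparity):
    """Mean absolute photometric error between the left image and the warped right image."""
    return float(np.mean(np.abs(left - warp_right_to_left(right, disparity))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    right = rng.random((4, 8))
    # Synthetic left view: the right image shifted by 3 pixels (border columns replicated).
    left = np.empty_like(right)
    left[:, 3:] = right[:, :-3]
    left[:, :3] = right[:, :1]
    print("loss at true disparity:", warping_loss(left, right, np.full((4, 8), 3.0)))
    print("loss at zero disparity:", warping_loss(left, right, np.zeros((4, 8))))
```

No ground-truth disparity appears anywhere in the loss: only the two images and the candidate disparity map are needed, which is what allows training without labels. In a learning setting the same sampling would be written with a differentiable operator so gradients reach the disparity prediction.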

This paper has not been read by Pith yet.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. SMFormer: Empowering Self-supervised Stereo Matching via Foundation Models and Data Augmentation

    cs.CV 2026-04 unverdicted novelty 5.0

    SMFormer achieves state-of-the-art self-supervised stereo matching by using vision foundation models for disturbance-resistant features and data augmentation to enforce output consistency, rivaling or exceeding some s...

  2. Geometry Reinforced Efficient Attention Tuning Equipped with Normals for Robust Stereo Matching

    cs.CV 2026-04 unverdicted novelty 5.0

    GREATEN fuses surface normals with image features via gated contextual-geometric fusion and efficient sparse attentions to cut stereo matching errors by up to 30% on real datasets when trained solely on synthetic data.