
arxiv: 2103.15595 · v2 · pith:PX6MJYREnew · submitted 2021-03-29 · 💻 cs.CV

MVSNeRF: Fast Generalizable Radiance Field Reconstruction from Multi-View Stereo

classification 💻 cs.CV
keywords radiance · neural · field · reconstruction · approach · fast · fields · images

We present MVSNeRF, a novel neural rendering approach that can efficiently reconstruct neural radiance fields for view synthesis. Unlike prior works on neural radiance fields that consider per-scene optimization on densely captured images, we propose a generic deep neural network that can reconstruct radiance fields from only three nearby input views via fast network inference. Our approach leverages plane-swept cost volumes (widely used in multi-view stereo) for geometry-aware scene reasoning, and combines this with physically based volume rendering for neural radiance field reconstruction. We train our network on real objects in the DTU dataset, and test it on three different datasets to evaluate its effectiveness and generalizability. Our approach can generalize across scenes (even indoor scenes, completely different from our training scenes of objects) and generate realistic view synthesis results using only three input images, significantly outperforming concurrent works on generalizable radiance field reconstruction. Moreover, if dense images are captured, our estimated radiance field representation can be easily fine-tuned; this leads to fast per-scene reconstruction with higher rendering quality and substantially less optimization time than NeRF.
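The volume rendering step mentioned in the abstract can be illustrated with the standard quadrature used for neural radiance fields: each sample along a ray contributes its color weighted by its opacity and by the transmittance accumulated in front of it. The sketch below is a minimal, pure-Python illustration of that compositing rule for a single ray. It is not the authors' implementation; the function name `composite_ray` and its arguments are hypothetical, and the densities and colors are assumed to come from some upstream network.

```python
import math

def composite_ray(sigmas, colors, deltas):
    """Composite samples along one ray, front to back.

    sigmas: volume densities at each sample (assumed network outputs)
    colors: (r, g, b) tuples at each sample (assumed network outputs)
    deltas: distances between consecutive samples along the ray
    Returns the rendered RGB value and the per-sample weights.
    """
    transmittance = 1.0  # fraction of light not yet absorbed
    rgb = [0.0, 0.0, 0.0]
    weights = []
    for sigma, color, delta in zip(sigmas, colors, deltas):
        # Opacity of this ray segment.
        alpha = 1.0 - math.exp(-sigma * delta)
        # Contribution = light that reaches the sample times its opacity.
        weight = transmittance * alpha
        weights.append(weight)
        for k in range(3):
            rgb[k] += weight * color[k]
        # Attenuate the light that continues past this sample.
        transmittance *= 1.0 - alpha
    return rgb, weights

# Usage: an empty sample followed by a very dense green sample.
rgb, weights = composite_ray(
    sigmas=[0.0, 1e9],
    colors=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    deltas=[1.0, 1.0],
)
```

In this example the first sample has zero density and contributes nothing, so the rendered color is determined by the opaque second sample. The weights never sum to more than one, which is what lets the same rule double as a soft depth estimate.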



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Cross-View Splatter: Feed-Forward View Synthesis with Georeferenced Images

    cs.CV 2026-05 unverdicted novelty 6.0

    A feed-forward model aligns ground and satellite features to predict Gaussian splats for improved novel-view synthesis on georeferenced outdoor scenes.

  2. PAGE-4D: VGGT-4D Perception via Disentangled Pose and Geometry Estimation

    cs.CV 2025-10 unverdicted novelty 6.0

    PAGE-4D is a feedforward extension of VGGT that uses a dynamics-aware aggregator and mask to disentangle pose estimation from geometry reconstruction in videos with moving objects.