CoRR , volume =

Rui Li, Biao Zhang, Zhenyu Li, Federico Tombari, Peter Wonka , title = · 2025 · cs.CV · arXiv 2504.18424

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

open full Pith review browse 3 citing papers arXiv PDF

abstract

We present Layered Ray Intersections (LaRI), a fully supervised method for occluded geometry reasoning from a single image. Unlike conventional depth estimation, which is limited to visible surfaces, LaRI predicts multiple surfaces intersected by the camera rays using layered point maps. Compared to the existing approaches that leverage neural implicit representations or iterative refinement, LaRI achieves complete scene reconstruction in one feed-forward pass, enabling efficient and view-aligned geometric reasoning to underpin both object-level and scene-level tasks. We further propose to predict the ray stopping index, which identifies valid intersecting pixels and layers from LaRI's output. To better underpin and evaluate this task, we build an annotation pipeline using rendering engines, construct annotations for five public datasets, including synthetic and real-world data covering 3D objects and scenes. As a generic method, LaRI's performance is validated in object-level and scene-level reconstruction tasks.

representative citing papers

World Tracing: Generative Pixel-Aligned Geometry Beyond the Visible

cs.CV · 2026-06-11 · unverdicted · novelty 7.0

World Tracing introduces a multi-layer pixel-aligned 3D point representation instantiated via a diffusion transformer (WT-DiT) trained with pixel-space flow matching to jointly reconstruct visible surfaces and generate occluded geometry.

Pixal3D: Pixel-Aligned 3D Generation from Images

cs.CV · 2026-05-11 · unverdicted · novelty 6.0

Pixal3D performs pixel-aligned 3D generation from images via back-projected multi-scale feature volumes, achieving fidelity close to reconstruction while supporting multi-view and scene synthesis.

VolFill: Single-View Amodal 3D Scene Reconstruction with Volumetric Flow Matching

cs.CV · 2026-05-29 · unverdicted · novelty 5.0

VolFill uses a hybrid 3D VAE to compress sparse truncated unsigned distance function grids into latent space and a latent Diffusion Transformer to denoise complete scenes, conditioned on geometry foundation models, outperforming baselines on SCRREAM and NRGB-D datasets.

citing papers explorer

Showing 3 of 3 citing papers after filters.

World Tracing: Generative Pixel-Aligned Geometry Beyond the Visible cs.CV · 2026-06-11 · unverdicted · none · ref 26 · internal anchor
World Tracing introduces a multi-layer pixel-aligned 3D point representation instantiated via a diffusion transformer (WT-DiT) trained with pixel-space flow matching to jointly reconstruct visible surfaces and generate occluded geometry.
Pixal3D: Pixel-Aligned 3D Generation from Images cs.CV · 2026-05-11 · unverdicted · none · ref 51 · internal anchor
Pixal3D performs pixel-aligned 3D generation from images via back-projected multi-scale feature volumes, achieving fidelity close to reconstruction while supporting multi-view and scene synthesis.
VolFill: Single-View Amodal 3D Scene Reconstruction with Volumetric Flow Matching cs.CV · 2026-05-29 · unverdicted · none · ref 41 · internal anchor
VolFill uses a hybrid 3D VAE to compress sparse truncated unsigned distance function grids into latent space and a latent Diffusion Transformer to denoise complete scenes, conditioned on geometry foundation models, outperforming baselines on SCRREAM and NRGB-D datasets.

CoRR , volume =

fields

years

verdicts

representative citing papers

citing papers explorer