SatSplatDiff: Geometry-preserving generative refinement for high-fidelity satellite Gaussian Splatting

Jiyong Kim; Rongjun Qin; Shuang Song

arxiv: 2606.27223 · v2 · pith:NM5AFOTQnew · submitted 2026-06-25 · 💻 cs.CV

Pith reviewed 2026-06-30 01:04 UTC · model grok-4.3

classification 💻 cs.CV

keywords: Gaussian Splatting, satellite 3D reconstruction, generative refinement, shadow guidance, geometric preservation, DSM initialization, photogrammetry, visual fidelity

The pith

SatSplatDiff uses shadow maps from Gaussian representations to guide generative refinement and reduce geometric degradation in satellite 3D reconstruction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a pipeline that first creates a geometrically regularized surface representation for satellite scenes using DSM initialization, monocular depth supervision, and multi-scale refinement. It then applies shadow-guided generative refinement so that updates to rendered images stay consistent with the underlying geometry instead of introducing independent hallucinations. A sympathetic reader would care because satellite imagery provides only top-down views, leaving facades under-supervised, and a method that adds visual detail without breaking 3D accuracy would improve large-scale mapping reliability.

Core claim

SatSplatDiff minimizes geometric degradation by computing shadow maps from the current Gaussian representation and using them to constrain generative refinement, after first establishing an accurate surface via photogrammetric DSM initialization, monocular depth supervision, and multi-scale geometric refinement on a 2DGS base.

What carries the argument

Shadow-guided generative refinement, in which geometrically calculated shadow maps steer the generative updates to preserve consistency with the underlying surface geometry.
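
As a rough illustration of what a geometrically calculated shadow map is, the sketch below marches a ray toward the sun over a small height grid and marks a cell as shadowed when the surface along the ray rises above it. The paper casts shadows from its 2DGS representation; the height grid used here as a stand-in, the function name `shadow_map`, and the sun-angle convention are all assumptions made for illustration.

```python
import math

def shadow_map(heights, sun_azimuth_deg, sun_elevation_deg, cell_size=1.0):
    """Return a grid of booleans: True where the cell is in shadow.

    heights: list of rows of surface heights (same units as cell_size).
    sun_azimuth_deg: direction toward the sun, clockwise from grid "north"
        (decreasing row index). This convention is an assumption.
    sun_elevation_deg: sun angle above the horizon.
    """
    rows, cols = len(heights), len(heights[0])
    az = math.radians(sun_azimuth_deg)
    # Unit step toward the sun in grid coordinates (row, col).
    d_row, d_col = -math.cos(az), math.sin(az)
    # Height gained by the sun ray per cell of horizontal travel.
    rise = math.tan(math.radians(sun_elevation_deg)) * cell_size

    shadow = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            step = 1
            while True:
                rr = int(round(r + step * d_row))
                cc = int(round(c + step * d_col))
                if not (0 <= rr < rows and 0 <= cc < cols):
                    break  # ray left the grid without being blocked
                if heights[rr][cc] > heights[r][c] + step * rise:
                    shadow[r][c] = True  # an occluder rises above the ray
                    break
                step += 1
    return shadow

# A 10 m "building" in column 2 on flat ground; sun due east at 45 degrees.
dsm = [[0.0, 0.0, 10.0, 0.0, 0.0] for _ in range(3)]
mask = shadow_map(dsm, sun_azimuth_deg=90.0, sun_elevation_deg=45.0)
```

With the sun in the east, the two cells west of the building come out shadowed and the cells east of it stay lit. A mask of this kind depends only on geometry and sun direction, which is why it can act as a constraint that a per-view generative model cannot freely repaint.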

If this is right

Reduces geometric MAE by up to 18 percent on the IARPA2016 and DFC2019 datasets.
Improves visual fidelity measured by FID-CLIP by 28 to 45 percent over existing baselines.
Supports up to 5 times resolution enhancement while keeping sensor-consistent appearance.
Maintains seamless cross-tile consistency and scalability for large-area reconstruction.
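
For readers weighing the first bullet, the snippet below shows how a geometric MAE and a relative reduction are typically computed from surface heights. The height values are made-up toy numbers, not results from the paper, and the helper names are hypothetical.

```python
def mean_absolute_error(predicted, reference):
    """Mean absolute height error between two equally sized height lists."""
    if len(predicted) != len(reference):
        raise ValueError("height lists must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)

def relative_reduction(baseline_error, new_error):
    """Fractional error reduction relative to the baseline (0.18 means 18%)."""
    return (baseline_error - new_error) / baseline_error

# Toy heights in metres (illustrative only).
lidar    = [10.0, 12.0, 15.0, 20.0]
baseline = [11.0, 14.0, 14.0, 22.0]   # absolute errors: 1, 2, 1, 2
refined  = [10.5, 13.0, 14.5, 21.0]   # absolute errors: 0.5, 1, 0.5, 1

mae_baseline = mean_absolute_error(baseline, lidar)
mae_refined = mean_absolute_error(refined, lidar)
reduction = relative_reduction(mae_baseline, mae_refined)
```

Because the reduction is a ratio against the baseline, an "up to 18 percent" figure says nothing about the absolute error in metres; both numbers are needed to judge the gain.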

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The shadow-constraint idea could be tested on other sparse-view reconstruction tasks that currently suffer from generative hallucinations.
The demonstrated cross-tile consistency suggests the method may support city-scale or multi-temporal satellite mapping without additional alignment steps.
If shadow maps prove robust across sensors, the approach might reduce the need for dense ground-truth geometry in future satellite pipelines.

Load-bearing premise

Shadow maps computed from the current Gaussian representation can reliably constrain the generative refinement process without requiring dataset-specific tuning or introducing new inconsistencies.

What would settle it

The claim would be undermined if the shadow-guided refinement step produced higher geometric MAE or visible surface inconsistencies than the non-generative baseline on the IARPA2016 or DFC2019 datasets.

Figures

Figures reproduced from arXiv: 2606.27223 by Jiyong Kim, Rongjun Qin, Shuang Song.

**Figure 2.** Overall architecture of the SatSplatDiff framework. The pipeline achieves geometric accuracy and visual fidelity through three key stages: (1) photogrammetric initialization (section 3) establishes the initial 3D representation through bundle adjustment and DSM generation; (2) geometric optimization (section 4.2) transforms the baseline into a 2DGS representation and refines the geometry, utilizing multi-…

**Figure 3.** Impact of multi-scale geometric refinement on surface solidity, e…

**Figure 4.** Visualization of diffusion refinement combined with shadow casting. The diffusion model enhances texture details while preserving the shape of cast shadows. Since R_diff remains fixed until the diffusion-refined supervision targets are regenerated in the next refinement cycle, the geometry of G_modified is optimized using image supervision aligned with the shadow structure rendered from the current Gaussi…

**Figure 5.** Qualitative visual comparison of synthesized views across JAX, OMA, and IARPA sites. Compared to existing 3DGS-based baselines that often produce…

**Figure 6.** Comparative analysis of structural stability and artifacts. This figure contrasts our framework with Skyfall-GS and Google Earth images. Yellow boxes…

**Figure 7.** Qualitative comparison of reconstructed Digital Surface Models (DSMs) across JAX, OMA, and IARPA sites. Our method consistently generates the most…

**Figure 8.** Comparative qualitative analysis of reconstructed geometry against LiDAR ground truth. Compared to existing GS-based baselines, our framework…

**Figure 9.** Qualitative ablation of our generative refinement process. The comparison between the initial 2DGS output ("w/…

**Figure 10.** Qualitative comparison of satellite image enhancement using di…

**Figure 11.** Qualitative ablation of shadow casting across multiple JAX sites. Shadow casting imposes geometric guidance, helping Gaussians maintain geometric…

**Figure 12.** Initialization sensitivity on JAX-167. Top row: initialization DSMs…

**Figure 13.** MAE_reg over training steps under volumetric initialization, used as a stress test for extreme geometry cases. Monocular depth supervision accelerates early-stage convergence, while the absence of it results in slower and less stable convergence throughout training.

**Figure 14.** Sensitivity analysis on numbers of refined images on pseudo dataset.

**Figure 15.** Application on larger area: seamless mosaics of adjacent JAX sites, including (a) JAX-164, JAX-165, JAX-167 and (b) JAX-214, JAX-260. Each site…

Original abstract

Gaussian Splatting has been recently explored for satellite 3D reconstruction, demonstrating flexibility and efficiency in representing radiometrically diverse satellite scenes. However, the limited top viewpoint of satellite imagery results in insufficient supervision on building facades, leaving surface holes and degraded visual fidelity. Generative refinement, which leverages pretrained generative priors to iteratively refine and update the rendered images used as supervision targets, has recently been investigated to improve the visual fidelity of Gaussian-rendered images. However, since these models refine each view independently, the resulting images can generate hallucinations and break photo-consistency, leading to geometric degradation. To address these limitations, we propose SatSplatDiff, which aims to minimize geometric degradation prevalent in generative refinement. Building on photogrammetric DSM initialization and 2DGS-based shadow casting established in our prior work SatSplat, we first introduce monocular depth supervision and multi-scale geometric refinement to establish a geometrically accurate and well-regularized surface representation. We then apply shadow-guided generative refinement, where geometrically calculated shadow maps guide the Gaussians to maintain consistency with the underlying geometry, improving visual fidelity while reducing geometric degradation. Extensive evaluations on the IARPA2016 and DFC2019 datasets demonstrate state-of-the-art performance, reducing geometric MAE by up to 18% and improving visual fidelity (FID-CLIP) by 28-45% over existing baselines. Our method delivers up to 5x resolution enhancement with minimal hallucination and sensor-consistent appearance, demonstrating seamless cross-tile consistency and strong scalability for large-scale reconstruction. Source code is available at https://github.com/GDAOSU/SatSplatDiff

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Adds monocular depth and shadow-guided diffusion to SatSplat for better facade handling, but the geometry preservation claims rest on thin experimental detail.

The one thing to know is that SatSplatDiff adds monocular depth supervision and multi-scale geometric refinement before using shadow maps to guide a generative diffusion step on top of their earlier SatSplat method. This aims to fix facade holes from top-down satellite views without letting the generative model introduce geometric errors.

They do this by first anchoring with photogrammetric DSM, applying 2DGS shadow casting, then the new supervision steps to regularize the surface. The shadow maps then constrain the diffusion updates so rendered images stay consistent. They report SOTA on IARPA2016 and DFC2019 with up to 18% lower geometric MAE and 28-45% better FID-CLIP, plus 5x resolution and code release.

This is solid for what it is: a practical engineering extension that ships code and uses real datasets. The cross-tile consistency claim is relevant for scaling to cities.

The main soft spot is that the abstract gives no ablations isolating the shadow guidance or the multi-scale part, and no mention of error bars or variance across runs. Without those, the quantitative claims are hard to weigh. The circular-dependency worry is plausible on the surface, because incomplete facades could yield bad shadows that then guide the refinement, but the paper sequences the depth and multi-scale steps first, which might make the initial Gaussians good enough for reliable shadows. Still, the full text needs to demonstrate that the shadows actually constrain the refinement without introducing new inconsistencies.

This paper is for people already working on Gaussian splatting for remote sensing or large-scale 3D reconstruction. It has enough grounding in prior work and reproducible elements to deserve a serious referee, even if revisions will be needed on the experimental details.

I would send it to peer review.

Referee Report

2 major / 1 minor

Summary. The paper introduces SatSplatDiff, which extends 2D Gaussian Splatting (initialized from photogrammetric DSMs) with monocular depth supervision, multi-scale geometric refinement, and shadow-guided generative refinement using pretrained diffusion models. The central claim is that computing shadow maps from the current 2DGS representation and using them to constrain the diffusion process yields state-of-the-art results on IARPA2016 and DFC2019, reducing geometric MAE by up to 18% and improving FID-CLIP by 28-45% while enabling up to 5x resolution enhancement with minimal hallucination and cross-tile consistency. Source code is released.

Significance. If the shadow-constrained refinement demonstrably avoids geometric degradation while leveraging generative priors, the approach would meaningfully advance scalable satellite 3D reconstruction for radiometrically diverse scenes with incomplete facade coverage. The explicit release of code and the grounding in prior SatSplat work are positive for reproducibility and incremental progress.

major comments (2)

[Abstract / Methods (shadow-guided section)] Abstract (shadow-guided generative refinement paragraph) and Methods: The mechanism assumes shadow maps computed from the initial 2DGS (after monocular depth and multi-scale refinement) are sufficiently accurate to constrain diffusion without reinforcing facade holes or hallucinations. No ablation, uncertainty weighting, or iterative shadow-update schedule is described to break the potential circular dependency between incomplete geometry and the guiding shadows.
[Experiments / Results] Evaluations (IARPA2016/DFC2019 results): The reported MAE reductions (up to 18%) and FID-CLIP gains (28-45%) are presented without error bars, per-tile variance, or statistical significance tests. This makes it difficult to determine whether the gains are robust or driven by particular tiles where the initial DSM already provides strong geometry.

minor comments (1)

[Abstract] The abstract states 'sensor-consistent appearance' and 'seamless cross-tile consistency' but does not define the quantitative metric used to support these claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on SatSplatDiff. The comments highlight important considerations regarding the shadow-guided refinement pipeline and the presentation of experimental results. We address each major comment below and outline planned revisions to strengthen the manuscript.

Referee: [Abstract / Methods (shadow-guided section)] Abstract (shadow-guided generative refinement paragraph) and Methods: The mechanism assumes shadow maps computed from the initial 2DGS (after monocular depth and multi-scale refinement) are sufficiently accurate to constrain diffusion without reinforcing facade holes or hallucinations. No ablation, uncertainty weighting, or iterative shadow-update schedule is described to break the potential circular dependency between incomplete geometry and the guiding shadows.

Authors: We appreciate the referee's point on potential circular dependencies. As described in Sections 3.2 and 3.3, the pipeline first applies monocular depth supervision and multi-scale geometric refinement to the photogrammetric DSM initialization to produce a well-regularized 2DGS representation before computing shadow maps for the generative stage. This ordering is intended to ensure that the guiding shadows are derived from an already improved geometry, thereby reducing the risk of reinforcing facade holes during diffusion. While the current manuscript does not include dedicated ablations on uncertainty weighting or iterative shadow updates, the sequential design prioritizes geometric fidelity prior to generative refinement. In the revision we will expand the discussion of this design rationale in the Methods section and add a brief analysis of shadow map accuracy on sample tiles.

Revision: partial

Referee: [Experiments / Results] Evaluations (IARPA2016/DFC2019 results): The reported MAE reductions (up to 18%) and FID-CLIP gains (28-45%) are presented without error bars, per-tile variance, or statistical significance tests. This makes it difficult to determine whether the gains are robust or driven by particular tiles where the initial DSM already provides strong geometry.

Authors: We agree that additional statistical reporting would improve the clarity and robustness of the results. The evaluations on IARPA2016 and DFC2019 demonstrate consistent gains across multiple scenes, but the manuscript currently reports only aggregate metrics. In the revised version we will include error bars, per-tile variance, and basic statistical significance tests (e.g., paired t-tests) to better characterize the distribution of improvements and address concerns about tile-specific effects.

Revision: yes
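
The per-tile reporting discussed in this exchange could look like the sketch below: a paired t-statistic over per-tile MAE differences, computed with the standard library only. The tile values are invented placeholders, and `paired_t_statistic` is a hypothetical helper; a real analysis would also look up the p-value for n - 1 degrees of freedom.

```python
import math
import statistics

def paired_t_statistic(baseline, method):
    """Paired t-statistic and degrees of freedom for per-tile errors.

    Positive t means the method has lower error than the baseline on average.
    """
    if len(baseline) != len(method) or len(baseline) < 2:
        raise ValueError("need two equally sized samples with at least 2 tiles")
    diffs = [b - m for b, m in zip(baseline, method)]
    mean_diff = statistics.mean(diffs)
    std_diff = statistics.stdev(diffs)  # sample standard deviation
    if std_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (std_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1

# Placeholder per-tile MAE values in metres (not from the paper).
baseline_mae = [2.0, 3.0, 2.5, 4.0]
method_mae = [1.5, 2.0, 2.5, 3.0]

t_value, dof = paired_t_statistic(baseline_mae, method_mae)
```

Pairing by tile matters here because tiles differ widely in difficulty; comparing the two methods on the same tile removes that between-tile spread from the test.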

Circularity Check

0 steps flagged

Minor self-citation on SatSplat shadow casting; core claims rest on new supervision and external dataset evaluations

The abstract explicitly builds on 'photogrammetric DSM initialization and 2DGS-based shadow casting established in our prior work SatSplat' for the shadow-guided refinement step. This is a self-citation by overlapping authors, but it is not load-bearing: the paper adds independent components (monocular depth supervision, multi-scale geometric refinement) and reports quantitative gains (MAE reduction up to 18%, FID-CLIP 28-45%) from evaluations on IARPA2016 and DFC2019. No equation, fitted parameter, or derivation reduces the claimed outputs to the inputs by construction, and no uniqueness theorem or ansatz is smuggled via citation. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the effectiveness of shadow-map guidance and the reliability of monocular depth estimates for satellite scenes; these are treated as domain assumptions rather than derived quantities.

axioms (2)

domain assumption: Monocular depth estimates from satellite imagery provide sufficiently accurate supervision for initializing and regularizing a 2DGS surface representation. Invoked in the description of the first stage that establishes geometrically accurate surfaces.
domain assumption: Shadow maps computed from the current geometry can be used to constrain generative image updates without introducing new geometric errors. Central to the shadow-guided generative refinement step.
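
The first axiom is easier to assess with a concrete picture of how monocular depth is commonly used as supervision. Monocular estimators usually predict depth only up to an unknown scale and shift, so a frequent step is to fit those two numbers against a trusted reference before computing a loss. The least-squares fit below is a generic sketch of that step under this assumption; it is not taken from the paper, and `fit_scale_shift` and `aligned_l1_loss` are hypothetical helpers.

```python
def fit_scale_shift(mono, reference):
    """Least-squares scale s and shift t such that s * mono + t fits reference."""
    n = len(mono)
    if n != len(reference) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 values")
    mean_m = sum(mono) / n
    mean_r = sum(reference) / n
    var_m = sum((m - mean_m) ** 2 for m in mono)
    if var_m == 0:
        raise ValueError("monocular depth is constant; scale is undefined")
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(mono, reference))
    scale = cov / var_m
    shift = mean_r - scale * mean_m
    return scale, shift

def aligned_l1_loss(mono, reference):
    """Mean absolute residual after aligning mono depth to the reference."""
    scale, shift = fit_scale_shift(mono, reference)
    return sum(abs(scale * m + shift - r) for m, r in zip(mono, reference)) / len(mono)

# Toy example: the reference is exactly 2 * mono + 5, so the fit recovers it.
mono_depth = [0.0, 1.0, 2.0, 3.0]
reference_height = [5.0, 7.0, 9.0, 11.0]
scale, shift = fit_scale_shift(mono_depth, reference_height)
loss = aligned_l1_loss(mono_depth, reference_height)
```

Whatever residual survives this alignment is the part of the monocular estimate that disagrees with the reference in shape rather than in units, which is the quantity the axiom assumes to be small for satellite scenes.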

Reference graph

Works this paper leans on

146 extracted references · 1 canonical work pages

[1]

Communications of the ACM , volume=

Nerf: Representing scenes as neural radiance fields for view synthesis , author=. Communications of the ACM , volume=. 2021 , publisher=

2021
[2]

, author=

3D Gaussian splatting for real-time radiance field rendering. , author=. ACM Trans. Graph. , volume=
[3]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

Sat-nerf: Learning multi-view satellite photogrammetry with transient objects and shadow modeling using rpc cameras , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
[4]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

Multi-date earth observation nerf: The detail is in the shadows , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
[5]

IGARSS 2024-2024 IEEE International Geoscience and Remote Sensing Symposium , pages=

Sat-ngp: Unleashing neural graphics primitives for fast relightable transient-free 3d reconstruction from satellite imagery , author=. IGARSS 2024-2024 IEEE International Geoscience and Remote Sensing Symposium , pages=. 2024 , organization=

2024
[6]

Proceedings of the Computer Vision and Pattern Recognition Conference , pages=

Gaussian Splatting for Efficient Satellite Image Photogrammetry , author=. Proceedings of the Computer Vision and Pattern Recognition Conference , pages=
[7]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

Shadow neural radiance fields for multi-view satellite photogrammetry , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
[8]

2025 , eprint =

Jie-Ying Lee and Yi-Ruei Liu and Shr-Ruei Tsai and Wei-Cheng Chang and Chung-Ho Wu and Jiewen Chan and Zhenjun Zhao and Chieh Hubert Lin and Yu-Lun Liu , journal =. 2025 , eprint =

2025
[9]

Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=

Flowedit: Inversion-free text-based editing using pre-trained flow models , author=. Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
[10]

International Journal of Remote Sensing , volume=

A review of 3D reconstruction from high-resolution urban satellite images , author=. International Journal of Remote Sensing , volume=. 2023 , publisher=

2023
[11]

Electronics letters , volume=

Scope of validity of PSNR in image/video quality assessment , author=. Electronics letters , volume=. 2008 , publisher=

2008
[13]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Rethinking fid: Towards a better evaluation metric for image generation , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
[14]

International conference on machine learning , pages=

Learning transferable visual models from natural language supervision , author=. International conference on machine learning , pages=. 2021 , organization=

2021
[15]

Advances in neural information processing systems , volume=

Gans trained by a two time-scale update rule converge to a local nash equilibrium , author=. Advances in neural information processing systems , volume=
[16]

arXiv preprint arXiv:1801.01401 , year=

Demystifying mmd gans , author=. arXiv preprint arXiv:1801.01401 , year=

Pith/arXiv arXiv
[17]

IEEE transactions on image processing , volume=

Image quality assessment: from error visibility to structural similarity , author=. IEEE transactions on image processing , volume=. 2004 , publisher=

2004
[18]

Proceedings of the IEEE conference on computer vision and pattern recognition , pages=

The unreasonable effectiveness of deep features as a perceptual metric , author=. Proceedings of the IEEE conference on computer vision and pattern recognition , pages=
[19]

Proceedings of the IEEE conference on computer vision and pattern recognition , pages=

Rethinking the inception architecture for computer vision , author=. Proceedings of the IEEE conference on computer vision and pattern recognition , pages=
[20]

ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences , year=

An automatic and modular stereo pipeline for pushbroom images , author=. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences , year=
[21]

Earth and Space Science , volume=

The Ames Stereo Pipeline: NASA's open source software for deriving and processing terrain data , author=. Earth and Space Science , volume=. 2018 , publisher=

2018
[22]

ACM SIGGRAPH 2024 conference papers , pages=

2d gaussian splatting for geometrically accurate radiance fields , author=. ACM SIGGRAPH 2024 conference papers , pages=

2024
[23]

IEEE transactions on image processing , volume=

Complex wavelet structural similarity: A new image similarity index , author=. IEEE transactions on image processing , volume=. 2009 , publisher=

2009
[24]

IEEE Transactions on Geoscience and Remote Sensing , volume=

WHU-stereo: A challenging benchmark for stereo matching of high-resolution satellite images , author=. IEEE Transactions on Geoscience and Remote Sensing , volume=. 2023 , publisher=

2023
[25]

European Conference on Computer Vision , pages=

Revising densification in gaussian splatting , author=. European Conference on Computer Vision , pages=. 2024 , organization=

2024
[26]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Mip-splatting: Alias-free 3d gaussian splatting , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
[27]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Dngaussian: Optimizing sparse-view 3d gaussian radiance fields with global-local depth normalization , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
[28]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Regnerf: Regularizing neural radiance fields for view synthesis from sparse inputs , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
[29]

European conference on computer vision , pages=

Cor-gs: sparse-view 3d gaussian splatting via co-regularization , author=. European conference on computer vision , pages=. 2024 , organization=

2024
[30]

arXiv preprint arXiv:2309.00277 , year=

Sparsesat-nerf: Dense depth supervised neural radiance fields for sparse satellite images , author=. arXiv preprint arXiv:2309.00277 , year=

arXiv
[31]

European conference on computer vision , pages=

Fsgs: Real-time few-shot view synthesis using gaussian splatting , author=. European conference on computer vision , pages=. 2024 , organization=

2024
[32]

ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences , volume=

Rpc stereo processor (rsp)--a software package for digital surface model and orthophoto generation from satellite stereo imagery , author=. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences , volume=. 2016 , publisher=

2016
[33]

International journal of computer vision , volume=

Distinctive image features from scale-invariant keypoints , author=. International journal of computer vision , volume=. 2004 , publisher=

2004
[34]

IEEE Trans

Fast explicit diffusion for accelerated features in nonlinear scale spaces , author=. IEEE Trans. Patt. Anal. Mach. Intell , volume=
[35]

IEEE Transactions on pattern analysis and machine intelligence , volume=

Stereo processing by semiglobal matching and mutual information , author=. IEEE Transactions on pattern analysis and machine intelligence , volume=. 2008 , publisher=

2008
[36]

BMVC 2015 , year=

MGM: A significantly more global matching for stereovision , author=. BMVC 2015 , year=

2015
[37]

IEEE Transactions on Geoscience and Remote Sensing , year=

Enhanced 3d urban scene reconstruction and point cloud densification using gaussian splatting and google earth imagery , author=. IEEE Transactions on Geoscience and Remote Sensing , year=
[38]

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing , year=

Sat-DN: Implicit Surface Reconstruction from Multi-View Satellite Images with Depth and Normal Supervision , author=. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing , year=
[39]

Remote Sensing , volume=

Sat-mesh: Learning neural implicit surfaces for multi-view satellite reconstruction , author=. Remote Sensing , volume=. 2023 , publisher=

2023
[40]

Applied Sciences , volume=

SatelliteRF: Accelerating 3D reconstruction in multi-view satellite images with efficient neural radiance fields , author=. Applied Sciences , volume=. 2024 , publisher=

2024
[43]

Proceedings of the Computer Vision and Pattern Recognition Conference , pages=

Satellite to groundscape-large-scale consistent ground view generation from satellite views , author=. Proceedings of the Computer Vision and Pattern Recognition Conference , pages=
[44]

IEEE: Piscataway, NJ, USA , year=

Data Fusion Contest 2019 (DFC2019); IEEE Dataport , author=. IEEE: Piscataway, NJ, USA , year=

2019
[45]

2019 IEEE Winter Conference on Applications of Computer Vision (WACV) , pages=

Semantic stereo for incidental satellite images , author=. 2019 IEEE Winter Conference on Applications of Computer Vision (WACV) , pages=. 2019 , organization=

2019
[46]

2016 IEEE Applied Imagery Pattern Recognition Workshop (AIPR) , pages=

A multiple view stereo benchmark for satellite imagery , author=. 2016 IEEE Applied Imagery Pattern Recognition Workshop (AIPR) , pages=. 2016 , organization=

2016
[47]

European Conference on Computer Vision , pages=

Citygaussian: Real-time high-quality large-scale scene rendering with gaussians , author=. European Conference on Computer Vision , pages=. 2024 , organization=

2024
[48]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

Vastgaussian: Vast 3d gaussians for large scene reconstruction , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
[50]

ISPRS Journal of Photogrammetry and Remote Sensing , volume=

ULSR-GS: Urban large-scale surface reconstruction Gaussian Splatting with multi-view geometric consistency , author=. ISPRS Journal of Photogrammetry and Remote Sensing , volume=. 2025 , publisher=

2025
[52]

arXiv preprint arXiv:2603.04770 , year=

DSA-SRGS: Super-Resolution Gaussian Splatting for Dynamic Sparse-View DSA Reconstruction , author=. arXiv preprint arXiv:2603.04770 , year=

arXiv
[54]

An Operational Pipeline for Generating Digital Surface Models from Multi-Stereo Satellite Images for Remote Sensing Applications , year=

Qin, Rongjun , booktitle=. An Operational Pipeline for Generating Digital Surface Models from Multi-Stereo Satellite Images for Remote Sensing Applications , year=
[55]

2021 , note =

Marí, Roger and de Franchis, Carlo and Meinhardt-Llopis, Enric and Anger, Jérémy and Facciolo, Gabriele , journal =. 2021 , note =

2021
[56]

Forty-first international conference on machine learning , year=

Scaling rectified flow transformers for high-resolution image synthesis , author=. Forty-first international conference on machine learning , year=
[57]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

High-resolution image synthesis with latent diffusion models , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
[58]

Proceedings of the IEEE/CVF international conference on computer vision , pages=

Scalable diffusion models with transformers , author=. Proceedings of the IEEE/CVF international conference on computer vision , pages=
[59]

Advances in neural information processing systems , volume=

Photorealistic text-to-image diffusion models with deep language understanding , author=. Advances in neural information processing systems , volume=
[61]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Magic3d: High-resolution text-to-3d content creation , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
[63]

Journal of Machine Learning Research , volume=

gsplat: An open-source library for Gaussian splatting , author=. Journal of Machine Learning Research , volume=
[64]

2025 , howpublished=

Black Forest Labs , title=. 2025 , howpublished=

2025
[65]

Proceedings of the seventh IEEE international conference on computer vision , volume=

Object recognition from local scale-invariant features , author=. Proceedings of the seventh IEEE international conference on computer vision , volume=. 1999 , organization=

1999
[66]

Communications of the ACM , volume=

Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography , author=. Communications of the ACM , volume=. 1981 , publisher=

1981
[67]

Ohio Supercomputer Center

Ohio Supercomputer Center. Ohio Supercomputer Center. 1987

1987
[68]

Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=

Vis2mesh: Efficient mesh reconstruction from unstructured point clouds of large scenes with learned virtual view visibility , author=. Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
[69]

The American Statistician , volume=

Thirteen ways to look at the correlation coefficient , author=. The American Statistician , volume=. 1988 , publisher=

1988
[70]

Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=

Sat2city: 3d city generation from a single satellite image with cascaded latent diffusion , author=. Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
[71]

Sat2RealCity: Geometry-Aware and Appearance-Controllable 3D Urban Generation from Satellite Imagery. arXiv preprint arXiv:2511.11470.
[72]

Sat2scene: 3d urban scene generation from satellite images with diffusion. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition.
[73]

Ming Qian, Zimin Xia, Changkun Liu, Shuailei Ma, Wen Wang, Zeran Ke, Bin Tan, Hang Zhang, and Gui-Song Xia. Sat3. 2026.
[74]

MagicCity: Geometry-Aware 3D City Generation from Satellite Imagery with Multi-View Consistency. Proceedings of the IEEE/CVF International Conference on Computer Vision.
[75]

Sat2vid: Street-view panoramic video synthesis from a single satellite image. Proceedings of the IEEE/CVF International Conference on Computer Vision.
[76]

HMSM-Net: Hierarchical multi-scale matching network for disparity estimation of high-resolution satellite stereo images. ISPRS Journal of Photogrammetry and Remote Sensing, 2022.
[77]

Improving disparity consistency with self-refined cost volumes for deep learning-based satellite stereo matching. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.
[78]

GU-GS: Gaussian Splatting-based Geometry Refinement and Uncertainty-Aware Learning Method for DSM Generation from Satellite Imagery. IEEE Transactions on Geoscience and Remote Sensing.
[79]

Flowr: Flowing from sparse to dense 3d reconstructions. Proceedings of the IEEE/CVF International Conference on Computer Vision.
[81]

Deceptive-nerf/3dgs: Diffusion-generated pseudo-observations for high-quality sparse-view reconstruction. European Conference on Computer Vision, 2024.
[82]

3dgs-enhancer: Enhancing unbounded 3d gaussian splatting with view-consistent 2d diffusion priors. Advances in Neural Information Processing Systems.
[83]

Difix3d+: Improving 3d reconstructions with single-step diffusion models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[85]

Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation. International Conference on Learning Representations (ICLR).
[86]

Shuang Song, Jiyong Kim, and Rongjun Qin. Photogrammetric Engineering & Remote Sensing.
[87]

Depth anything v2. Advances in Neural Information Processing Systems.
[88]

DINOv2: Learning Robust Visual Features without Supervision. arXiv:2304.07193.
[90]

Diffusionsat: A generative foundation model for satellite imagery. International Conference on Learning Representations.
[91]

HAD: Hallucination-Aware Diffusion Priors for 3D Reconstruction. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
