SatSplatDiff: Geometry-preserving generative refinement for high-fidelity satellite Gaussian Splatting
Pith reviewed 2026-06-30 01:04 UTC · model grok-4.3
The pith
SatSplatDiff uses shadow maps from Gaussian representations to guide generative refinement and reduce geometric degradation in satellite 3D reconstruction.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SatSplatDiff minimizes geometric degradation by computing shadow maps from the current Gaussian representation and using them to constrain generative refinement, after first establishing an accurate surface via photogrammetric DSM initialization, monocular depth supervision, and multi-scale geometric refinement on a 2DGS base.
What carries the argument
Shadow-guided generative refinement, in which geometrically calculated shadow maps steer the generative updates to preserve consistency with the underlying surface geometry.
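To make "geometrically calculated shadow maps" concrete, the sketch below casts shadows over a plain height grid (a DSM) by marching from each cell toward the sun. It is a simplified stand-in, not the paper's 2DGS-based shadow casting: the function name `shadow_map`, the grid convention (rows increase southward, columns eastward, azimuth clockwise from north), and the `cell_size` and `step` parameters are all assumptions made for illustration.

```python
import math


def shadow_map(dsm, sun_azimuth_deg, sun_elevation_deg, cell_size=1.0, step=0.5):
    """Return a boolean grid: True where a cell is shadowed by taller terrain.

    Hypothetical heightfield sketch. `dsm` is a list of rows of heights,
    in the same length unit as `cell_size`.
    """
    rows, cols = len(dsm), len(dsm[0])
    az = math.radians(sun_azimuth_deg)
    # Horizontal direction toward the sun, expressed in grid steps.
    d_col = math.sin(az)    # east component
    d_row = -math.cos(az)   # north component (row index grows southward)
    # Height gained by the sun ray per cell of horizontal travel.
    rise = math.tan(math.radians(sun_elevation_deg)) * cell_size

    shadow = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            t = step
            while True:
                ri = int(round(r + d_row * t))
                ci = int(round(c + d_col * t))
                if not (0 <= ri < rows and 0 <= ci < cols):
                    break  # ray left the grid without being blocked
                # Blocked if terrain along the ray is higher than the ray itself.
                if (ri, ci) != (r, c) and dsm[ri][ci] > dsm[r][c] + rise * t:
                    shadow[r][c] = True
                    break
                t += step
    return shadow
```

A low sun casts a longer shadow than a high sun over the same wall, which is the geometric cue a shadow constraint of this kind relies on.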
If this is right
- Reduces geometric MAE by up to 18 percent on the IARPA2016 and DFC2019 datasets.
- Improves visual fidelity measured by FID-CLIP by 28 to 45 percent over existing baselines.
- Supports up to 5 times resolution enhancement while keeping sensor-consistent appearance.
- Maintains seamless cross-tile consistency and scalability for large-area reconstruction.
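The geometric figures above are stated as relative changes in mean absolute error. As a hedged illustration of what such a number means, the sketch below computes MAE between a predicted and a reference height grid and the relative reduction against a baseline. The function names and the optional `valid_mask` are assumptions; the summary does not describe the paper's exact evaluation protocol (masking, alignment, or aggregation across tiles).

```python
def geometric_mae(pred_dsm, ref_dsm, valid_mask=None):
    """Mean absolute height error over valid pixels (hypothetical helper)."""
    total, count = 0.0, 0
    for r, (pred_row, ref_row) in enumerate(zip(pred_dsm, ref_dsm)):
        for c, (p, g) in enumerate(zip(pred_row, ref_row)):
            if valid_mask is not None and not valid_mask[r][c]:
                continue  # skip pixels without reference heights
            total += abs(p - g)
            count += 1
    if count == 0:
        raise ValueError("no valid pixels to compare")
    return total / count


def relative_reduction(baseline_error, method_error):
    """Fractional error reduction relative to a baseline (0.18 means 18 percent)."""
    return (baseline_error - method_error) / baseline_error
```

Under this definition, an 18 percent reduction corresponds to a method MAE of 0.82 times the baseline MAE.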
Where Pith is reading between the lines
- The shadow-constraint idea could be tested on other sparse-view reconstruction tasks that currently suffer from generative hallucinations.
- The demonstrated cross-tile consistency suggests the method may support city-scale or multi-temporal satellite mapping without additional alignment steps.
- If shadow maps prove robust across sensors, the approach might reduce the need for dense ground-truth geometry in future satellite pipelines.
Load-bearing premise
Shadow maps computed from the current Gaussian representation can reliably constrain the generative refinement process without requiring dataset-specific tuning or introducing new inconsistencies.
What would settle it
The claim would be undermined if the shadow-guided refinement step produced higher geometric MAE or visible surface inconsistencies than the non-generative baseline on the IARPA2016 or DFC2019 datasets.
Original abstract
Gaussian Splatting has been recently explored for satellite 3D reconstruction, demonstrating flexibility and efficiency in representing radiometrically diverse satellite scenes. However, the limited top viewpoint of satellite imagery results in insufficient supervision on building facades, leaving surface holes and degraded visual fidelity. Generative refinement, which leverages pretrained generative priors to iteratively refine and update the rendered images used as supervision targets, has recently been investigated to improve the visual fidelity of Gaussian-rendered images. However, since these models refine each view independently, the resulting images can generate hallucinations and break photo-consistency, leading to geometric degradation. To address these limitations, we propose SatSplatDiff, which aims to minimize geometric degradation prevalent in generative refinement. Building on photogrammetric DSM initialization and 2DGS-based shadow casting established in our prior work SatSplat, we first introduce monocular depth supervision and multi-scale geometric refinement to establish a geometrically accurate and well-regularized surface representation. We then apply shadow-guided generative refinement, where geometrically calculated shadow maps guide the Gaussians to maintain consistency with the underlying geometry, improving visual fidelity while reducing geometric degradation. Extensive evaluations on the IARPA2016 and DFC2019 datasets demonstrate state-of-the-art performance, reducing geometric MAE by up to 18% and improving visual fidelity (FID-CLIP) by 28-45% over existing baselines. Our method delivers up to 5x resolution enhancement with minimal hallucination and sensor-consistent appearance, demonstrating seamless cross-tile consistency and strong scalability for large-scale reconstruction. Source code is available at https://github.com/GDAOSU/SatSplatDiff
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces SatSplatDiff, which extends 2D Gaussian Splatting (initialized from photogrammetric DSMs) with monocular depth supervision, multi-scale geometric refinement, and shadow-guided generative refinement using pretrained diffusion models. The central claim is that computing shadow maps from the current 2DGS representation and using them to constrain the diffusion process yields state-of-the-art results on IARPA2016 and DFC2019, reducing geometric MAE by up to 18% and improving FID-CLIP by 28-45% while enabling up to 5x resolution enhancement with minimal hallucination and cross-tile consistency. Source code is released.
Significance. If the shadow-constrained refinement demonstrably avoids geometric degradation while leveraging generative priors, the approach would meaningfully advance scalable satellite 3D reconstruction for radiometrically diverse scenes with incomplete facade coverage. The explicit release of code and the grounding in prior SatSplat work are positive for reproducibility and incremental progress.
major comments (2)
- [Abstract / Methods (shadow-guided section)] Abstract (shadow-guided generative refinement paragraph) and Methods: The mechanism assumes shadow maps computed from the initial 2DGS (after monocular depth and multi-scale refinement) are sufficiently accurate to constrain diffusion without reinforcing facade holes or hallucinations. No ablation, uncertainty weighting, or iterative shadow-update schedule is described to break the potential circular dependency between incomplete geometry and the guiding shadows.
- [Experiments / Results] Evaluations (IARPA2016/DFC2019 results): The reported MAE reductions (up to 18%) and FID-CLIP gains (28-45%) are presented without error bars, per-tile variance, or statistical significance tests. This makes it difficult to determine whether the gains are robust or driven by particular tiles where the initial DSM already provides strong geometry.
minor comments (1)
- [Abstract] The abstract states 'sensor-consistent appearance' and 'seamless cross-tile consistency' but does not define the quantitative metric used to support these claims.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on SatSplatDiff. The comments highlight important considerations regarding the shadow-guided refinement pipeline and the presentation of experimental results. We address each major comment below and outline planned revisions to strengthen the manuscript.
Referee: [Abstract / Methods (shadow-guided section)] Abstract (shadow-guided generative refinement paragraph) and Methods: The mechanism assumes shadow maps computed from the initial 2DGS (after monocular depth and multi-scale refinement) are sufficiently accurate to constrain diffusion without reinforcing facade holes or hallucinations. No ablation, uncertainty weighting, or iterative shadow-update schedule is described to break the potential circular dependency between incomplete geometry and the guiding shadows.
Authors: We appreciate the referee's point on potential circular dependencies. As described in Sections 3.2 and 3.3, the pipeline first applies monocular depth supervision and multi-scale geometric refinement to the photogrammetric DSM initialization to produce a well-regularized 2DGS representation before computing shadow maps for the generative stage. This ordering is intended to ensure that the guiding shadows are derived from an already improved geometry, thereby reducing the risk of reinforcing facade holes during diffusion. While the current manuscript does not include dedicated ablations on uncertainty weighting or iterative shadow updates, the sequential design prioritizes geometric fidelity prior to generative refinement. In the revision we will expand the discussion of this design rationale in the Methods section and add a brief analysis of shadow map accuracy on sample tiles. revision: partial
Referee: [Experiments / Results] Evaluations (IARPA2016/DFC2019 results): The reported MAE reductions (up to 18%) and FID-CLIP gains (28-45%) are presented without error bars, per-tile variance, or statistical significance tests. This makes it difficult to determine whether the gains are robust or driven by particular tiles where the initial DSM already provides strong geometry.
Authors: We agree that additional statistical reporting would improve the clarity and robustness of the results. The evaluations on IARPA2016 and DFC2019 demonstrate consistent gains across multiple scenes, but the manuscript currently reports only aggregate metrics. In the revised version we will include error bars, per-tile variance, and basic statistical significance tests (e.g., paired t-tests) to better characterize the distribution of improvements and address concerns about tile-specific effects. revision: yes
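The paired testing promised in the second response could look like the sketch below. It computes the paired t statistic on per-tile error differences and, as a distribution-free cross-check, an exact two-sided sign-flip permutation p-value. The function names are hypothetical, and any per-tile values passed in would have to come from the actual evaluation; none are given in this summary.

```python
import math
import statistics
from itertools import product


def paired_t_statistic(baseline, method):
    """Return (t, degrees_of_freedom) for paired per-tile errors."""
    diffs = [b - m for b, m in zip(baseline, method)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_d / (sd_d / math.sqrt(n)), n - 1


def sign_flip_p_value(baseline, method):
    """Exact two-sided sign-flip permutation p-value.

    Enumerates all 2**n sign assignments, so it is only practical
    for a small number of tiles.
    """
    diffs = [b - m for b, m in zip(baseline, method)]
    observed = abs(sum(diffs))
    hits = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # tolerance for float round-off
            hits += 1
    return hits / total
```

With only a handful of tiles the permutation p-value cannot fall below 2 / 2**n, which is itself a useful reminder of how little a small tile count can establish. For the t-test p-value, `scipy.stats.ttest_rel` performs the same paired test.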
Circularity Check
Minor self-citation on SatSplat shadow casting; core claims rest on new supervision and external dataset evaluations
The abstract explicitly builds on 'photogrammetric DSM initialization and 2DGS-based shadow casting established in our prior work SatSplat' for the shadow-guided refinement step. This is a self-citation by overlapping authors, but it is not load-bearing: the paper adds independent components (monocular depth supervision, multi-scale geometric refinement) and reports quantitative gains (MAE reduction up to 18%, FID-CLIP 28-45%) from evaluations on IARPA2016 and DFC2019. No equation, fitted parameter, or derivation reduces the claimed outputs to the inputs by construction, and no uniqueness theorem or ansatz is smuggled via citation. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Monocular depth estimates from satellite imagery provide sufficiently accurate supervision for initializing and regularizing a 2DGS surface representation.
- Domain assumption: Shadow maps computed from the current geometry can be used to constrain generative image updates without introducing new geometric errors.
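The first assumption depends on how relative monocular depth is tied to metric heights. A common practice, assumed here and not confirmed by this summary as the paper's approach, is to fit a least-squares scale and shift between the monocular prediction and a reference depth before using it as supervision. The helper name `fit_scale_shift` is hypothetical.

```python
def fit_scale_shift(mono, ref):
    """Least-squares (scale, shift) such that scale * mono + shift ~= ref.

    `mono` and `ref` are equal-length sequences of depth samples taken
    at the same pixels. Closed-form simple linear regression.
    """
    n = len(mono)
    mean_m = sum(mono) / n
    mean_r = sum(ref) / n
    var_m = sum((m - mean_m) ** 2 for m in mono)
    if var_m == 0:
        raise ValueError("monocular depth is constant; scale is undefined")
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(mono, ref))
    scale = cov / var_m
    shift = mean_r - scale * mean_m
    return scale, shift
```

The size of the residual after alignment gives a rough check on whether the monocular estimate is accurate enough to act as supervision on a given tile.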