HarmoGS: Robust 3D Gaussian Splatting in the Wild via Conflict-Aware Gradient Harmonization

Jian-Fang Hu; Jianhuang Lai; Tianze Zhu; Wei-Shi Zheng; Yulei Kang

arxiv: 2605.13073 · v2 · pith:WZBTSO2Enew · submitted 2026-05-13 · 💻 cs.CV

Pith reviewed 2026-05-19 16:58 UTC · model grok-4.3

classification 💻 cs.CV

keywords 3D Gaussian Splatting, Novel View Synthesis, In-the-wild Reconstruction, Gradient Harmonization, Transient Distractors, Cross-view Consistency

The pith

Rotating view-specific gradients into orthogonal directions reduces conflicts and improves 3D Gaussian Splatting quality in scenes with transient objects and lighting changes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper targets the instability in 3D Gaussian Splatting when applied to real-world captures that contain moving distractors and appearance differences across views. Masking unreliable pixels at the image level leaves residual gradient conflicts that still disrupt the optimization of Gaussian primitives. The proposed approach first refines masks using learned pixel-wise consistency scores, then applies a dual-view harmonization step that rotates gradients from different views to lie at right angles to each other, and finally adjusts densification and pruning rules to favor stable primitives. Experiments on standard in-the-wild benchmarks show higher rendering fidelity than prior masking-only techniques.

Core claim

We propose a conflict-aware 3DGS framework that addresses this problem from both image-space supervision and gradient-level optimization. Semantic Consistency-Guided Masking learns pixel-wise consistency scores to adaptively refine prior masks and suppress unreliable supervision before gradient formation. A dual-view Conflict-Aware Gradient Harmonization strategy further reconciles view-specific gradients by mutually rotating them into an orthogonal configuration, reducing negative directional interference across views. We also introduce conflict-aware densification and pruning to stabilize Gaussian growth and remove persistently conflicting primitives.
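The masking step in the core claim can be pictured with a small sketch. It assumes the prior mask is binary with 1 marking a trusted pixel, that the consistency scores lie in [0, 1] and are already available (the paper learns them; here they are plain inputs), and that the two are combined by multiplication. The names `refine_mask` and `masked_l1` are hypothetical, and plain Python lists stand in for image tensors. None of this is taken from the paper's implementation.

```python
def refine_mask(prior_keep, consistency):
    """Per-pixel supervision weight: binary prior (1 = trusted) times a
    consistency score in [0, 1]. The product rule is an assumption."""
    return [m * s for m, s in zip(prior_keep, consistency)]


def masked_l1(pred, target, weights, eps=1e-8):
    """Weighted L1 photometric loss, so down-weighted pixels contribute
    proportionally less to the gradient."""
    num = sum(w * abs(p - t) for p, t, w in zip(pred, target, weights))
    return num / (sum(weights) + eps)


# Toy 4-pixel example with made-up values.
pred = [0.2, 0.9, 0.5, 0.5]
target = [0.2, 0.1, 0.5, 0.0]
prior_keep = [1, 0, 1, 1]             # pixel 1 already masked by the prior
consistency = [1.0, 1.0, 0.8, 0.1]    # pixel 3 slipped past the prior mask
weights = refine_mask(prior_keep, consistency)
loss = masked_l1(pred, target, weights)
print(weights, loss)
```

In this toy case the last pixel, which the prior mask trusted, is almost silenced by its low consistency score before any gradient is formed.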

What carries the argument

Dual-view Conflict-Aware Gradient Harmonization, which rotates gradients from different input views into an orthogonal configuration to reduce negative directional interference during optimization.
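One way to read "mutually rotating into an orthogonal configuration" is a symmetric, norm-preserving rotation inside the plane spanned by the two view gradients. The sketch below is an editorial illustration under that reading, not the paper's operator. It assumes that only pairs with negative cosine similarity are rotated, that both gradients absorb an equal share of the excess angle, and that gradients are flat lists of floats. The function name `harmonize` is hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def harmonize(g1, g2, tol=1e-12):
    """Rotate two conflicting gradients toward each other, inside the plane
    they span, until they are orthogonal. Each keeps its original norm.

    Assumption of this sketch: only pairs with negative cosine similarity
    are touched; all other pairs are returned unchanged.
    """
    n1, n2 = norm(g1), norm(g2)
    if n1 == 0.0 or n2 == 0.0:
        return list(g1), list(g2)
    cos_t = max(-1.0, min(1.0, dot(g1, g2) / (n1 * n2)))
    if cos_t >= 0.0:
        return list(g1), list(g2)

    # Orthonormal basis of the plane: e1 along g1, e2 via Gram-Schmidt on g2.
    e1 = [x / n1 for x in g1]
    along = dot(g2, e1)
    resid = [y - along * x for x, y in zip(e1, g2)]
    rn = norm(resid)
    if rn <= tol * n2:
        # Exactly opposed gradients do not define a unique plane; leave them.
        return list(g1), list(g2)
    e2 = [x / rn for x in resid]

    # In this basis g1 sits at angle 0 and g2 at angle theta (> 90 degrees).
    theta = math.acos(cos_t)
    delta = 0.5 * (theta - math.pi / 2)  # equal share of the excess angle
    a1, a2 = delta, theta - delta        # a2 - a1 equals pi / 2
    h1 = [n1 * (math.cos(a1) * u + math.sin(a1) * v) for u, v in zip(e1, e2)]
    h2 = [n2 * (math.cos(a2) * u + math.sin(a2) * v) for u, v in zip(e1, e2)]
    return h1, h2


g_view_a = [1.0, 0.0, 0.0]
g_view_b = [-1.0, 1.0, 0.0]   # 135 degrees away from g_view_a
h_a, h_b = harmonize(g_view_a, g_view_b)
print(dot(h_a, h_b), norm(h_a), norm(h_b))
```

The rotation changes direction only, so update magnitudes are untouched; whether the paper's operator shares that property is not stated in the material reviewed here.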

If this is right

Residual occlusions and illumination inconsistencies are suppressed before they form conflicting gradients.
Gaussian primitives grow and are pruned according to conflict levels rather than raw density or opacity alone.
Rendering quality improves on standard benchmarks containing transient distractors and cross-view appearance changes.
Optimization remains stable even when prior masks leave some unreliable pixels.
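The second consequence above, growth and pruning driven by conflict levels, might look roughly like the following. This is a guess at the shape of such a rule, not the paper's procedure: the exponential moving average, the class `GaussianStats`, the function names, and every threshold value are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class GaussianStats:
    grad_norm: float           # accumulated positional gradient magnitude
    opacity: float
    conflict_ema: float = 0.0  # running fraction of iterations with conflict


def update_conflict(stats, cos_sim, momentum=0.9):
    """Track how often the two view gradients of this primitive disagree."""
    conflicted = 1.0 if cos_sim < 0.0 else 0.0
    stats.conflict_ema = momentum * stats.conflict_ema + (1.0 - momentum) * conflicted


def decide(stats, grad_thresh=2e-4, min_opacity=5e-3,
           densify_max_conflict=0.3, prune_conflict=0.8):
    """Return 'prune', 'densify', or 'keep' for one primitive.
    All thresholds are illustrative placeholders."""
    if stats.opacity < min_opacity or stats.conflict_ema > prune_conflict:
        return "prune"
    if stats.grad_norm > grad_thresh and stats.conflict_ema < densify_max_conflict:
        return "densify"
    return "keep"


stable = GaussianStats(grad_norm=1e-3, opacity=0.5)
for _ in range(5):
    update_conflict(stable, cos_sim=0.7)

noisy = GaussianStats(grad_norm=1e-3, opacity=0.5)
for _ in range(30):
    update_conflict(noisy, cos_sim=-0.5)

print(decide(stable), decide(noisy))
```

The design point is that a large gradient alone no longer triggers a split: a primitive must also have a low running conflict level, and one that conflicts persistently is removed even if its opacity is healthy.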

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same orthogonal-gradient idea could be tested on other radiance-field representations that also optimize per-view contributions.
If the rotation operation preserves gradient magnitude, it may extend naturally to multi-view consistency losses in dynamic scene capture.
Scenes with extreme view-dependent effects might still require additional appearance modeling beyond gradient alignment.

Load-bearing premise

That rotating view-specific gradients into an orthogonal configuration reliably reduces negative directional interference without introducing new optimization instabilities or artifacts in the Gaussian primitives.

What would settle it

A controlled comparison on a scene with known illumination variation where the orthogonal rotation step is disabled and the resulting increase in visible artifacts or drop in PSNR is measured against the full method.
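The measurement side of such a comparison is simple to set up. The sketch below computes PSNR from its standard definition, 10·log10(MAX² / MSE), and reports the gap between a full run and a run with the rotation step disabled. The helper `ablation_gap` is a hypothetical name, images are flattened to lists of values in [0, 1], and the pixel values are synthetic placeholders rather than results from the paper.

```python
import math


def psnr(rendered, reference, max_val=1.0):
    """Peak signal-to-noise ratio in dB for flat sequences of pixel values."""
    if len(rendered) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - g) ** 2 for r, g in zip(rendered, reference)) / len(reference)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / mse)


def ablation_gap(full_render, ablated_render, reference):
    """PSNR of the full method minus PSNR with the rotation step disabled."""
    return psnr(full_render, reference) - psnr(ablated_render, reference)


# Synthetic placeholder pixels, not results from the paper.
reference = [0.5, 0.5, 0.5, 0.5]
full_render = [0.5, 0.52, 0.5, 0.48]
ablated_render = [0.6, 0.6, 0.6, 0.6]
print(ablation_gap(full_render, ablated_render, reference))
```

A positive gap on held-out views of the controlled scene would support the harmonization step; a gap near zero would suggest the gains come from the masking or the densification rules instead.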

Figures

Figures reproduced from arXiv: 2605.13073 by Jian-Fang Hu, Jianhuang Lai, Tianze Zhu, Wei-Shi Zheng, Yulei Kang.

**Figure 2.** Overview of the proposed framework. At each training iteration, two views are sampled ...

**Figure 3.** Illustration of the Cross-View Conflict-Aware Gradient Harmonization process. (a) A ...

**Figure 4.** Visualization of Semantic Consistency-Guided Masking. The SAM-based binary mask ...

**Figure 5.** Qualitative comparisons on challenging in-the-wild scenes. Our method suppresses ...

**Figure 6.** Additional qualitative comparisons with DroneSplat [28].

**Figure 7.** Additional qualitative comparisons with AsymGS [13].

Original abstract

In-the-wild 3D Gaussian Splatting remains challenging due to transient distractors and illumination-induced cross-view appearance inconsistencies. Existing methods mainly rely on image-level masking to suppress unreliable supervision, but masking alone cannot fully eliminate residual occlusions or resolve illumination-induced inconsistencies, both of which can introduce conflicting cross-view gradients. These unresolved conflicts may destabilize Gaussian optimization and lead to visible reconstruction artifacts. We propose a conflict-aware 3DGS framework that addresses this problem from both image-space supervision and gradient-level optimization. Semantic Consistency-Guided Masking learns pixel-wise consistency scores to adaptively refine prior masks and suppress unreliable supervision before gradient formation. A dual-view Conflict-Aware Gradient Harmonization strategy further reconciles view-specific gradients by mutually rotating them into an orthogonal configuration, reducing negative directional interference across views. We also introduce conflict-aware densification and pruning to stabilize Gaussian growth and remove persistently conflicting primitives. Extensive experiments on standard in-the-wild benchmarks demonstrate that our method achieves state-of-the-art rendering quality under complex transient distractors and cross-view inconsistencies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper adds semantic masking plus orthogonal gradient rotation to handle conflicts in wild 3DGS, but the rotation step lacks derivation or stability analysis.

The main thing to know is that this work tries to improve 3D Gaussian Splatting for scenes with moving objects and lighting shifts by refining masks with semantic consistency scores and then rotating view-specific gradients into an orthogonal setup to cut interference, along with conflict-aware densification and pruning. The orthogonal rotation and the dual-view harmonization strategy appear to be the concrete new pieces beyond standard masking approaches. The paper does a reasonable job spelling out why residual cross-view inconsistencies survive simple masks and why operating at the gradient level could help stabilize the optimization. It also keeps the focus on practical reconstruction quality under real conditions.

The soft spot is exactly the one flagged in the stress test. The description gives no derivation showing that mutual orthogonal rotation preserves the informative parts of the original gradients for updating Gaussian means, covariances, and opacities, nor does it examine how the rotated gradients interact with the photometric loss or the adaptive densification and pruning rules. Without that, it is difficult to rule out new instabilities or artifacts. The SOTA claims rest on experiments that are referenced but not detailed here, so the contribution of the harmonization component versus the masking would need close checking in the full results and ablations.

This is for computer vision groups working on robust neural rendering from casual captures. A reader already using 3DGS and dealing with transient distractors would find the pipeline ideas worth testing. It deserves a serious referee because the underlying problem is common and the proposed steps are specific enough to evaluate, even if the gradient rotation will require more justification.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes HarmoGS, a conflict-aware 3D Gaussian Splatting framework for in-the-wild scenes. It introduces Semantic Consistency-Guided Masking to adaptively refine prior masks using pixel-wise consistency scores, a dual-view Conflict-Aware Gradient Harmonization strategy that mutually rotates view-specific gradients into an orthogonal configuration to reduce negative directional interference, and conflict-aware densification/pruning rules to stabilize Gaussian optimization. The authors claim this yields state-of-the-art rendering quality on standard in-the-wild benchmarks under transient distractors and cross-view inconsistencies.

Significance. If the empirical results and the gradient harmonization step hold up under scrutiny, the work offers a practical advance for 3DGS in real-world captures by moving beyond pure image-space masking to address gradient conflicts directly. The orthogonal rotation idea is a distinctive contribution that could generalize to other multi-view optimization settings, provided it is accompanied by reproducible code and clear ablation evidence.

major comments (2)

[Method (Conflict-Aware Gradient Harmonization)] The central mechanism in Conflict-Aware Gradient Harmonization (described in the method section) rotates view-specific gradients to an orthogonal configuration without a derivation or analysis showing that the rotated vectors preserve the components necessary for stable updates to Gaussian means, covariances, and opacities. This assumption is load-bearing for the claim of reduced interference without new instabilities, yet no interaction with the photometric loss or the adaptive densification/pruning rules is examined.
[Experiments] The abstract asserts SOTA rendering quality, but the provided description supplies no quantitative metrics, ablation tables, or error analysis comparing against baselines under controlled transient and illumination conditions. Without these, it is impossible to verify whether the gradient harmonization step delivers the claimed gains or merely correlates with other design choices.

minor comments (2)

Notation for the consistency scores and the rotation operator should be defined explicitly with symbols and dimensions to avoid ambiguity when readers implement the dual-view harmonization step.
Figure captions for qualitative results should include the specific benchmark scenes and the exact baselines shown for direct visual comparison.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, providing clarifications and indicating where revisions will be made to strengthen the manuscript.

Referee: [Method (Conflict-Aware Gradient Harmonization)] The central mechanism in Conflict-Aware Gradient Harmonization (described in the method section) rotates view-specific gradients to an orthogonal configuration without a derivation or analysis showing that the rotated vectors preserve the components necessary for stable updates to Gaussian means, covariances, and opacities. This assumption is load-bearing for the claim of reduced interference without new instabilities, yet no interaction with the photometric loss or the adaptive densification/pruning rules is examined.

Authors: We thank the referee for this observation. The manuscript motivates the orthogonal rotation as a means to decorrelate conflicting directional components across views while preserving update magnitudes, but we acknowledge that an explicit derivation and interaction analysis were not provided. In the revised manuscript we will add a formal derivation in Section 3.2: the dual-view rotation is performed via a mutual orthogonalization operator that projects each gradient onto the orthogonal complement of the other, which mathematically preserves the component aligned with the photometric loss gradient for each view (i.e., the inner product with the original loss direction remains unchanged). We will also include a short analysis of the interaction with the photometric loss and the conflict-aware densification/pruning rules, showing that the reduced directional variance lowers the incidence of unstable densification events without altering the expected update scale for means, covariances, and opacities. revision: yes
Referee: [Experiments] The abstract asserts SOTA rendering quality, but the provided description supplies no quantitative metrics, ablation tables, or error analysis comparing against baselines under controlled transient and illumination conditions. Without these, it is impossible to verify whether the gradient harmonization step delivers the claimed gains or merely correlates with other design choices.

Authors: The full manuscript already contains quantitative results, ablation tables, and comparisons against baselines (3DGS, WildGaussians, etc.) on standard in-the-wild benchmarks in Section 4, reporting PSNR/SSIM/LPIPS improvements and isolating the contribution of gradient harmonization. However, to make these findings more immediately verifiable, we will revise the abstract to highlight key numerical gains and expand the experimental section with additional controlled ablation studies that separately vary transient distractors and cross-view illumination while measuring the incremental benefit of the harmonization module. revision: partial
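The first response describes the operator as a projection of each gradient onto the orthogonal complement of the other. A reader can probe that reading numerically before the promised derivation appears. The sketch below is an editorial check, not the authors' code: `project_out` is a hypothetical helper, the two toy gradients are made up, and the paper's actual operator may differ from plain projection.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_out(g, other):
    """Component of g orthogonal to `other`, i.e. the projection of g onto
    the orthogonal complement of `other`."""
    denom = dot(other, other)
    if denom == 0.0:
        return list(g)
    k = dot(g, other) / denom
    return [x - k * y for x, y in zip(g, other)]


g1 = [1.0, 0.0]
g2 = [-1.0, 1.0]            # conflicts with g1: negative inner product
p1 = project_out(g1, g2)    # g1 with its g2-component removed
p2 = project_out(g2, g1)    # g2 with its g1-component removed

print("p1 . g2 =", dot(p1, g2))
print("|g1|^2 =", dot(g1, g1), " |p1|^2 =", dot(p1, p1))
print("p1 . g1 =", dot(p1, g1))
print("p1 . p2 =", dot(p1, p2))
```

In this toy case the projected gradient is orthogonal to the other view's original gradient, but it is shorter than the original, its inner product with its own original direction drops from 1 to 0.5, and the two projected gradients end up positively aligned rather than at right angles. Whether the paper's operator behaves differently cannot be settled from the text available here.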

Circularity Check

0 steps flagged

No significant circularity; claims rest on external data and standard 3DGS primitives

The paper introduces Semantic Consistency-Guided Masking, dual-view Conflict-Aware Gradient Harmonization via orthogonal rotation of view-specific gradients, and conflict-aware densification/pruning. No equations, derivations, or fitted-parameter predictions appear in the provided sections that reduce any claimed output to the method's own inputs by construction. The central claims operate on external in-the-wild image data and benchmarks, with no load-bearing self-citations or uniqueness theorems imported from prior author work that would collapse the argument. The derivation chain remains self-contained against standard 3D Gaussian Splatting optimization and photometric losses.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no explicit equations or implementation details, so no free parameters, axioms, or invented entities can be identified with certainty. The approach appears to rest on the standard 3D Gaussian Splatting optimization pipeline plus two added heuristic modules.

pith-pipeline@v0.9.0 · 5732 in / 1132 out tokens · 47625 ms · 2026-05-19T16:58:00.247732+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear


mutually rotating them into an orthogonal configuration, reducing negative directional interference across views

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages · 1 internal anchor

[1] Anpei Chen, Zexiang Xu, Andreas Geiger, Jingyi Yu, and Hao Su. Tensorf: Tensorial radiance fields. In European conference on computer vision, pages 333–350. Springer, 2022.

[2] Jiahao Chen, Yipeng Qin, Lingjie Liu, Jiangbo Lu, and Guanbin Li. Nerf-hugs: Improved neural radiance fields in non-static scenes using heuristics-guided segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 19436–19446, 2024.

[3] Xingyu Chen, Qi Zhang, Xiaoyu Li, Yue Chen, Ying Feng, Xuan Wang, and Jue Wang. Hallucinated neural radiance fields in the wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12943–12952, 2022.

[4] Sara Fridovich-Keil, Giacomo Meanti, Frederik Rahbæk Warburg, Benjamin Recht, and Angjoo Kanazawa. K-planes: Explicit radiance fields in space, time, and appearance. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 12479–12488, 2023.

[5] Sara Fridovich-Keil, Alex Yu, Matthew Tancik, Qinhong Chen, Benjamin Recht, and Angjoo Kanazawa. Plenoxels: Radiance fields without neural networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5501–5510, 2022.

[6] Cameron Gordon, Shin-Fang Chng, Lachlan MacDonald, and Simon Lucey. On quantizing implicit neural representations. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 341–350, 2023.

[7] Antoine Guédon and Vincent Lepetit. Sugar: Surface-aligned gaussian splatting for efficient 3d mesh reconstruction and high-quality mesh rendering. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5354–5363, 2024.

[8] Tao Hu, Shu Liu, Yilun Chen, Tiancheng Shen, and Jiaya Jia. Efficientnerf efficient neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12902–12911, 2022.

[9] Karim Kassab, Antoine Schnepf, Jean-Yves Franceschi, Laurent Caraffa, Jeremie Mary, and Valérie Gouet-Brunet. Refinedfields: Radiance fields refinement for unconstrained scenes. arXiv preprint arXiv:2312.00639, 3, 2023.

[10] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering. ACM Trans. Graph., 42(4):139–1, 2023.

[11] Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C Berg, Wan-Yen Lo, et al. Segment anything. In Proceedings of the IEEE/CVF international conference on computer vision, pages 4015–4026, 2023.

[12] Jonas Kulhanek, Songyou Peng, Zuzana Kukelova, Marc Pollefeys, and Torsten Sattler. Wildgaussians: 3d gaussian splatting in the wild. arXiv preprint arXiv:2407.08447, 2024.

[13] Chengqi Li, Zhihao Shi, Yangdi Lu, Wenbo He, and Xiangyu Xu. Robust neural rendering in the wild with asymmetric dual 3d gaussian splatting. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.

[14] Feng Li, Hao Zhang, Peize Sun, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianwei Yang, Lei Zhang, and Jianfeng Gao. Segment and recognize anything at any granularity. In European Conference on Computer Vision, pages 467–484. Springer, 2024.

[15] Jiaqi Lin, Zhihao Li, Xiao Tang, Jianzhuang Liu, Shiyong Liu, Jiayue Liu, Yangdi Lu, Xiaofei Wu, Songcen Xu, Youliang Yan, et al. Vastgaussian: Vast 3d gaussians for large scene reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5166–5175, 2024.

[16] Jingyu Lin, Jiaqi Gu, Lubin Fan, Bojian Wu, Yujing Lou, Renjie Chen, Ligang Liu, and Jieping Ye. Hybridgs: Decoupling transients and statics with 2d and 3d gaussian splatting. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 788–797, 2025.

[17] Yifei Liu, Zhihang Zhong, Yifan Zhan, Sheng Xu, and Xiao Sun. Maskgaussian: Adaptive 3d gaussian representation from probabilistic masks. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 681–690, 2025.

[18] Ricardo Martin-Brualla, Noha Radwan, Mehdi SM Sajjadi, Jonathan T Barron, Alexey Dosovitskiy, and Daniel Duckworth. Nerf in the wild: Neural radiance fields for unconstrained photo collections. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 7210–7219, 2021.

[19] Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1):99–106, 2021.

[20] Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. Instant neural graphics primitives with a multiresolution hash encoding. ACM transactions on graphics (TOG), 41(4):1–15, 2022.

[21] Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, et al. Dinov2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193, 2023.

[22] Christian Reiser, Songyou Peng, Yiyi Liao, and Andreas Geiger. Kilonerf: Speeding up neural radiance fields with thousands of tiny mlps. In Proceedings of the IEEE/CVF international conference on computer vision, pages 14335–14345, 2021.

[23] Weining Ren, Zihan Zhu, Boyang Sun, Jiaqi Chen, Marc Pollefeys, and Songyou Peng. Nerf on-the-go: Exploiting uncertainty for distractor-free nerfs in the wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8931–8940, 2024.

[24] Sara Sabour, Lily Goli, George Kopanas, Mark Matthews, Dmitry Lagun, Leonidas Guibas, Alec Jacobson, David Fleet, and Andrea Tagliasacchi. Spotlesssplats: Ignoring distractors in 3d gaussian splatting. ACM Transactions on Graphics, 44(2):1–11, 2025.

[25] Sara Sabour, Suhani Vora, Daniel Duckworth, Ivan Krasin, David J Fleet, and Andrea Tagliasacchi. Robustnerf: Ignoring distractors with robust losses. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 20626–20636, 2023.

[26] Cheng Sun, Min Sun, and Hwann-Tzong Chen. Direct voxel grid optimization: Super-fast convergence for radiance fields reconstruction. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5459–5469, 2022.

[27] Towaki Takikawa, Joey Litalien, Kangxue Yin, Karsten Kreis, Charles Loop, Derek Nowrouzezahrai, Alec Jacobson, Morgan McGuire, and Sanja Fidler. Neural geometric level of detail: Real-time rendering with implicit 3d shapes. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 11358–11367, 2021.

[28] Jiadong Tang, Yu Gao, Dianyi Yang, Liqi Yan, Yufeng Yue, and Yi Yang. Dronesplat: 3d gaussian splatting for robust 3d reconstruction from in-the-wild drone imagery. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 833–843, 2025.

[29] Paul Ungermann, Armin Ettenhofer, Matthias Nießner, and Barbara Roessle. Robust 3d gaussian splatting for novel view synthesis in presence of distractors. In DAGM German Conference on Pattern Recognition, pages 153–167. Springer, 2024.

[30] Zhou Wang, Alan C Bovik, Hamid R Sheikh, and Eero P Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing, 13(4):600–612, 2004.

[31] Jiacong Xu, Yiqun Mei, and Vishal M Patel. Wild-gs: Real-time novel view synthesis from unconstrained photo collections. Advances in Neural Information Processing Systems, 37:103334–103355, 2024.

[32] Qiangeng Xu, Zexiang Xu, Julien Philip, Sai Bi, Zhixin Shu, Kalyan Sunkavalli, and Ulrich Neumann. Point-nerf: Point-based neural radiance fields. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5438–5448, 2022.

[33] Yifan Yang, Shuhai Zhang, Zixiong Huang, Yubing Zhang, and Mingkui Tan. Cross-ray neural radiance fields for novel-view synthesis from unconstrained image collections. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 15901–15911, 2023.

[34] Zehao Yu, Anpei Chen, Binbin Huang, Torsten Sattler, and Andreas Geiger. Mip-splatting: Alias-free 3d gaussian splatting. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 19447–19456, 2024.

[35] Dongbin Zhang, Chuming Wang, Weitao Wang, Peihao Li, Minghan Qin, and Haoqian Wang. Gaussian in the wild: 3d gaussian splatting for unconstrained image collections. In European Conference on Computer Vision, pages 341–359. Springer, 2024.

[36] Richard Zhang, Phillip Isola, Alexei A Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 586–595, 2018.

[1] [1]

Tensorf: Tensorial radiance fields

Anpei Chen, Zexiang Xu, Andreas Geiger, Jingyi Yu, and Hao Su. Tensorf: Tensorial radiance fields. InEuropean conference on computer vision, pages 333–350. Springer, 2022

work page 2022

[2] [2]

Nerf-hugs: Improved neural radiance fields in non-static scenes using heuristics-guided segmentation

Jiahao Chen, Yipeng Qin, Lingjie Liu, Jiangbo Lu, and Guanbin Li. Nerf-hugs: Improved neural radiance fields in non-static scenes using heuristics-guided segmentation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 19436–19446, 2024

work page 2024

[3] [3]

Hallucinated neural radiance fields in the wild

Xingyu Chen, Qi Zhang, Xiaoyu Li, Yue Chen, Ying Feng, Xuan Wang, and Jue Wang. Hallucinated neural radiance fields in the wild. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12943–12952, 2022

work page 2022

[4] [4]

K-planes: Explicit radiance fields in space, time, and appearance

Sara Fridovich-Keil, Giacomo Meanti, Frederik Rahbæk Warburg, Benjamin Recht, and Angjoo Kanazawa. K-planes: Explicit radiance fields in space, time, and appearance. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 12479–12488, 2023

work page 2023

[5] [5]

Plenoxels: Radiance fields without neural networks

Sara Fridovich-Keil, Alex Yu, Matthew Tancik, Qinhong Chen, Benjamin Recht, and Angjoo Kanazawa. Plenoxels: Radiance fields without neural networks. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5501–5510, 2022

work page 2022

[6] [6]

On quantiz- ing implicit neural representations

Cameron Gordon, Shin-Fang Chng, Lachlan MacDonald, and Simon Lucey. On quantiz- ing implicit neural representations. InProceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 341–350, 2023

work page 2023

[7] [7]

Sugar: Surface-aligned gaussian splatting for efficient 3d mesh reconstruction and high-quality mesh rendering

Antoine Guédon and Vincent Lepetit. Sugar: Surface-aligned gaussian splatting for efficient 3d mesh reconstruction and high-quality mesh rendering. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5354–5363, 2024

work page 2024

[8] [8]

Efficientnerf efficient neural radiance fields

Tao Hu, Shu Liu, Yilun Chen, Tiancheng Shen, and Jiaya Jia. Efficientnerf efficient neural radiance fields. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12902–12911, 2022

work page 2022

[9] [9]

Refinedfields: Radiance fields refinement for unconstrained scenes.arXiv preprint arXiv:2312.00639, 3, 2023

Karim Kassab, Antoine Schnepf, Jean-Yves Franceschi, Laurent Caraffa, Jeremie Mary, and Valérie Gouet-Brunet. Refinedfields: Radiance fields refinement for unconstrained scenes.arXiv preprint arXiv:2312.00639, 3, 2023

[10]

3d gaussian splatting for real-time radiance field rendering

Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering. ACM Trans. Graph., 42(4):139–1, 2023

[11]

Segment anything

Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C Berg, Wan-Yen Lo, et al. Segment anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4015–4026, 2023

[12]

Wildgaussians: 3d gaussian splatting in the wild

Jonas Kulhanek, Songyou Peng, Zuzana Kukelova, Marc Pollefeys, and Torsten Sattler. Wildgaussians: 3d gaussian splatting in the wild. arXiv preprint arXiv:2407.08447, 2024

[13]

Robust neural rendering in the wild with asymmetric dual 3d gaussian splatting

Chengqi Li, Zhihao Shi, Yangdi Lu, Wenbo He, and Xiangyu Xu. Robust neural rendering in the wild with asymmetric dual 3d gaussian splatting. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025

[14]

Segment and recognize anything at any granularity

Feng Li, Hao Zhang, Peize Sun, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianwei Yang, Lei Zhang, and Jianfeng Gao. Segment and recognize anything at any granularity. In European Conference on Computer Vision, pages 467–484. Springer, 2024

[15]

Vastgaussian: Vast 3d gaussians for large scene reconstruction

Jiaqi Lin, Zhihao Li, Xiao Tang, Jianzhuang Liu, Shiyong Liu, Jiayue Liu, Yangdi Lu, Xiaofei Wu, Songcen Xu, Youliang Yan, et al. Vastgaussian: Vast 3d gaussians for large scene reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5166–5175, 2024

[16]

Hybridgs: Decoupling transients and statics with 2d and 3d gaussian splatting

Jingyu Lin, Jiaqi Gu, Lubin Fan, Bojian Wu, Yujing Lou, Renjie Chen, Ligang Liu, and Jieping Ye. Hybridgs: Decoupling transients and statics with 2d and 3d gaussian splatting. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 788–797, 2025

[17]

Maskgaussian: Adaptive 3d gaussian representation from probabilistic masks

Yifei Liu, Zhihang Zhong, Yifan Zhan, Sheng Xu, and Xiao Sun. Maskgaussian: Adaptive 3d gaussian representation from probabilistic masks. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 681–690, 2025

[18]

Nerf in the wild: Neural radiance fields for unconstrained photo collections

Ricardo Martin-Brualla, Noha Radwan, Mehdi SM Sajjadi, Jonathan T Barron, Alexey Dosovitskiy, and Daniel Duckworth. Nerf in the wild: Neural radiance fields for unconstrained photo collections. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7210–7219, 2021

[19]

Nerf: Representing scenes as neural radiance fields for view synthesis

Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1):99–106, 2021

[20]

Instant neural graphics primitives with a multiresolution hash encoding

Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. Instant neural graphics primitives with a multiresolution hash encoding. ACM Transactions on Graphics (TOG), 41(4):1–15, 2022

[21]

DINOv2: Learning Robust Visual Features without Supervision

Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, et al. Dinov2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193, 2023

[22]

Kilonerf: Speeding up neural radiance fields with thousands of tiny mlps

Christian Reiser, Songyou Peng, Yiyi Liao, and Andreas Geiger. Kilonerf: Speeding up neural radiance fields with thousands of tiny mlps. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 14335–14345, 2021

[23]

Nerf on-the-go: Exploiting uncertainty for distractor-free nerfs in the wild

Weining Ren, Zihan Zhu, Boyang Sun, Jiaqi Chen, Marc Pollefeys, and Songyou Peng. Nerf on-the-go: Exploiting uncertainty for distractor-free nerfs in the wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8931–8940, 2024

[24]

Spotlesssplats: Ignoring distractors in 3d gaussian splatting

Sara Sabour, Lily Goli, George Kopanas, Mark Matthews, Dmitry Lagun, Leonidas Guibas, Alec Jacobson, David Fleet, and Andrea Tagliasacchi. Spotlesssplats: Ignoring distractors in 3d gaussian splatting. ACM Transactions on Graphics, 44(2):1–11, 2025

[25]

Robustnerf: Ignoring distractors with robust losses

Sara Sabour, Suhani Vora, Daniel Duckworth, Ivan Krasin, David J Fleet, and Andrea Tagliasacchi. Robustnerf: Ignoring distractors with robust losses. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20626–20636, 2023

[26]

Direct voxel grid optimization: Super-fast convergence for radiance fields reconstruction

Cheng Sun, Min Sun, and Hwann-Tzong Chen. Direct voxel grid optimization: Super-fast convergence for radiance fields reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5459–5469, 2022

[27]

Neural geometric level of detail: Real-time rendering with implicit 3d shapes

Towaki Takikawa, Joey Litalien, Kangxue Yin, Karsten Kreis, Charles Loop, Derek Nowrouzezahrai, Alec Jacobson, Morgan McGuire, and Sanja Fidler. Neural geometric level of detail: Real-time rendering with implicit 3d shapes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11358–11367, 2021

[28]

Dronesplat: 3d gaussian splatting for robust 3d reconstruction from in-the-wild drone imagery

Jiadong Tang, Yu Gao, Dianyi Yang, Liqi Yan, Yufeng Yue, and Yi Yang. Dronesplat: 3d gaussian splatting for robust 3d reconstruction from in-the-wild drone imagery. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 833–843, 2025

[29]

Robust 3d gaussian splatting for novel view synthesis in presence of distractors

Paul Ungermann, Armin Ettenhofer, Matthias Nießner, and Barbara Roessle. Robust 3d gaussian splatting for novel view synthesis in presence of distractors. In DAGM German Conference on Pattern Recognition, pages 153–167. Springer, 2024

[30]

Image quality assessment: from error visibility to structural similarity

Zhou Wang, Alan C Bovik, Hamid R Sheikh, and Eero P Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004

[31]

Wild-gs: Real-time novel view synthesis from unconstrained photo collections

Jiacong Xu, Yiqun Mei, and Vishal M Patel. Wild-gs: Real-time novel view synthesis from unconstrained photo collections. Advances in Neural Information Processing Systems, 37:103334–103355, 2024

[32]

Point-nerf: Point-based neural radiance fields

Qiangeng Xu, Zexiang Xu, Julien Philip, Sai Bi, Zhixin Shu, Kalyan Sunkavalli, and Ulrich Neumann. Point-nerf: Point-based neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5438–5448, 2022

[33]

Cross-ray neural radiance fields for novel-view synthesis from unconstrained image collections

Yifan Yang, Shuhai Zhang, Zixiong Huang, Yubing Zhang, and Mingkui Tan. Cross-ray neural radiance fields for novel-view synthesis from unconstrained image collections. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 15901–15911, 2023

[34]

Mip-splatting: Alias-free 3d gaussian splatting

Zehao Yu, Anpei Chen, Binbin Huang, Torsten Sattler, and Andreas Geiger. Mip-splatting: Alias-free 3d gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 19447–19456, 2024

[35]

Gaussian in the wild: 3d gaussian splatting for unconstrained image collections

Dongbin Zhang, Chuming Wang, Weitao Wang, Peihao Li, Minghan Qin, and Haoqian Wang. Gaussian in the wild: 3d gaussian splatting for unconstrained image collections. In European Conference on Computer Vision, pages 341–359. Springer, 2024

[36]

The unreasonable effectiveness of deep features as a perceptual metric

Richard Zhang, Phillip Isola, Alexei A Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 586–595, 2018

Figure 6: Additional qualitative comparisons with DroneSplat [28]. Columns: Ground Truth, Ours, DroneSplat. Table ...