pith. machine review for the scientific record.

arxiv: 2605.09662 · v1 · submitted 2026-05-10 · 💻 cs.CV

Recognition: 2 Lean theorem links

BEA-GS: BEyond RAdiance Supervision in 3DGS for Precise Object Extraction

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 04:23 UTC · model grok-4.3

classification 💻 cs.CV
keywords 3D Gaussian Splatting · object extraction · semantic boundaries · geometry optimization · 3D reconstruction · boundary segmentation · novel view synthesis

The pith

Two losses reshape 3D Gaussians to match semantic edges

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that standard 3D Gaussian Splatting can be extended past radiance supervision to actively reshape scene geometry using semantic boundary cues. It adds one loss that pulls visible Gaussians into alignment with those boundaries by sending gradients straight through the rasterizer, and a second loss that repositions non-visible Gaussians through direct parameter updates. If the claim holds, extracted objects gain near-perfect edges that support reliable editing and asset reuse from captured scenes. A sympathetic reader would care because unoptimized geometry has been the main barrier to clean object-level work in these representations. Broad tests against twelve prior methods on four datasets with six metrics are offered as evidence that boundary quality improves overall.
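
To make the two-loss structure concrete, here is a minimal PyTorch-style sketch of how such an objective could be assembled on top of the usual photometric term. Everything below is an assumption for illustration: the paper's actual loss definitions, the weights lam_b and lam_o, and the tensor shapes are not specified in this summary.

    import torch
    import torch.nn.functional as F

    def boundary_term(rendered_mask, semantic_mask):
        # Image-space term: penalize disagreement between the rendered object
        # mask and the supplied semantic mask. In the paper's setup the
        # gradient would reach Gaussian parameters through the differentiable
        # rasterizer that produced rendered_mask; here a raw tensor stands in.
        return F.binary_cross_entropy(rendered_mask.clamp(1e-6, 1 - 1e-6), semantic_mask)

    def occupancy_term(hidden_means, anchor):
        # Toy direct-parameter term: pull non-visible Gaussian centers toward
        # an anchor point inside the object, bypassing rasterization entirely.
        return ((hidden_means - anchor) ** 2).sum(dim=-1).mean()

    rendered_mask = torch.rand(64, 64, requires_grad=True)  # stand-in for a rasterizer output
    semantic_mask = (torch.rand(64, 64) > 0.5).float()
    hidden_means = torch.randn(100, 3, requires_grad=True)  # low-transmittance Gaussian centers

    lam_b, lam_o = 0.1, 0.05  # assumed loss weights, added to the usual photometric term
    loss = lam_b * boundary_term(rendered_mask, semantic_mask) \
         + lam_o * occupancy_term(hidden_means, hidden_means.detach().mean(0))
    loss.backward()  # both terms deposit gradients on their respective parameters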

Core claim

We advance this concept further by proposing a novel solution that provides near perfect boundaries in object extraction. We do so by introducing two new losses in the optimization that take care of: 1) a loss that modifies the geometry of visible Gaussians to respect semantic boundaries, and 2) a loss that adjusts the geometry of non-visible Gaussians that appear once the object is extracted. Our first loss propagates gradients directly through the rasterization, allowing for seamless integration within the optimization of the Gaussian parameters. The second loss also propagates gradients to Gaussian parameters but does so without passing through the rasterization, enabling modification of the scene's geometry even when little transmittance reaches a Gaussian (partial or non-visible).

What carries the argument

A pair of gradient-based losses: one that updates visible Gaussians to fit semantic boundaries via direct rasterization gradients, and one that updates non-visible Gaussians directly without rasterization.
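
A hedged sketch of the two gradient paths this describes: an image-space term can only move Gaussians the rasterizer actually composites, so low-transmittance Gaussians get a parameter-space term instead. The visibility cutoff and the toy direct objective below are invented for illustration, not taken from the paper.

    import torch

    means = torch.randn(500, 3, requires_grad=True)  # Gaussian centers
    transmittance = torch.rand(500)                  # stand-in: per-Gaussian visibility

    # Path 1: image-space losses only reach Gaussians the rasterizer "sees".
    visible = transmittance > 0.5                    # assumed visibility cutoff

    # Path 2: the rest get a direct parameter-space objective. Here, a toy
    # rule that pulls hidden centers toward the mean of the visible ones.
    target = means[visible].detach().mean(0)
    direct_loss = ((means[~visible] - target) ** 2).mean()
    direct_loss.backward()

    # Only the hidden Gaussians received gradient from this term:
    assert means.grad[visible].abs().sum().item() == 0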

Load-bearing premise

Accurate semantic boundary information must be available to steer the geometry adjustments without creating new artifacts.
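
For concreteness, the boundary signal this premise presupposes can be derived from per-view binary masks. A minimal sketch follows; the one-pixel morphological gradient is an arbitrary choice here, not necessarily the paper's.

    import numpy as np
    from scipy import ndimage

    def mask_to_boundary(mask: np.ndarray) -> np.ndarray:
        # Boundary = pixels where dilation and erosion disagree
        # (a one-pixel morphological gradient).
        dil = ndimage.binary_dilation(mask)
        ero = ndimage.binary_erosion(mask)
        return dil & ~ero

    mask = np.zeros((8, 8), dtype=bool)
    mask[2:6, 2:6] = True
    print(mask_to_boundary(mask).astype(int))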

What would settle it

Apply the method to scenes supplied with deliberately noisy or shifted semantic boundaries and check whether extracted object metrics fall below the levels reported for competing techniques.
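
A sketch of what such a test could look like, under assumptions: extract_and_score stands in for the full extraction pipeline plus a boundary metric, and the perturbation model (erosion/dilation plus per-view shifts) is one plausible choice among many.

    import numpy as np
    from scipy import ndimage

    def perturb_mask(mask, rng, max_shift=3, iters=2):
        # Random dilation or erosion plus a small per-view shift, simulating
        # noisy or misaligned semantic supervision.
        op = ndimage.binary_dilation if rng.random() < 0.5 else ndimage.binary_erosion
        noisy = op(mask, iterations=iters)
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        return np.roll(noisy, shift=(int(dy), int(dx)), axis=(0, 1))

    def stress_test(masks, extract_and_score, seed=0):
        # extract_and_score is a placeholder: run the extraction pipeline on
        # these masks and return a boundary metric (e.g. Boundary IoU).
        rng = np.random.default_rng(seed)
        clean = extract_and_score(masks)
        noisy = extract_and_score([perturb_mask(m, rng) for m in masks])
        return clean, noisy  # a large gap would undercut the robustness claim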

Figures

Figures reproduced from arXiv: 2605.09662 by Adrian Penate-Sanchez, Alessio Mazzucchelli, Francesc Moreno-Noguer, Jordi Sanchez-Riera, Jorge Bustos-Sanchez, Maria Naranjo-Almeida, Mariella Dimiccoli.

Figure 1: Top row: Lifting 2D semantic segmentations to 3DGS scenes often produces incorrect object boundaries due to scenes being constructed as a single entity (left). Bottom row: Early approaches freeze the geometry and only assign semantic labels to existing Gaussians, often leading to inaccurate object extractions (bottom left). This limitation is partially mitigated by radiance-based supervision that allows vi…

Figure 2: Main Diagram: Given a pretrained 2DGS scene and two-stage segmentation masks obtained using SAM2 [38] …

Figure 3: Multiview Reprojection. First column: original RGB images. Second column: two-stage segmentation masks obtained using SAM2 [38]. Third column: improved segmentation masks using our multiview reprojection. The first row shows how our approach incorporates regions that were missed. The second row shows that, by understanding the 3D structure of the scene, our approach yields much more fine-grained details. …

Figure 4: Optimization Losses. First column: original Gaussians assignment. Second column: Gaussians optimized using only the 2D Boundary Loss. Third column: Gaussians optimized with both the 2D Boundary Loss and the 3D Occupancy Loss.

Figure 5: Qualitative comparison across methods (rows) and scenes (columns): room (MipNeRF-360), teatime (LeRF), fortress (LLFF), …
Original abstract

Most Gaussian Splatting techniques that provide a 3D semantic representation of the scene do not optimize the underlying 3D geometry, making object-level editing or asset extraction challenging. Recent methods, such as COBGS, Trace3D, ObjectGS, acknowledge this limitation and propose approaches that modify the scene's geometry to represent the underlying semantics. We advance this concept further by proposing a novel solution that provides near perfect boundaries in object extraction. We do so by introducing two new losses in the optimization that take care of: 1) a loss that modifies the geometry of visible Gaussians to respect semantic boundaries, and 2) a loss that adjusts the geometry of non-visible Gaussians that appear once the object is extracted. Our first loss propagates gradients directly through the rasterization, allowing for seamless integration within the optimization of the Gaussian parameters. The second loss also propagates gradients to Gaussian parameters but does so without passing through the rasterization, enabling modification of the scene's geometry even when little transmittance reaches a Gaussian (partial or non-visible). Exhaustive comparisons with 12 state of the art methods across 4 datasets, using six metrics, demonstrate that our approach produces overall the best boundary segmentation to date.

Editorial analysis

A structured set of objections, weighed in public.

Referee report, simulated author's rebuttal, a circularity audit, and an axiom ledger. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces BEA-GS, an extension to 3D Gaussian Splatting for semantic object extraction. It adds two losses: one that back-propagates gradients through rasterization to adjust positions and covariances of visible Gaussians to match semantic boundaries, and a second that directly updates non-visible (low-transmittance) Gaussians without rasterization. Exhaustive experiments claim superiority over 12 prior methods across 4 datasets and 6 metrics for boundary quality.

Significance. If the results are robust, the work would meaningfully advance semantic 3DGS by directly optimizing geometry rather than radiance only, enabling cleaner object-level editing and asset extraction. The scale of the experimental comparison (12 baselines, 4 datasets, 6 metrics) is a positive feature that could establish a new empirical reference point, provided the evaluation fully captures 3D consistency.

major comments (3)
  1. [Abstract and §3] Abstract and §3 (loss definitions): the two losses directly ingest external semantic boundary labels to drive Gaussian position/covariance updates. No sensitivity analysis or perturbation experiments are reported to quantify how label noise, edge incompleteness, or cross-view inconsistency in the semantics propagates into new 3D artifacts or floaters; this is load-bearing for the 'near perfect boundaries' and 'best to date' claims.
  2. [§4] §4 (evaluation): the six metrics appear to be computed on final 2D rendered boundaries. The manuscript does not report 3D-specific checks such as multi-view coherence of extracted Gaussians, surface consistency after object removal, or re-rendering under perturbed semantic inputs, leaving open whether the reported gains reflect true 3D geometry improvement.
  3. [§3.2] §3.2 (second loss): bypassing rasterization for low-transmittance Gaussians allows direct parameter updates, but the paper provides no analysis of how this affects global scene consistency or introduces unmeasured floaters once the object is extracted; this directly impacts the claim of precise object extraction.
minor comments (2)
  1. [§3] Notation for the two losses should be introduced with explicit equations and variable definitions in §3 to improve readability.
  2. Figure captions could more explicitly annotate the boundary improvements relative to baselines.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive review and positive assessment of the work's potential significance and experimental scale. We address each major comment point-by-point below with honest responses and have revised the manuscript to incorporate additional analysis where needed.

read point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (loss definitions): the two losses directly ingest external semantic boundary labels to drive Gaussian position/covariance updates. No sensitivity analysis or perturbation experiments are reported to quantify how label noise, edge incompleteness, or cross-view inconsistency in the semantics propagates into new 3D artifacts or floaters; this is load-bearing for the 'near perfect boundaries' and 'best to date' claims.

    Authors: We agree that explicit sensitivity analysis to semantic label imperfections would strengthen the robustness claims. Our cross-dataset results already span annotations of varying quality, providing indirect support, but to directly address the concern we have added a new subsection with perturbation experiments (label noise, edge incompleteness, and cross-view inconsistency). These show graceful degradation without substantial new artifacts or floaters, supporting the boundary quality claims. revision: yes

  2. Referee: [§4] §4 (evaluation): the six metrics appear to be computed on final 2D rendered boundaries. The manuscript does not report 3D-specific checks such as multi-view coherence of extracted Gaussians, surface consistency after object removal, or re-rendering under perturbed semantic inputs, leaving open whether the reported gains reflect true 3D geometry improvement.

    Authors: We acknowledge that 2D boundary metrics alone leave room for questions about 3D consistency. The original manuscript already provides qualitative multi-view renderings demonstrating coherence. In the revision we have added quantitative 3D checks, including multi-view Gaussian position variance, surface consistency metrics post-extraction, and re-rendering results under perturbed semantics, confirming that the reported gains correspond to genuine 3D geometry improvements (one concrete form of such a check is sketched after these responses). revision: yes

  3. Referee: [§3.2] §3.2 (second loss): bypassing rasterization for low-transmittance Gaussians allows direct parameter updates, but the paper provides no analysis of how this affects global scene consistency or introduces unmeasured floaters once the object is extracted; this directly impacts the claim of precise object extraction.

    Authors: The second loss is intended to refine non-visible Gaussians precisely to avoid floaters upon extraction. While main results illustrate cleaner extraction, we agree an explicit consistency analysis is warranted. The revised manuscript includes a targeted ablation quantifying global scene consistency and floater counts before/after extraction, using both quantitative metrics and multi-view visual checks. revision: yes
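
One concrete form the multi-view coherence check proposed in response 2 could take, sketched under assumptions: rendered_masks are the extracted object's masks re-rendered per training view, and the IoU-spread statistic is illustrative rather than the authors' metric.

    import numpy as np

    def multiview_consistency(rendered_masks, semantic_masks):
        # IoU of the extracted object's rendered mask against each view's
        # semantic mask; a low spread across views suggests the extracted
        # geometry is 3D-consistent rather than tuned to a few viewpoints.
        ious = []
        for r, s in zip(rendered_masks, semantic_masks):
            inter = np.logical_and(r, s).sum()
            union = np.logical_or(r, s).sum()
            ious.append(inter / max(union, 1))
        return float(np.mean(ious)), float(np.std(ious))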

Circularity Check

0 steps flagged

No significant circularity; method and claims are self-contained

full rationale

The paper defines two new loss terms that directly incorporate provided semantic boundary labels to adjust Gaussian positions and covariances (one via rasterization gradients, one bypassing it for low-transmittance cases). These losses are constructed from external inputs and the standard 3DGS rendering pipeline rather than being fitted to or defined in terms of the final boundary metrics being reported. Performance claims rest on empirical comparisons against 12 independent baselines across 4 datasets and 6 metrics, with no reduction of the central result to a self-citation chain, ansatz smuggled via prior work, or renaming of known patterns. The derivation chain therefore remains independent of its own outputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Based solely on the abstract, the central claim rests on the introduction of two new loss terms that modify Gaussian geometry using semantic boundaries. The abstract does not spell out free parameters, axioms, or invented entities explicitly; the ledger below records what can be inferred.

free parameters (1)
  • Loss weighting factors
    Relative weights between the new losses and standard 3DGS terms are likely tuned hyperparameters, though not specified in the abstract.
axioms (1)
  • domain assumption: Semantic segmentation or boundary labels are available as input
    The losses explicitly rely on semantic information to define and enforce boundaries during optimization.

pith-pipeline@v0.9.0 · 5552 in / 1209 out tokens · 58723 ms · 2026-05-12T04:23:33.573796+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages

[1] Adobe Inc. Adobe Photoshop.
[2] Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, and Peter Hedman. Mip-NeRF 360: Unbounded anti-aliased neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5470–5479, 2022.
[3] Jiazhong Cen, Jiemin Fang, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, and Qi Tian. Segment any 3D Gaussians. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 1971–1979, 2025.
[4] Rohan Chacko, Nicolai Häni, Eldar Khaliullin, Lin Sun, and Douglas Lee. Lifting by Gaussians: A simple, fast and flexible method for 3D instance segmentation. In 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 3497–3507. IEEE, 2025.
[5] Brian Chao, Hung-Yu Tseng, Lorenzo Porzi, Chen Gao, Tuotuo Li, Qinbo Li, Ayush Saraf, Jia-Bin Huang, Johannes Kopf, Gordon Wetzstein, et al. Textured Gaussians for enhanced 3D scene appearance modeling. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 8964–8974, 2025.
[6] Minghao Chen, Iro Laina, and Andrea Vedaldi. DGE: Direct Gaussian 3D editing by consistent multi-view editing. In European Conference on Computer Vision, pages 74–92. Springer, 2024.
[7] Yiwen Chen, Zilong Chen, Chi Zhang, Feng Wang, Xiaofeng Yang, Yikai Wang, Zhongang Cai, Lei Yang, Huaping Liu, and Guosheng Lin. GaussianEditor: Swift and controllable 3D editing with Gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21476–21485, 2024.
[8] Bowen Cheng, Ross Girshick, Piotr Dollár, Alexander C. Berg, and Alexander Kirillov. Boundary IoU: Improving object-centric image segmentation evaluation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 15334–15342, 2021.
[9] Seokhun Choi, Hyeonseop Song, Jaechul Kim, Taehyeong Kim, and Hoseok Do. Click-Gaussian: Interactive segmentation to any 3D Gaussians. In European Conference on Computer Vision, pages 289–305. Springer, 2024.
[10] Mahtab Dahaghin, Milind G. Padalkar, Matteo Toso, and Alessio Del Bue. SplatFill: 3D scene inpainting via depth-guided Gaussian splatting. arXiv preprint arXiv:2509.07809.
[11] Sanjoy Dasgupta and Samory Kpotufe. Optimal rates for k-NN density and mode estimation. In Advances in Neural Information Processing Systems. Curran Associates, Inc.
[12] Antoine Guédon and Vincent Lepetit. Gaussian Frosting: Editable complex radiance fields with real-time rendering. In European Conference on Computer Vision, pages 413–430. Springer, 2024.
[13] Xu Hu, Yuxi Wang, Lue Fan, Chuanchen Luo, Junsong Fan, Zhen Lei, Qing Li, Junran Peng, and Zhaoxiang Zhang. SAGD: Boundary-enhanced segment anything in 3D Gaussian via Gaussian decomposition. arXiv preprint arXiv:2401.17857, 2024.
[14] Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, and Shenghua Gao. 2D Gaussian splatting for geometrically accurate radiance fields. In ACM SIGGRAPH 2024 Conference Papers, pages 1–11, 2024.
[15] Sheng-Yu Huang, Zi-Ting Chou, and Yu-Chiang Frank Wang. 3D Gaussian inpainting with depth-guided cross-view consistency. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pages 26704–26713.
[16] Vishnu Jaganathan, Hannah Hanyun Huang, Muhammad Zubair Irshad, Varun Jampani, Amit Raj, and Zsolt Kira. ICE-G: Image conditional editing of 3D Gaussian splats. arXiv preprint arXiv:2406.08488, 2024.
[17] Umangi Jain, Ashkan Mirzaei, and Igor Gilitschenski. GaussianCut: Interactive segmentation via graph cut for 3D Gaussian splatting. Advances in Neural Information Processing Systems, 37:89184–89212, 2024.
[18] SungMin Jang and Wonjun Kim. Identity-aware language Gaussian splatting for open-vocabulary 3D semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 20467–20476, 2025.
[19] Kim Jun-Seong, GeonU Kim, Kim Yu-Ji, Yu-Chiang Frank Wang, Jaesung Choe, and Tae-Hyun Oh. Dr. Splat: Directly referring 3D Gaussian splatting via direct language embedding registration. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pages 14137–14146, 2025.
[20] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, George Drettakis, et al. 3D Gaussian splatting for real-time radiance field rendering. ACM Trans. Graph., 42(4):139–1.
[21] Justin Kerr, Chung Min Kim, Ken Goldberg, Angjoo Kanazawa, and Matthew Tancik. LERF: Language embedded radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 19729–19739.
[22] Shakiba Kheradmand, Daniel Rebain, Gopal Sharma, Weiwei Sun, Yang-Che Tseng, Hossam Isack, Abhishek Kar, Andrea Tagliasacchi, and Kwang Moo Yi. 3D Gaussian splatting as Markov chain Monte Carlo. Advances in Neural Information Processing Systems, 37:80965–80986, 2024.
[23] Hayeon Kim, Ji Ha Jang, and Se Young Chun. Robust 3D-masked part-level editing in 3D Gaussian splatting with regularized score distillation sampling. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5501–5510, 2025.
[24] Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, et al. Segment anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4015–4026, 2023.
[25] Kunhao Liu, Fangneng Zhan, Jiahui Zhang, Muyu Xu, Yingchen Yu, Abdulmotaleb El Saddik, Christian Theobalt, Eric Xing, and Shijian Lu. Weakly supervised 3D open-vocabulary segmentation. Advances in Neural Information Processing Systems, 36:53433–53456, 2023.
[26] Don O. Loftsgaarden and Charles P. Quesenberry. A nonparametric estimate of a multivariate density function. Annals of Mathematical Statistics, 36:1049–1051, 1965.
[27] Tao Lu, Mulin Yu, Linning Xu, Yuanbo Xiangli, Limin Wang, Dahua Lin, and Bo Dai. Scaffold-GS: Structured 3D Gaussians for view-adaptive rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20654–20664, 2024.
[28] Weijie Lyu, Xueting Li, Abhijit Kundu, Yi-Hsuan Tsai, and Ming-Hsuan Yang. Gaga: Group any Gaussians via 3D-aware memory bank. arXiv preprint arXiv:2404.07977, 2024.
[29] Alessio Mazzucchelli, Ivan Ojeda-Martin, Fernando Rivas-Manzaneque, Elena Garces, Adrian Penate-Sanchez, and Francesc Moreno-Noguer. Virgi: View-dependent instant recoloring of 3D Gaussians splats. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2026.
[30] Ben Mildenhall, Pratul P. Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, and Abhishek Kar. Local light field fusion: Practical view synthesis with prescriptive sampling guidelines. ACM Transactions on Graphics (ToG), 38(4):1–14, 2019.
[31] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. NeRF: Representing scenes as neural radiance fields for view synthesis. In European Conference on Computer Vision, pages 405–421. Springer, 2020.
[32] Ashkan Mirzaei, Riccardo De Lutio, Seung Wook Kim, David Acuna, Jonathan Kelly, Sanja Fidler, Igor Gilitschenski, and Zan Gojcic. RefFusion: Reference adapted diffusion models for 3D scene inpainting. arXiv preprint arXiv:2404.10765, 2024.
[33] Nicolas Moenne-Loccoz, Ashkan Mirzaei, Or Perel, Riccardo De Lutio, Janick Martinez Esturo, Gavriel State, Sanja Fidler, Nicholas Sharp, and Zan Gojcic. 3D Gaussian ray tracing: Fast tracing of particle scenes. ACM Transactions on Graphics (TOG), 43(6):1–19, 2024.
[34] Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. Instant neural graphics primitives with a multiresolution hash encoding. ACM Transactions on Graphics (TOG), 41(4):1–15, 2022.
[35] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An imperative style, high-performance deep learning library, 2019.
[36] Minghan Qin, Wanhua Li, Jiawei Zhou, Haoqian Wang, and Hanspeter Pfister. LangSplat: 3D language Gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20051–20060, 2024.
[37] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763. PMLR, 2021.
[38] Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloe Rolland, Laura Gustafson, Eric Mintun, Junting Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao-Yuan Wu, Ross Girshick, Piotr Dollár, and Christoph Feichtenhofer. SAM 2: Segment anything in images and videos. In The Thirteenth International Conference on Learning Representations.
[39] Zhongzheng Ren, Aseem Agarwala, Bryan Russell, Alexander G. Schwing, and Oliver Wang. Neural volumetric object selection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6133–6142.
[40] Hongyu Shen, Junfeng Ni, Yixin Chen, Weishuo Li, Mingtao Pei, and Siyuan Huang. Trace3D: Consistent segmentation lifting via Gaussian instance tracing. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 6656–6666, 2025.
[41] Qiuhong Shen, Xingyi Yang, and Xinchao Wang. FlashSplat: 2D to 3D Gaussian splatting segmentation solved optimally. In European Conference on Computer Vision, pages 456–472. Springer, 2024.
[42] Jin-Chuan Shi, Miao Wang, Hao-Bin Duan, and Shao-Hua Guan. Language embedded 3D Gaussians for open-vocabulary scene understanding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5333–5343, 2024.
[43] Yuxin Wang, Qianyi Wu, Guofeng Zhang, and Dan Xu. Learning 3D geometry and feature consistent Gaussian splatting for object removal. In European Conference on Computer Vision, pages 1–17. Springer, 2024.
[44] Yuxuan Wang, Xuanyu Yi, Zike Wu, Na Zhao, Long Chen, and Hanwang Zhang. View-consistent 3D editing with Gaussian splatting. In European Conference on Computer Vision, pages 404–420. Springer, 2024.
[45] Zhaonan Wang, Manyi Li, and Changhe Tu. Ag2aussian: Anchor-graph structured Gaussian splatting for instance-level 3D scene understanding and editing. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 26806–26816, 2025.
[46] Jing Wu, Jia-Wang Bian, Xinghui Li, Guangrun Wang, Ian Reid, Philip Torr, and Victor Adrian Prisacariu. GaussCtrl: Multi-view consistent text-driven 3D Gaussian splatting editing. In European Conference on Computer Vision, pages 55–.
[47] Qi Wu, Janick Martinez Esturo, Ashkan Mirzaei, Nicolas Moenne-Loccoz, and Zan Gojcic. 3DGUT: Enabling distorted cameras and secondary rays in Gaussian splatting. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 26036–26046, 2025.
[48] Yanmin Wu, Jiarui Meng, Haijie Li, Chenming Wu, Yahao Shi, Xinhua Cheng, Chen Zhao, Haocheng Feng, Errui Ding, Jingdong Wang, et al. OpenGaussian: Towards point-level 3D Gaussian-based open vocabulary understanding. Advances in Neural Information Processing Systems, 37:19114–19138.
[49] Zesong Yang, Bangbang Yang, Wenqi Dong, Chenxuan Cao, Liyuan Cui, Yuewen Ma, Zhaopeng Cui, and Hujun Bao. InstaScene: Towards complete 3D instance decomposition and reconstruction from cluttered scenes. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 7771–7781, 2025.
[50] Mingqiao Ye, Martin Danelljan, Fisher Yu, and Lei Ke. Gaussian grouping: Segment and edit anything in 3D scenes. In European Conference on Computer Vision, pages 162–179. Springer, 2024.
[51] Hongjia Zhai, Hai Li, Zhenzhe Li, Xiaokun Pan, Yijia He, and Guofeng Zhang. PanoGS: Gaussian-based panoptic segmentation for 3D open vocabulary scene understanding. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pages 14114–14124, 2025.
[52] Jiaxin Zhang, Junjun Jiang, Youyu Chen, Kui Jiang, and Xianming Liu. COB-GS: Clear object boundaries in 3DGS segmentation based on boundary-adaptive Gaussian splitting. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pages 19335–19344, 2025.
[53] Qihang Zhang, Yinghao Xu, Chaoyang Wang, Hsin-Ying Lee, Gordon Wetzstein, Bolei Zhou, and Ceyuan Yang. 3DitScene: Editing any scene via language-guided disentangled Gaussian splatting. In The Thirteenth International Conference on Learning Representations, 2025.
[54] Jun Zhou, Dinghao Li, Nannan Li, and Mingjie Wang. High-fidelity 3D Gaussian inpainting: Preserving multi-view consistency and photorealistic details. Computers & Graphics, page 104362, 2025.
[55] Shijie Zhou, Haoran Chang, Sicheng Jiang, Zhiwen Fan, Zehao Zhu, Dejia Xu, Pradyumna Chari, Suya You, Zhangyang Wang, and Achuta Kadambi. Feature 3DGS: Supercharging 3D Gaussian splatting to enable distilled feature fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21676–21685, 2024.
[56] Runsong Zhu, Shi Qiu, Zhengzhe Liu, Ka-Hei Hui, Qianyi Wu, Pheng-Ann Heng, and Chi-Wing Fu. Rethinking end-to-end 2D to 3D scene segmentation in Gaussian splatting. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pages 3656–3665, 2025.
[57] Ruijie Zhu, Mulin Yu, Linning Xu, Lihan Jiang, Yixuan Li, Tianzhu Zhang, Jiangmiao Pang, and Bo Dai. ObjectGS: Object-aware scene reconstruction and scene understanding via Gaussian splatting. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 8350–8360, 2025.
[58] Xingxing Zuo, Pouya Samangouei, Yunwen Zhou, Yan Di, and Mingyang Li. FMGS: Foundation model embedded 3D Gaussian splatting for holistic 3D scene understanding. International Journal of Computer Vision, 133(2):611–627, 2025.