pith. machine review for the scientific record.

arxiv: 2605.09662 · v1 · submitted 2026-05-10 · 💻 cs.CV

Recognition: 2 Lean theorem links

BEA-GS: BEyond RAdiance Supervision in 3DGS for Precise Object Extraction

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 04:23 UTC · model grok-4.3

classification 💻 cs.CV
keywords 3D Gaussian Splatting · object extraction · semantic boundaries · geometry optimization · 3D reconstruction · boundary segmentation · novel view synthesis

The pith

Two losses reshape 3D Gaussians to match semantic edges

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that standard 3D Gaussian Splatting can be extended past radiance supervision to actively reshape scene geometry using semantic boundary cues. It adds one loss that pulls visible Gaussians into alignment with those boundaries by sending gradients straight through the rasterizer, and a second loss that repositions non-visible Gaussians through direct parameter updates. If the claim holds, extracted objects gain near-perfect edges that support reliable editing and asset reuse from captured scenes. A sympathetic reader would care because unoptimized geometry has been the main barrier to clean object-level work in these representations. Broad tests against twelve prior methods on four datasets with six metrics are offered as evidence that boundary quality improves overall.
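
To make the two-loss structure concrete, here is a minimal PyTorch-style sketch of how such an objective could be assembled on top of the usual photometric term. Everything below is an assumption for illustration: the paper's actual loss definitions, the weights lam_b and lam_o, and the tensor shapes are not specified in this summary.

    import torch
    import torch.nn.functional as F

    def boundary_term(rendered_mask, semantic_mask):
        # Image-space term: penalize disagreement between the rendered object
        # mask and the supplied semantic mask. In the paper's setup the
        # gradient would reach Gaussian parameters through the differentiable
        # rasterizer that produced rendered_mask; here a raw tensor stands in.
        return F.binary_cross_entropy(rendered_mask.clamp(1e-6, 1 - 1e-6), semantic_mask)

    def occupancy_term(hidden_means, anchor):
        # Toy direct-parameter term: pull non-visible Gaussian centers toward
        # an anchor point inside the object, bypassing rasterization entirely.
        return ((hidden_means - anchor) ** 2).sum(dim=-1).mean()

    rendered_mask = torch.rand(64, 64, requires_grad=True)  # stand-in for a rasterizer output
    semantic_mask = (torch.rand(64, 64) > 0.5).float()
    hidden_means = torch.randn(100, 3, requires_grad=True)  # low-transmittance Gaussian centers

    lam_b, lam_o = 0.1, 0.05  # assumed loss weights, added to the usual photometric term
    loss = lam_b * boundary_term(rendered_mask, semantic_mask) \
         + lam_o * occupancy_term(hidden_means, hidden_means.detach().mean(0))
    loss.backward()  # both terms deposit gradients on their respective parameters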

Core claim

We advance this concept further by proposing a novel solution that provides near perfect boundaries in object extraction. We do so by introducing two new losses in the optimization that take care of: 1) a loss that modifies the geometry of visible Gaussians to respect semantic boundaries, and 2) a loss that adjusts the geometry of non-visible Gaussians that appear once the object is extracted. Our first loss propagates gradients directly through the rasterization, allowing for seamless integration within the optimization of the Gaussian parameters. The second loss also propagates gradients to Gaussian parameters but does so without passing through the rasterization, enabling modification of the scene's geometry even when little transmittance reaches a Gaussian (partial or non-visible).

What carries the argument

A pair of gradient-based losses: one that updates visible Gaussians to fit semantic boundaries via direct rasterization gradients, and one that updates non-visible Gaussians directly without rasterization.
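
A hedged sketch of the two gradient paths this describes: an image-space term can only move Gaussians the rasterizer actually composites, so low-transmittance Gaussians get a parameter-space term instead. The visibility cutoff and the toy direct objective below are invented for illustration, not taken from the paper.

    import torch

    means = torch.randn(500, 3, requires_grad=True)  # Gaussian centers
    transmittance = torch.rand(500)                  # stand-in: per-Gaussian visibility

    # Path 1: image-space losses only reach Gaussians the rasterizer "sees".
    visible = transmittance > 0.5                    # assumed visibility cutoff

    # Path 2: the rest get a direct parameter-space objective. Here, a toy
    # rule that pulls hidden centers toward the mean of the visible ones.
    target = means[visible].detach().mean(0)
    direct_loss = ((means[~visible] - target) ** 2).mean()
    direct_loss.backward()

    # Only the hidden Gaussians received gradient from this term:
    assert means.grad[visible].abs().sum().item() == 0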

Load-bearing premise

Accurate semantic boundary information must be available to steer the geometry adjustments without creating new artifacts.
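
For concreteness, the boundary signal this premise presupposes can be derived from per-view binary masks. A minimal sketch follows; the one-pixel morphological gradient is an arbitrary choice here, not necessarily the paper's.

    import numpy as np
    from scipy import ndimage

    def mask_to_boundary(mask: np.ndarray) -> np.ndarray:
        # Boundary = pixels where dilation and erosion disagree
        # (a one-pixel morphological gradient).
        dil = ndimage.binary_dilation(mask)
        ero = ndimage.binary_erosion(mask)
        return dil & ~ero

    mask = np.zeros((8, 8), dtype=bool)
    mask[2:6, 2:6] = True
    print(mask_to_boundary(mask).astype(int))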

What would settle it

Apply the method to scenes supplied with deliberately noisy or shifted semantic boundaries and check whether extracted object metrics fall below the levels reported for competing techniques.
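
A sketch of what such a test could look like, under assumptions: extract_and_score stands in for the full extraction pipeline plus a boundary metric, and the perturbation model (erosion/dilation plus per-view shifts) is one plausible choice among many.

    import numpy as np
    from scipy import ndimage

    def perturb_mask(mask, rng, max_shift=3, iters=2):
        # Random dilation or erosion plus a small per-view shift, simulating
        # noisy or misaligned semantic supervision.
        op = ndimage.binary_dilation if rng.random() < 0.5 else ndimage.binary_erosion
        noisy = op(mask, iterations=iters)
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        return np.roll(noisy, shift=(int(dy), int(dx)), axis=(0, 1))

    def stress_test(masks, extract_and_score, seed=0):
        # extract_and_score is a placeholder: run the extraction pipeline on
        # these masks and return a boundary metric (e.g. Boundary IoU).
        rng = np.random.default_rng(seed)
        clean = extract_and_score(masks)
        noisy = extract_and_score([perturb_mask(m, rng) for m in masks])
        return clean, noisy  # a large gap would undercut the robustness claim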

Figures

Figures reproduced from arXiv: 2605.09662 by Adrian Penate-Sanchez, Alessio Mazzucchelli, Francesc Moreno-Noguer, Jordi Sanchez-Riera, Jorge Bustos-Sanchez, Maria Naranjo-Almeida, Mariella Dimiccoli.

Figure 1: Top row: Lifting 2D semantic segmentations to 3DGS scenes often produces incorrect object boundaries due to scenes being constructed as a single entity (left). Bottom row: Early approaches freeze the geometry and only assign semantic labels to existing Gaussians, often leading to inaccurate object extractions (bottom left). This limitation is partially mitigated by radiance-based supervision that allows vi…

Figure 2: Main Diagram: Given a pretrained 2DGS scene and two-stage segmentation masks obtained using SAM2 [38] …

Figure 3: Multiview Reprojection. First column: original RGB images. Second column: two-stage segmentation masks obtained using SAM2 [38]. Third column: improved segmentation masks using our multiview reprojection. The first row shows how our approach incorporates regions that were missed. The second row shows that, by understanding the 3D structure of the scene, our approach yields much more fine-grained details. …

Figure 4: Optimization Losses. First column: original Gaussians assignment. Second column: Gaussians optimized using only the 2D Boundary Loss. Third column: Gaussians optimized with both the 2D Boundary Loss and the 3D Occupancy Loss.

Figure 5: Qualitative comparison across methods (rows) and scenes (columns): room (MipNeRF-360), teatime (LeRF), fortress (LLFF), …
Original abstract

Most Gaussian Splatting techniques that provide a 3D semantic representation of the scene do not optimize the underlying 3D geometry, making object-level editing or asset extraction challenging. Recent methods, such as COBGS, Trace3D, ObjectGS, acknowledge this limitation and propose approaches that modify the scene's geometry to represent the underlying semantics. We advance this concept further by proposing a novel solution that provides near perfect boundaries in object extraction. We do so by introducing two new losses in the optimization that take care of: 1) a loss that modifies the geometry of visible Gaussians to respect semantic boundaries, and 2) a loss that adjusts the geometry of non-visible Gaussians that appear once the object is extracted. Our first loss propagates gradients directly through the rasterization, allowing for seamless integration within the optimization of the Gaussian parameters. The second loss also propagates gradients to Gaussian parameters but does so without passing through the rasterization, enabling modification of the scene's geometry even when little transmittance reaches a Gaussian (partial or non-visible). Exhaustive comparisons with 12 state of the art methods across 4 datasets, using six metrics, demonstrate that our approach produces overall the best boundary segmentation to date.

Editorial analysis

A structured set of objections, weighed in public.

Referee report, simulated author's rebuttal, a circularity audit, and an axiom ledger. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces BEA-GS, an extension to 3D Gaussian Splatting for semantic object extraction. It adds two losses: one that back-propagates gradients through rasterization to adjust positions and covariances of visible Gaussians to match semantic boundaries, and a second that directly updates non-visible (low-transmittance) Gaussians without rasterization. Exhaustive experiments claim superiority over 12 prior methods across 4 datasets and 6 metrics for boundary quality.

Significance. If the results are robust, the work would meaningfully advance semantic 3DGS by directly optimizing geometry rather than radiance only, enabling cleaner object-level editing and asset extraction. The scale of the experimental comparison (12 baselines, 4 datasets, 6 metrics) is a positive feature that could establish a new empirical reference point, provided the evaluation fully captures 3D consistency.

major comments (3)
  1. [Abstract and §3] Abstract and §3 (loss definitions): the two losses directly ingest external semantic boundary labels to drive Gaussian position/covariance updates. No sensitivity analysis or perturbation experiments are reported to quantify how label noise, edge incompleteness, or cross-view inconsistency in the semantics propagates into new 3D artifacts or floaters; this is load-bearing for the 'near perfect boundaries' and 'best to date' claims.
  2. [§4] §4 (evaluation): the six metrics appear to be computed on final 2D rendered boundaries. The manuscript does not report 3D-specific checks such as multi-view coherence of extracted Gaussians, surface consistency after object removal, or re-rendering under perturbed semantic inputs, leaving open whether the reported gains reflect true 3D geometry improvement.
  3. [§3.2] §3.2 (second loss): bypassing rasterization for low-transmittance Gaussians allows direct parameter updates, but the paper provides no analysis of how this affects global scene consistency or introduces unmeasured floaters once the object is extracted; this directly impacts the claim of precise object extraction.
minor comments (2)
  1. [§3] Notation for the two losses should be introduced with explicit equations and variable definitions in §3 to improve readability.
  2. Figure captions could more explicitly annotate the boundary improvements relative to baselines.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive review and positive assessment of the work's potential significance and experimental scale. We address each major comment point-by-point below with honest responses and have revised the manuscript to incorporate additional analysis where needed.

read point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (loss definitions): the two losses directly ingest external semantic boundary labels to drive Gaussian position/covariance updates. No sensitivity analysis or perturbation experiments are reported to quantify how label noise, edge incompleteness, or cross-view inconsistency in the semantics propagates into new 3D artifacts or floaters; this is load-bearing for the 'near perfect boundaries' and 'best to date' claims.

    Authors: We agree that explicit sensitivity analysis to semantic label imperfections would strengthen the robustness claims. Our cross-dataset results already span annotations of varying quality, providing indirect support, but to directly address the concern we have added a new subsection with perturbation experiments (label noise, edge incompleteness, and cross-view inconsistency). These show graceful degradation without substantial new artifacts or floaters, supporting the boundary quality claims. revision: yes

  2. Referee: [§4] §4 (evaluation): the six metrics appear to be computed on final 2D rendered boundaries. The manuscript does not report 3D-specific checks such as multi-view coherence of extracted Gaussians, surface consistency after object removal, or re-rendering under perturbed semantic inputs, leaving open whether the reported gains reflect true 3D geometry improvement.

    Authors: We acknowledge that 2D boundary metrics alone leave room for questions about 3D consistency. The original manuscript already provides qualitative multi-view renderings demonstrating coherence. In the revision we have added quantitative 3D checks, including multi-view Gaussian position variance, surface consistency metrics post-extraction, and re-rendering results under perturbed semantics, confirming that the reported gains correspond to genuine 3D geometry improvements (one concrete form of such a check is sketched after these responses). revision: yes

  3. Referee: [§3.2] §3.2 (second loss): bypassing rasterization for low-transmittance Gaussians allows direct parameter updates, but the paper provides no analysis of how this affects global scene consistency or introduces unmeasured floaters once the object is extracted; this directly impacts the claim of precise object extraction.

    Authors: The second loss is intended to refine non-visible Gaussians precisely to avoid floaters upon extraction. While main results illustrate cleaner extraction, we agree an explicit consistency analysis is warranted. The revised manuscript includes a targeted ablation quantifying global scene consistency and floater counts before/after extraction, using both quantitative metrics and multi-view visual checks. revision: yes
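
One concrete form the multi-view coherence check proposed in response 2 could take, sketched under assumptions: rendered_masks are the extracted object's masks re-rendered per training view, and the IoU-spread statistic is illustrative rather than the authors' metric.

    import numpy as np

    def multiview_consistency(rendered_masks, semantic_masks):
        # IoU of the extracted object's rendered mask against each view's
        # semantic mask; a low spread across views suggests the extracted
        # geometry is 3D-consistent rather than tuned to a few viewpoints.
        ious = []
        for r, s in zip(rendered_masks, semantic_masks):
            inter = np.logical_and(r, s).sum()
            union = np.logical_or(r, s).sum()
            ious.append(inter / max(union, 1))
        return float(np.mean(ious)), float(np.std(ious))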

Circularity Check

0 steps flagged

No significant circularity; method and claims are self-contained

full rationale

The paper defines two new loss terms that directly incorporate provided semantic boundary labels to adjust Gaussian positions and covariances (one via rasterization gradients, one bypassing it for low-transmittance cases). These losses are constructed from external inputs and the standard 3DGS rendering pipeline rather than being fitted to or defined in terms of the final boundary metrics being reported. Performance claims rest on empirical comparisons against 12 independent baselines across 4 datasets and 6 metrics, with no reduction of the central result to a self-citation chain, ansatz smuggled via prior work, or renaming of known patterns. The derivation chain therefore remains independent of its own outputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Based solely on the abstract, the central claim rests on the introduction of two new loss terms that modify Gaussian geometry using semantic boundaries. The abstract does not spell out free parameters, axioms, or invented entities explicitly; the ledger below records what can be inferred.

free parameters (1)
  • Loss weighting factors
    Relative weights between the new losses and standard 3DGS terms are likely tuned hyperparameters, though not specified in the abstract.
axioms (1)
  • domain assumption: Semantic segmentation or boundary labels are available as input
    The losses explicitly rely on semantic information to define and enforce boundaries during optimization.

pith-pipeline@v0.9.0 · 5552 in / 1209 out tokens · 58723 ms · 2026-05-12T04:23:33.573796+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages

[1] Adobe Inc. Adobe Photoshop.
[2] Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, and Peter Hedman. Mip-NeRF 360: Unbounded anti-aliased neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5470–5479, 2022.
[3] Jiazhong Cen, Jiemin Fang, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, and Qi Tian. Segment any 3D Gaussians. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 1971–1979, 2025.
[4] Rohan Chacko, Nicolai Häni, Eldar Khaliullin, Lin Sun, and Douglas Lee. Lifting by Gaussians: A simple, fast and flexible method for 3D instance segmentation. In 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 3497–3507. IEEE, 2025.
[5] Brian Chao, Hung-Yu Tseng, Lorenzo Porzi, Chen Gao, Tuotuo Li, Qinbo Li, Ayush Saraf, Jia-Bin Huang, Johannes Kopf, Gordon Wetzstein, et al. Textured Gaussians for enhanced 3D scene appearance modeling. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 8964–8974, 2025.
[6] Minghao Chen, Iro Laina, and Andrea Vedaldi. DGE: Direct Gaussian 3D editing by consistent multi-view editing. In European Conference on Computer Vision, pages 74–92. Springer, 2024.
[7] Yiwen Chen, Zilong Chen, Chi Zhang, Feng Wang, Xiaofeng Yang, Yikai Wang, Zhongang Cai, Lei Yang, Huaping Liu, and Guosheng Lin. GaussianEditor: Swift and controllable 3D editing with Gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21476–21485, 2024.
[8] Bowen Cheng, Ross Girshick, Piotr Dollár, Alexander C. Berg, and Alexander Kirillov. Boundary IoU: Improving object-centric image segmentation evaluation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 15334–15342, 2021.
[9] Seokhun Choi, Hyeonseop Song, Jaechul Kim, Taehyeong Kim, and Hoseok Do. Click-Gaussian: Interactive segmentation to any 3D Gaussians. In European Conference on Computer Vision, pages 289–305. Springer, 2024.
[10] Mahtab Dahaghin, Milind G. Padalkar, Matteo Toso, and Alessio Del Bue. SplatFill: 3D scene inpainting via depth-guided Gaussian splatting. arXiv preprint arXiv:2509.07809.
[11] Sanjoy Dasgupta and Samory Kpotufe. Optimal rates for k-NN density and mode estimation. In Advances in Neural Information Processing Systems. Curran Associates, Inc.
[12] Antoine Guédon and Vincent Lepetit. Gaussian Frosting: Editable complex radiance fields with real-time rendering. In European Conference on Computer Vision, pages 413–430. Springer, 2024.
[13] Xu Hu, Yuxi Wang, Lue Fan, Chuanchen Luo, Junsong Fan, Zhen Lei, Qing Li, Junran Peng, and Zhaoxiang Zhang. SAGD: Boundary-enhanced segment anything in 3D Gaussian via Gaussian decomposition. arXiv preprint arXiv:2401.17857, 2024.
[14] Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, and Shenghua Gao. 2D Gaussian splatting for geometrically accurate radiance fields. In ACM SIGGRAPH 2024 Conference Papers, pages 1–11, 2024.
[15] Sheng-Yu Huang, Zi-Ting Chou, and Yu-Chiang Frank Wang. 3D Gaussian inpainting with depth-guided cross-view consistency. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pages 26704–26713.
[16] Vishnu Jaganathan, Hannah Hanyun Huang, Muhammad Zubair Irshad, Varun Jampani, Amit Raj, and Zsolt Kira. ICE-G: Image conditional editing of 3D Gaussian splats. arXiv preprint arXiv:2406.08488, 2024.
[17] Umangi Jain, Ashkan Mirzaei, and Igor Gilitschenski. GaussianCut: Interactive segmentation via graph cut for 3D Gaussian splatting. Advances in Neural Information Processing Systems, 37:89184–89212, 2024.
[18] SungMin Jang and Wonjun Kim. Identity-aware language Gaussian splatting for open-vocabulary 3D semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 20467–20476, 2025.
[19] Kim Jun-Seong, GeonU Kim, Kim Yu-Ji, Yu-Chiang Frank Wang, Jaesung Choe, and Tae-Hyun Oh. Dr. Splat: Directly referring 3D Gaussian splatting via direct language embedding registration. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pages 14137–14146, 2025.
[20] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, George Drettakis, et al. 3D Gaussian splatting for real-time radiance field rendering. ACM Trans. Graph., 42(4):139–1.
[21] Justin Kerr, Chung Min Kim, Ken Goldberg, Angjoo Kanazawa, and Matthew Tancik. LERF: Language embedded radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 19729–19739.
[22] Shakiba Kheradmand, Daniel Rebain, Gopal Sharma, Weiwei Sun, Yang-Che Tseng, Hossam Isack, Abhishek Kar, Andrea Tagliasacchi, and Kwang Moo Yi. 3D Gaussian splatting as Markov chain Monte Carlo. Advances in Neural Information Processing Systems, 37:80965–80986, 2024.
[23] Hayeon Kim, Ji Ha Jang, and Se Young Chun. Robust 3D-masked part-level editing in 3D Gaussian splatting with regularized score distillation sampling. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5501–5510, 2025.
[24] Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, et al. Segment anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4015–4026, 2023.
[25] Kunhao Liu, Fangneng Zhan, Jiahui Zhang, Muyu Xu, Yingchen Yu, Abdulmotaleb El Saddik, Christian Theobalt, Eric Xing, and Shijian Lu. Weakly supervised 3D open-vocabulary segmentation. Advances in Neural Information Processing Systems, 36:53433–53456, 2023.
[26] Don O. Loftsgaarden and Charles P. Quesenberry. A nonparametric estimate of a multivariate density function. Annals of Mathematical Statistics, 36:1049–1051, 1965.
[27] Tao Lu, Mulin Yu, Linning Xu, Yuanbo Xiangli, Limin Wang, Dahua Lin, and Bo Dai. Scaffold-GS: Structured 3D Gaussians for view-adaptive rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20654–20664, 2024.
[28] Weijie Lyu, Xueting Li, Abhijit Kundu, Yi-Hsuan Tsai, and Ming-Hsuan Yang. Gaga: Group any Gaussians via 3D-aware memory bank. arXiv preprint arXiv:2404.07977, 2024.
[29] Alessio Mazzucchelli, Ivan Ojeda-Martin, Fernando Rivas-Manzaneque, Elena Garces, Adrian Penate-Sanchez, and Francesc Moreno-Noguer. Virgi: View-dependent instant recoloring of 3D Gaussians splats. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2026.
[30] Ben Mildenhall, Pratul P. Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, and Abhishek Kar. Local light field fusion: Practical view synthesis with prescriptive sampling guidelines. ACM Transactions on Graphics (ToG), 38(4):1–14, 2019.
[31] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. NeRF: Representing scenes as neural radiance fields for view synthesis. In European Conference on Computer Vision, pages 405–421. Springer, 2020.
[32] Ashkan Mirzaei, Riccardo De Lutio, Seung Wook Kim, David Acuna, Jonathan Kelly, Sanja Fidler, Igor Gilitschenski, and Zan Gojcic. RefFusion: Reference adapted diffusion models for 3D scene inpainting. arXiv preprint arXiv:2404.10765, 2024.
[33] Nicolas Moenne-Loccoz, Ashkan Mirzaei, Or Perel, Riccardo De Lutio, Janick Martinez Esturo, Gavriel State, Sanja Fidler, Nicholas Sharp, and Zan Gojcic. 3D Gaussian ray tracing: Fast tracing of particle scenes. ACM Transactions on Graphics (TOG), 43(6):1–19, 2024.
[34] Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. Instant neural graphics primitives with a multiresolution hash encoding. ACM Transactions on Graphics (TOG), 41(4):1–15, 2022.
[35] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An imperative style, high-performance deep learning library, 2019.
[36] Minghan Qin, Wanhua Li, Jiawei Zhou, Haoqian Wang, and Hanspeter Pfister. LangSplat: 3D language Gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20051–20060, 2024.
[37] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763. PMLR, 2021.
[38] Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloe Rolland, Laura Gustafson, Eric Mintun, Junting Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao-Yuan Wu, Ross Girshick, Piotr Dollár, and Christoph Feichtenhofer. SAM 2: Segment anything in images and videos. In The Thirteenth International Conference on Learning Representations.
[39] Zhongzheng Ren, Aseem Agarwala, Bryan Russell, Alexander G. Schwing, and Oliver Wang. Neural volumetric object selection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6133–6142.
[40] Hongyu Shen, Junfeng Ni, Yixin Chen, Weishuo Li, Mingtao Pei, and Siyuan Huang. Trace3D: Consistent segmentation lifting via Gaussian instance tracing. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 6656–6666, 2025.
[41] Qiuhong Shen, Xingyi Yang, and Xinchao Wang. FlashSplat: 2D to 3D Gaussian splatting segmentation solved optimally. In European Conference on Computer Vision, pages 456–472. Springer, 2024.
[42] Jin-Chuan Shi, Miao Wang, Hao-Bin Duan, and Shao-Hua Guan. Language embedded 3D Gaussians for open-vocabulary scene understanding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5333–5343, 2024.
[43] Yuxin Wang, Qianyi Wu, Guofeng Zhang, and Dan Xu. Learning 3D geometry and feature consistent Gaussian splatting for object removal. In European Conference on Computer Vision, pages 1–17. Springer, 2024.
[44] Yuxuan Wang, Xuanyu Yi, Zike Wu, Na Zhao, Long Chen, and Hanwang Zhang. View-consistent 3D editing with Gaussian splatting. In European Conference on Computer Vision, pages 404–420. Springer, 2024.
[45] Zhaonan Wang, Manyi Li, and Changhe Tu. Ag2aussian: Anchor-graph structured Gaussian splatting for instance-level 3D scene understanding and editing. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 26806–26816, 2025.
[46] Jing Wu, Jia-Wang Bian, Xinghui Li, Guangrun Wang, Ian Reid, Philip Torr, and Victor Adrian Prisacariu. GaussCtrl: Multi-view consistent text-driven 3D Gaussian splatting editing. In European Conference on Computer Vision, pages 55–.
[47] Qi Wu, Janick Martinez Esturo, Ashkan Mirzaei, Nicolas Moenne-Loccoz, and Zan Gojcic. 3DGUT: Enabling distorted cameras and secondary rays in Gaussian splatting. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 26036–26046, 2025.
[48] Yanmin Wu, Jiarui Meng, Haijie Li, Chenming Wu, Yahao Shi, Xinhua Cheng, Chen Zhao, Haocheng Feng, Errui Ding, Jingdong Wang, et al. OpenGaussian: Towards point-level 3D Gaussian-based open vocabulary understanding. Advances in Neural Information Processing Systems, 37:19114–19138.
[49] Zesong Yang, Bangbang Yang, Wenqi Dong, Chenxuan Cao, Liyuan Cui, Yuewen Ma, Zhaopeng Cui, and Hujun Bao. InstaScene: Towards complete 3D instance decomposition and reconstruction from cluttered scenes. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 7771–7781, 2025.
[50] Mingqiao Ye, Martin Danelljan, Fisher Yu, and Lei Ke. Gaussian grouping: Segment and edit anything in 3D scenes. In European Conference on Computer Vision, pages 162–179. Springer, 2024.
[51] Hongjia Zhai, Hai Li, Zhenzhe Li, Xiaokun Pan, Yijia He, and Guofeng Zhang. PanoGS: Gaussian-based panoptic segmentation for 3D open vocabulary scene understanding. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pages 14114–14124, 2025.
[52] Jiaxin Zhang, Junjun Jiang, Youyu Chen, Kui Jiang, and Xianming Liu. COB-GS: Clear object boundaries in 3DGS segmentation based on boundary-adaptive Gaussian splitting. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pages 19335–19344, 2025.
[53] Qihang Zhang, Yinghao Xu, Chaoyang Wang, Hsin-Ying Lee, Gordon Wetzstein, Bolei Zhou, and Ceyuan Yang. 3DitScene: Editing any scene via language-guided disentangled Gaussian splatting. In The Thirteenth International Conference on Learning Representations, 2025.
[54] Jun Zhou, Dinghao Li, Nannan Li, and Mingjie Wang. High-fidelity 3D Gaussian inpainting: Preserving multi-view consistency and photorealistic details. Computers & Graphics, page 104362, 2025.
[55] Shijie Zhou, Haoran Chang, Sicheng Jiang, Zhiwen Fan, Zehao Zhu, Dejia Xu, Pradyumna Chari, Suya You, Zhangyang Wang, and Achuta Kadambi. Feature 3DGS: Supercharging 3D Gaussian splatting to enable distilled feature fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21676–21685, 2024.
[56] Runsong Zhu, Shi Qiu, Zhengzhe Liu, Ka-Hei Hui, Qianyi Wu, Pheng-Ann Heng, and Chi-Wing Fu. Rethinking end-to-end 2D to 3D scene segmentation in Gaussian splatting. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pages 3656–3665, 2025.
[57] Ruijie Zhu, Mulin Yu, Linning Xu, Lihan Jiang, Yixuan Li, Tianzhu Zhang, Jiangmiao Pang, and Bo Dai. ObjectGS: Object-aware scene reconstruction and scene understanding via Gaussian splatting. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 8350–8360, 2025.
[58] Xingxing Zuo, Pouya Samangouei, Yunwen Zhou, Yan Di, and Mingyang Li. FMGS: Foundation model embedded 3D Gaussian splatting for holistic 3D scene understanding. International Journal of Computer Vision, 133(2):611–627, 2025.