Pith · machine review for the scientific record

arxiv: 2605.07203 · v2 · submitted 2026-05-08 · 💻 cs.CV

Recognition: no theorem link

From Pixels to Primitives: Scene Change Detection in 3D Gaussian Splatting

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 04:27 UTC · model grok-4.3

classification 💻 cs.CV
keywords: scene change detection · 3D Gaussian splatting · primitive attributes · multi-view consistency · geometric change · appearance change · computer vision

The pith

Scene changes can be detected directly from 3D Gaussian primitive attributes without rendering to images.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that native attributes of 3D Gaussians—position, anisotropic covariance, and color—contain enough information to identify scene changes. It models the differences that arise when the same scene is reconstructed independently, using anisotropic drift models for geometry and photometry plus a per-primitive observability term. This produces change maps that are consistent across views by design and that separate geometric shifts from appearance shifts. A reader would care because the approach avoids the extra steps and ambiguities of render-then-compare pipelines while revealing both the location and the type of change.
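The core idea can be sketched in a few lines. This is a hypothetical simplification, not the paper's actual GS-DIFF formulation: the function `change_scores`, the Euclidean nearest-neighbor matching, and the pooled-covariance distance are illustrative stand-ins for the drift-aware kernels and inflated-covariance neighbor sets the paper describes.

```python
# Hypothetical sketch of primitive-space change scoring (not the paper's exact
# GS-DIFF method): each Gaussian carries a position, a covariance, and a color,
# and change is scored against the nearest primitive in the other scene.
import numpy as np

def change_scores(mu_a, cov_a, col_a, mu_b, cov_b, col_b):
    """Per-primitive geometric and appearance change of scene A against scene B.

    mu_*: (N, 3) centers; cov_*: (N, 3, 3) covariances; col_*: (N, 3) colors.
    """
    geo, app = [], []
    for i in range(len(mu_a)):
        # Nearest cross-scene neighbor by center distance (a simplification;
        # the paper gates neighbors with inflated covariances instead).
        j = int(np.argmin(np.linalg.norm(mu_b - mu_a[i], axis=1)))
        d = mu_b[j] - mu_a[i]
        # Anisotropic, Mahalanobis-style geometric drift: the same displacement
        # counts less along a primitive's elongated axis than across its thin one.
        pooled = cov_a[i] + cov_b[j]
        geo.append(float(d @ np.linalg.solve(pooled, d)))
        # Photometric drift as a plain color residual against the matched primitive.
        app.append(float(np.linalg.norm(col_b[j] - col_a[i])))
    return np.array(geo), np.array(app)
```

The anisotropy is the point: a displacement that lies along a primitive's own elongated axis scores low geometric change, which is exactly the tolerance needed to absorb optimization-induced drift without masking real motion.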

Core claim

We provide direct evidence that native primitive attributes alone—position, anisotropic covariance, and color—carry sufficient signal for scene change detection. We address the under-constrained nature of independent Gaussian optimizations with anisotropic models of geometric and photometric drift, complemented by a per-primitive observability term. Our method, GS-DIFF, yields change maps that are multi-view consistent by construction and scores geometric and appearance changes separately without supervision or external models.

What carries the argument

Anisotropic geometric and photometric drift models together with a per-primitive observability term that accounts for how well each Gaussian is constrained by the input views.

Load-bearing premise

The under-constrained nature of independent Gaussian optimizations can be adequately captured by the introduced anisotropic geometric and photometric drift models together with the per-primitive observability term, without introducing systematic bias in change scoring.

What would settle it

Optimizing two separate 3D Gaussian Splatting reconstructions on images of an identical static scene and checking whether the method reports near-zero change scores or still flags spurious differences.
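That falsification test can be prototyped on synthetic data before touching real reconstructions. Everything here is assumed for illustration: jittered point sets stand in for independently optimized 3DGS models, and a plain nearest-neighbor residual stands in for GS-DIFF's drift-aware scoring.

```python
# Synthetic version of the static-scene falsification test: two independent
# "reconstructions" of the same scene are simulated by jittering one
# ground-truth point set twice. A sound primitive-space scorer should report
# much lower change for the static pair than for a pair with primitives removed.
import numpy as np

rng = np.random.default_rng(0)

def nn_residual(a, b):
    """Mean nearest-neighbor distance from set a into set b (a crude stand-in
    for a real change score; GS-DIFF would use anisotropic drift-aware kernels)."""
    return float(np.mean([np.min(np.linalg.norm(b - p, axis=1)) for p in a]))

scene = rng.uniform(0, 10, size=(500, 3))          # ground-truth static scene
recon1 = scene + rng.normal(0, 0.02, scene.shape)  # optimization-induced drift
recon2 = scene + rng.normal(0, 0.02, scene.shape)  # an independent re-run
changed = recon2[100:]                             # 100 primitives removed

static_score = nn_residual(recon1, recon2)   # should stay near the jitter scale
change_score = nn_residual(recon1, changed)  # should be clearly larger
```

If the static pair scored as high as the changed pair, the drift model would be flagging optimization noise as change, which is precisely the failure mode the test probes.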

Figures

Figures reproduced from arXiv: 2605.07203 by Chamuditha Jayanga Galappaththige, Dimity Miller, Donald Dansereau, Jason Lai, Niko Suenderhauf, Timothy Patten.

Figure 1. From pixels to primitives. Prior multi-view SCD methods question …
Figure 2. The GS-DIFF pipeline: We model the expected geometric and photometric drift between 3DGS representations, using the inflated covariance of each primitive to find its cross-scene neighbor set. A geometric kernel and appearance kernel evaluate change over the neighbor set to compute drift-aware change scores. These change scores are combined and weighted by observation uncertainty, and can then be rendered a…
Figure 3. Qualitative results on PASLCD [13]. For each scene: rendered views from reference G(1) (left), inference G(2) (center), and change map M (right). GS-DIFF produces sharply localized responses to both structural and surface-level changes across indoor and outdoor scenes.
Figure 4. Sensitivity of GS-DIFF to data-driven quantile choices: (a) representation-ambiguity drift scales, (b) appearance bandwidth, (c) observability-weighting reference. We sweep each quantile from 0.05 to 0.95 in 0.05 steps on PASLCD; the green dotted line marks our a priori principled choice (upper quartile, median, lower quartile, respectively). All three choices sit within 0.01 mIoU of th…
Figure 5. Qualitative results on PASLCD [13]. For each scene: rendered views from reference G(1) (top-left), inference G(2) (top-right), change maps for each individual scene M1 (bottom-left) and M2 (bottom-right), and final change map M (bottom-center). GS-DIFF produces sharply localized responses to both structural and surface-level changes across indoor and outdoor scenes. Our bidirectional scoring of cha…
Figure 6. Qualitative comparison with O-SCD [15], the strongest image-space baseline. GS-DIFF produces more geometrically accurate change masks for two reasons. (1) Patch granularity: O-SCD and other image-space approaches compare features from an external foundation model [39] at its patch size (14 × 14 or 16 × 16 pixels), which sets a floor on spatial resolution; primitive-space comparison is limited only by th…
Figure 7. Qualitative results on kernel scores and disambiguation routing (§3.2.3). Left to right: …
Original abstract

Scene change detection methods built on Gaussian splatting universally follow a render-then-compare paradigm: the pre-change scene is rendered into 2D and compared against post-change images via pixel or feature residuals. This change detection problem with Gaussian Splatting has been treated as a question about pixels; we treat it as a question about primitives. We provide direct evidence that native primitive attributes alone -- position, anisotropic covariance, and color -- carry sufficient signal for scene change detection. What makes primitive-space comparison hard is the under-constrained nature of Gaussian splatting representation: independent optimizations yield primitive solutions whose count, positions, shapes, and colors differ even where nothing has changed. We address this challenge with anisotropic models of geometric and photometric drift, complemented by a per-primitive observability term that reflects the extent to which each Gaussian is constrained by the camera geometry. Operating directly on primitives gives our method, GD-DIFF, two properties that distinguish it from render-then-compare methods. First, change maps are multi-view consistent by construction, where prior work had to learn this through an additional optimization objective. Second, geometric and appearance changes are scored separately, identifying not just where but what kind of change occurred, distinguishing structural changes (e.g., an added object) from surface-level ones (e.g., a color change) without supervision or external model dependencies. On real-world benchmarks, GS-DIFF surpasses the prior state-of-the-art approach by $\sim$17% in mean Intersection over Union.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces GS-DIFF (also referred to as GD-DIFF), a scene change detection method that operates directly on 3D Gaussian Splatting primitives rather than rendered 2D images. It claims that native attributes—position, anisotropic covariance, and color—carry sufficient signal for detecting changes once the under-constrained nature of independent per-scene optimizations is addressed via anisotropic models of geometric and photometric drift plus a per-primitive observability term derived from camera geometry. This yields multi-view consistent change maps by construction and allows separate scoring of geometric versus appearance changes without supervision. On real-world benchmarks the method reports an approximately 17% gain in mean Intersection over Union over prior render-then-compare state-of-the-art approaches.

Significance. If the empirical claims and modeling assumptions hold, the work is significant because it shifts scene change detection from pixel-space residuals to direct primitive-space comparison, providing inherent multi-view consistency without an auxiliary optimization objective and enabling unsupervised distinction between structural and surface-level changes. The explicit modeling of optimization-induced drift and observability is a constructive contribution that could improve interpretability and efficiency in downstream tasks such as robotics and augmented reality. The paper also supplies falsifiable predictions through its separate geometric and photometric change scores.

major comments (3)
  1. [Abstract] Abstract: The central empirical claim that GS-DIFF surpasses prior SOTA by ∼17% mIoU is load-bearing for the contribution, yet the abstract (and by extension the experimental evaluation) provides no details on the specific baselines, error bars, data splits, or ablation studies isolating the anisotropic drift models and observability term. Without these, it is impossible to verify whether the reported gains arise from the primitive-space formulation or from other factors.
  2. [Method] Method (drift and observability models): The anisotropic geometric/photometric drift models and per-primitive observability term are introduced to neutralize variability from independent Gaussian optimizations, but no theoretical bound, false-positive analysis on static scenes, or validation against misspecification of the drift distributions is supplied. If these models correlate with scene structure rather than purely with optimization artifacts, change scores will be systematically biased even in the absence of real change.
  3. [Experiments] Experimental evaluation: The claim that native primitive attributes alone suffice for change detection rests on the assumption that the introduced drift and observability components fully resolve non-corresponding primitive sets; however, the manuscript does not report ablations removing these components or quantitative checks that the observability term does not inadvertently encode actual scene geometry.
minor comments (2)
  1. [Abstract] The method is referred to as both GD-DIFF and GS-DIFF; a single consistent name should be used throughout.
  2. [Method] The computation of the per-primitive observability term from camera geometry should be stated with an explicit equation or pseudocode for reproducibility.
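Since the paper's exact equation is not quoted in this review, here is one hypothetical construction of such a term, with every name and threshold invented for illustration: observability as the fraction of cameras that see a primitive, scaled by how widely the viewing directions are spread.

```python
# Hypothetical per-primitive observability from camera geometry (not the
# paper's definition): a point seen by many cameras from well-spread directions
# is better constrained than one seen by few cameras, or from one side only.
import numpy as np

def observability(point, cam_centers, cam_forwards, fov_cos=0.5):
    """Score in [0, 1]: view coverage times angular spread.

    cam_centers: (M, 3) camera positions; cam_forwards: (M, 3) unit forward
    axes. The frustum is crudely approximated as a cone whose half-angle is
    arccos(fov_cos).
    """
    dirs = []
    for c, f in zip(cam_centers, cam_forwards):
        v = point - c
        v = v / np.linalg.norm(v)
        if v @ f >= fov_cos:          # point lies inside this camera's cone
            dirs.append(v)
    if not dirs:
        return 0.0
    dirs = np.array(dirs)
    coverage = len(dirs) / len(cam_centers)
    # Spread: 1 - |mean viewing direction|, 0 when all views coincide and
    # approaching 1 when views surround the point.
    spread = 1.0 - np.linalg.norm(dirs.mean(axis=0))
    return coverage * (0.5 + 0.5 * spread)
```

Two opposing cameras looking at a point between them give the maximal score; a point outside every frustum gets zero and should contribute nothing to change evidence.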

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, providing clarifications and committing to revisions that strengthen the manuscript without altering its core claims.

Point-by-point responses
  1. Referee: [Abstract] Abstract: The central empirical claim that GS-DIFF surpasses prior SOTA by ∼17% mIoU is load-bearing for the contribution, yet the abstract (and by extension the experimental evaluation) provides no details on the specific baselines, error bars, data splits, or ablation studies isolating the anisotropic drift models and observability term. Without these, it is impossible to verify whether the reported gains arise from the primitive-space formulation or from other factors.

    Authors: We agree that the abstract is concise and would benefit from more context for immediate verifiability. The specific baselines (render-then-compare SOTA methods), data splits, error bars over multiple runs, and ablation results isolating the drift and observability components are fully detailed in Section 4 and the supplementary material. We will revise the abstract to explicitly name the primary baseline and reference the evaluation protocol, while noting that gains are measured in mIoU with standard deviations. This will clarify that improvements derive from the primitive-space formulation with the proposed drift and observability modeling. revision: yes

  2. Referee: [Method] Method (drift and observability models): The anisotropic geometric/photometric drift models and per-primitive observability term are introduced to neutralize variability from independent Gaussian optimizations, but no theoretical bound, false-positive analysis on static scenes, or validation against misspecification of the drift distributions is supplied. If these models correlate with scene structure rather than purely with optimization artifacts, change scores will be systematically biased even in the absence of real change.

    Authors: The drift models are empirically derived from observed variability in independent optimizations of identical static scenes (Section 3.2), and the observability term is computed directly from camera geometry and Gaussian projection. While we do not derive a formal theoretical bound (due to the non-convex nature of Gaussian Splatting optimization), we include empirical false-positive analysis on static scenes demonstrating low rates. To address potential correlation with scene structure, we will add a new subsection with quantitative sensitivity analysis and visualizations showing that drift parameters align with optimization degrees of freedom rather than scene content. Misspecification validation via parameter perturbation will also be included. revision: partial

  3. Referee: [Experiments] Experimental evaluation: The claim that native primitive attributes alone suffice for change detection rests on the assumption that the introduced drift and observability components fully resolve non-corresponding primitive sets; however, the manuscript does not report ablations removing these components or quantitative checks that the observability term does not inadvertently encode actual scene geometry.

    Authors: We concur that explicit ablations would provide stronger isolation of each component's contribution. The current results demonstrate the overall performance advantage, but direct removal of drift modeling or the observability term is not tabulated in the main text. In the revision, we will add these ablations to Section 4, reporting mIoU drops for each variant, along with quantitative checks (e.g., Pearson correlation between observability scores and geometric features like surface normals) confirming that the term primarily encodes visibility constraints from camera geometry rather than scene structure itself. revision: yes
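The correlation check the authors commit to is mechanically simple. The sketch below uses synthetic stand-in arrays (per-primitive observability scores against an unrelated geometric feature); real per-primitive values from the method would replace them.

```python
# Sketch of the proposed sanity check: if the observability term encodes
# camera visibility rather than scene geometry, its Pearson correlation with
# a geometric feature (surface-normal z, as a hypothetical example) should be
# near zero. Independent synthetic draws stand in for real per-primitive data.
import numpy as np

rng = np.random.default_rng(1)
obs = rng.uniform(0, 1, 1000)        # per-primitive observability scores
normal_z = rng.uniform(-1, 1, 1000)  # an unrelated geometric feature

r = np.corrcoef(obs, normal_z)[0, 1]  # Pearson correlation coefficient
# For independent draws, |r| is expected to be small (roughly 1/sqrt(N)).
```

A large |r| on real data would indicate the term leaks scene structure into the change scores, which is the bias the referee warns about.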

Circularity Check

0 steps flagged

No significant circularity; new drift models and observability term introduced independently of target labels.

Full rationale

The paper's central derivation introduces anisotropic models of geometric and photometric drift plus a per-primitive observability term explicitly to handle under-constrained independent optimizations in Gaussian splatting. These components are motivated directly from the stated problem of non-corresponding primitives across optimizations and are not shown to be fitted from or equivalent to the downstream change detection labels. No self-citation chains, uniqueness theorems from prior author work, or renamings of known results are invoked as load-bearing steps in the provided text. The performance claim (∼17% mIoU gain) is presented as an empirical outcome on benchmarks rather than a mathematical reduction by construction. The derivation chain remains self-contained with independent modeling choices.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The central claim rests on the domain assumption that primitive attributes contain change signal despite optimization variability, plus two newly introduced modeling components whose independent evidence is not supplied in the abstract.

axioms (1)
  • domain assumption Independent optimizations of Gaussian Splatting yield primitive solutions with differing count, positions, shapes, and colors even for unchanged scenes.
    Explicitly stated as the core challenge making primitive-space comparison hard.
invented entities (2)
  • anisotropic models of geometric and photometric drift no independent evidence
    purpose: Capture variations in primitive attributes across independent optimizations
    Introduced to address under-constrained representation; no independent evidence supplied.
  • per-primitive observability term no independent evidence
    purpose: Reflect the extent to which each Gaussian is constrained by camera geometry
    New term added to complement drift models; no independent evidence supplied.

pith-pipeline@v0.9.0 · 5593 in / 1329 out tokens · 55269 ms · 2026-05-12T04:27:38.357218+00:00 · methodology


Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages · 2 internal anchors

  1. [1]

    CL-Splats: Continual learning of Gaussian splatting with local optimization

    Jan Ackermann, Jonas Kulhanek, Shengqu Cai, Xu Haofei, Marc Pollefeys, Gordon Wetzstein, Leonidas Guibas, and Songyou Peng. CL-Splats: Continual learning of Gaussian splatting with local optimization. In IEEE/CVF International Conference on Computer Vision, 2025

  2. [2]

    Street-view change detection with deconvolutional networks

    Pablo F Alcantarilla, Simon Stent, German Ros, Roberto Arroyo, and Riccardo Gherardi. Street-view change detection with deconvolutional networks. Autonomous Robots, 42(7):1301–1322, 2018

  3. [3]

    EMPLACE: Self-supervised urban scene change detection

    Tim Alpherts, Sennay Ghebreab, and Nanne van Noord. EMPLACE: Self-supervised urban scene change detection. In AAAI Conference on Artificial Intelligence, pages 1737–1745, 2025

  4. [4]

    The balanced accuracy and its posterior distribution

    Kay Henning Brodersen, Cheng Soon Ong, Klaas Enno Stephan, and Joachim M. Buhmann. The balanced accuracy and its posterior distribution. In 20th International Conference on Pattern Recognition (ICPR), pages 3121–3124. IEEE, 2010

  5. [5]

    PGSR: Planar-based Gaussian splatting for efficient and high-fidelity surface reconstruction

    Danpeng Chen, Hai Li, Weicai Ye, Yifan Wang, Weijian Xie, Shangjin Zhai, Nan Wang, Haomin Liu, Hujun Bao, and Guofeng Zhang. PGSR: Planar-based Gaussian splatting for efficient and high-fidelity surface reconstruction. IEEE Transactions on Visualization and Computer Graphics, 31(9):6100–6111, 2024

  6. [6]

    GI-GS: Global illumination decomposition on gaussian splatting for inverse rendering

    Hongze Chen, Zehong Lin, and Jun Zhang. GI-GS: Global illumination decomposition on gaussian splatting for inverse rendering. In The Thirteenth International Conference on Learning Representations, 2025

  7. [7]

    Zero-shot scene change detection

    Kyusik Cho, Dong Yeop Kim, and Euntai Kim. Zero-shot scene change detection. In AAAI Conference on Artificial Intelligence, pages 2509–2517, 2025

  8. [8]

    Gaussian heritage: 3D digitization of cultural heritage with integrated object segmentation

    Mahtab Dahaghin, Myrna Castillo, Kourosh Riahidehkordi, Matteo Toso, and Alessio Del Bue. Gaussian heritage: 3D digitization of cultural heritage with integrated object segmentation. In European Conference on Computer Vision Workshops, 2024

  9. [9]

    Fully convolutional siamese networks for change detection

    Rodrigo Caye Daudt, Bertrand Le Saux, and Alexandre Boulch. Fully convolutional siamese networks for change detection. In IEEE International Conference on Image Processing, pages 4063–4067, 2018

  10. [10]

    Siamese KPConv: 3D multiple change detection from raw point clouds using deep learning

    Iris de Gélis, Sébastien Lefèvre, and Thomas Corpetti. Siamese KPConv: 3D multiple change detection from raw point clouds using deep learning. ISPRS Journal of Photogrammetry and Remote Sensing, 197:274–291, 2023

  11. [11]

    GOLDILOCS: General object-level detection and labeling of changes in scenes

    Almog Friedlander, Ariel Shamir, and Ohad Fried. GOLDILOCS: General object-level detection and labeling of changes in scenes. In International Conference on Learning Representations, 2026

  12. [12]

    Towards generalizing to unseen domains with few labels

    Chamuditha Jayanga Galappaththige, Sanoojan Baliah, Malitha Gunawardhana, and Muhammad Haris Khan. Towards generalizing to unseen domains with few labels. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 23691–23700, 2024

  13. [13]

    Multi-view pose-agnostic change localization with zero labels

    Chamuditha Jayanga Galappaththige, Jason Lai, Lloyd Windrim, Donald G. Dansereau, Niko Suenderhauf, and Dimity Miller. Multi-view pose-agnostic change localization with zero labels. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

  14. [14]

    Predictive photometric uncertainty in gaussian splatting for novel view synthesis

    Chamuditha Jayanga Galappaththige, Thomas Gottwald, Peter Stehr, Edgar Heinert, Niko Suenderhauf, Dimity Miller, and Matthias Rottmann. Predictive photometric uncertainty in gaussian splatting for novel view synthesis. arXiv preprint arXiv:2603.22786, 2026

  15. [15]

    Changes in real time: Online scene change detection with multi-view fusion

    Chamuditha Jayanga Galappaththige, Jason Lai, Lloyd Windrim, Donald G. Dansereau, Niko Suenderhauf, and Dimity Miller. Changes in real time: Online scene change detection with multi-view fusion. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2026

  16. [16]

    Relightable 3d gaussians: Realistic point cloud relighting with brdf decomposition and ray tracing

    Jian Gao, Chun Gu, Youtian Lin, Zhihao Li, Hao Zhu, Xun Cao, Li Zhang, and Yao Yao. Relightable 3d gaussians: Realistic point cloud relighting with brdf decomposition and ray tracing. In European Conference on Computer Vision, pages 73–89. Springer, 2024

  17. [17]

    Digital buildings analysis: 3D modeling, GIS integration, and visual descriptions using Gaussian splatting, ChatGPT/DeepSeek, and Google maps platform

    Kyle Gao, Dening Lu, Liangzhi Li, Nan Chen, Hongjie He, Linlin Xu, and Jonathan Li. Digital buildings analysis: 3D modeling, GIS integration, and visual descriptions using Gaussian splatting, ChatGPT/DeepSeek, and Google maps platform. IEEE Geoscience and Remote Sensing Letters, 2025

  18. [18]

    A fast voxel-based indicator for change detection using low resolution octrees

    Joachim Gehrung, Marcus Hebel, Michael Arens, and Uwe Stilla. A fast voxel-based indicator for change detection using low resolution octrees. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 4:357–364, 2019

  19. [19]

    Change detection on point cloud data acquired with a ground laser scanner

    Daniel Girardeau-Montaut, Michel Roux, Raphaël Marc, and Guillaume Thibault. Change detection on point cloud data acquired with a ground laser scanner. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 36, 2005

  20. [20]

    Sugar: Surface-aligned gaussian splatting for efficient 3d mesh reconstruction and high-quality mesh rendering

    Antoine Guédon and Vincent Lepetit. Sugar: Surface-aligned gaussian splatting for efficient 3d mesh reconstruction and high-quality mesh rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5354–5363, 2024

  21. [21]

    In search of lost domain generalization

    Ishaan Gulrajani and David Lopez-Paz. In search of lost domain generalization. arXiv preprint arXiv:2007.01434, 2020

  22. [22]

    Multiple View Geometry in Computer Vision

    Richard Hartley and Andrew Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, New York, NY, USA, 2nd edition, 2003

  23. [23]

    2d gaussian splatting for geometrically accurate radiance fields

    Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, and Shenghua Gao. 2d gaussian splatting for geometrically accurate radiance fields. In SIGGRAPH 2024 Conference Papers. Association for Computing Machinery, 2024

  24. [24]

    Aleatoric and epistemic uncertainty in machine learning: An introduction to concepts and methods

    Eyke Hüllermeier and Willem Waegeman. Aleatoric and epistemic uncertainty in machine learning: An introduction to concepts and methods. Machine Learning, 110(3):457–506, 2021

  25. [25]

    Gaussian difference: Find any change instance in 3D scenes

    Binbin Jiang, Rui Huang, Qingyi Zhao, and Yuxiang Zhang. Gaussian difference: Find any change instance in 3D scenes. In IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 1–5, 2025

  26. [26]

    ZeroSCD: Zero-shot street scene change detection

    Shyam Sundar Kannan and Byung-Cheol Min. ZeroSCD: Zero-shot street scene change detection. In IEEE International Conference on Robotics and Automation, pages 4665–4671, 2025

  27. [27]

    What uncertainties do we need in Bayesian deep learning for computer vision?

    Alex Kendall and Yarin Gal. What uncertainties do we need in Bayesian deep learning for computer vision? In Advances in Neural Information Processing Systems, 2017

  28. [28]

    3D Gaussian splatting for real-time radiance field rendering

    Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3D Gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics, 42(4):139–1, 2023

  29. [29]

    Towards generalizable scene change detection

    Jae-Woo Kim and Ue-Hwan Kim. Towards generalizable scene change detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24463–24473, 2025

  30. [30]

    SplatPose & detect: Pose-agnostic 3D anomaly detection

    Mathis Kruse, Marco Rudolph, Dominik Woiwode, and Bodo Rosenhahn. SplatPose & detect: Pose-agnostic 3D anomaly detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 3950–3960, 2024

  31. [31]

    Accurate 3D comparison of complex topography with terrestrial laser scanner: Application to the Rangitikei canyon (N-Z)

    Dimitri Lague, Nicolas Brodu, and Jérôme Leroux. Accurate 3D comparison of complex topography with terrestrial laser scanner: Application to the Rangitikei canyon (N-Z). ISPRS Journal of Photogrammetry and Remote Sensing, 82:10–26, 2013

  32. [32]

    Robust scene change detection using visual foundation models and cross-attention mechanisms

    Chun-Jung Lin, Sourav Garg, Tat-Jun Chin, and Feras Dayoub. Robust scene change detection using visual foundation models and cross-attention mechanisms. In IEEE International Conference on Robotics and Automation, pages 8337–8343, 2025

  33. [33]

    SplatPose+: Real-time image-based pose-agnostic 3D anomaly detection

    Yizhe Liu, Yan Song Hu, Yuhao Chen, and John Zelek. SplatPose+: Real-time image-based pose-agnostic 3D anomaly detection. In European Conference on Computer Vision Workshops, pages 378–391, 2024

  34. [34]

    3D VSG: Long-term semantic scene change prediction through 3D variable scene graphs

    Samuel Looper, Javier Rodriguez-Puigvert, Roland Siegwart, Cesar Cadena, and Lukas Schmid. 3D VSG: Long-term semantic scene change prediction through 3D variable scene graphs. In IEEE International Conference on Robotics and Automation, pages 8179–8186, 2023

  35. [35]

    3DGS-CD: 3D Gaussian splatting-based change detection for physical object rearrangement

    Ziqi Lu, Jianbo Ye, and John Leonard. 3DGS-CD: 3D Gaussian splatting-based change detection for physical object rearrangement. IEEE Robotics and Automation Letters, 2025

  36. [36]

    Gaussian splatting SLAM

    Hidenobu Matsuki, Riku Murai, Paul HJ Kelly, and Andrew J Davison. Gaussian splatting SLAM. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 18039–18048, 2024

  37. [37]

    DINOv2: Learning Robust Visual Features without Supervision

    Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, et al. DINOv2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193, 2023

  38. [38]

    3d change localization and captioning from dynamic scans of indoor scenes

    Yue Qiu, Shintaro Yamamoto, Ryosuke Yamada, Ryota Suzuki, Hirokatsu Kataoka, Kenji Iwata, and Yutaka Satoh. 3d change localization and captioning from dynamic scans of indoor scenes. In IEEE/CVF Winter Conference on Applications of Computer Vision, pages 1176–1185, 2023

  39. [39]

    SAM 2: Segment Anything in Images and Videos

    Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloe Rolland, Laura Gustafson, et al. SAM 2: Segment anything in images and videos. arXiv preprint arXiv:2408.00714, 2024

  40. [40]

    FastGS: Training 3D Gaussian splatting in 100 seconds

    Shiwei Ren, Tianci Wen, Yongchun Fang, and Biao Lu. FastGS: Training 3D Gaussian splatting in 100 seconds. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2026

  41. [41]

    The change you want to see

    Ragav Sachdeva and Andrew Zisserman. The change you want to see. In IEEE/CVF Winter Conference on Applications of Computer Vision, pages 3993–4002, 2023

  42. [42]

    The change you want to see (now in 3D)

    Ragav Sachdeva and Andrew Zisserman. The change you want to see (now in 3D). In IEEE/CVF International Conference on Computer Vision Workshops, pages 2060–2069, 2023

  43. [43]

    Change detection from a street image pair using CNN features and superpixel segmentation

    Ken Sakurada and Takayuki Okatani. Change detection from a street image pair using CNN features and superpixel segmentation. In British Machine Vision Conference, pages 61.1–61.12, Swansea, 2015. British Machine Vision Association

  44. [44]

    Structure-from-motion revisited

    Johannes L Schonberger and Jan-Michael Frahm. Structure-from-motion revisited. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4104–4113, 2016

  45. [45]

    Exploratory Data Analysis

    John W. Tukey. Exploratory Data Analysis. Addison-Wesley, Reading, MA, 1977

  46. [46]

    ChangeNet: A deep learning architecture for visual change detection

    Ashley Varghese, Jayavardhana Gubbi, Akshaya Ramaswamy, and P Balamuralidhar. ChangeNet: A deep learning architecture for visual change detection. In European Conference on Computer Vision Workshops, 2018

  47. [47]

    Image quality assessment: from error visibility to structural similarity

    Zhou Wang, A.C. Bovik, H.R. Sheikh, and E.P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004

  48. [48]

    City-scale scene change detection using point clouds

    Zi Jian Yew and Gim Hee Lee. City-scale scene change detection using point clouds. In IEEE International Conference on Robotics and Automation, pages 13362–13369, 2021

  49. [49]

    3D scene change modeling with consistent multi-view aggregation

    Zirui Zhou, Junfeng Ni, Shujie Zhang, Yixin Chen, and Siyuan Huang. 3D scene change modeling with consistent multi-view aggregation. In International Conference on 3D Vision, 2026

  50. [50]

    EWA splatting

    Matthias Zwicker, Hanspeter Pfister, Jeroen Van Baar, and Markus Gross. EWA splatting. IEEE Transactions on Visualization and Computer Graphics, 8(03):223–238, 2002