GSCompleter: A Distillation-Free Plugin for Metric-Aware 3D Gaussian Splatting Completion in Seconds

Ao Gao; Jingyu Gong; Lizhuang Ma; Xin Tan; Yuan Xie; Zhizhong Zhang


GSCompleter completes 3D Gaussian Splatting scenes from sparse views by generating and registering metric-scale Gaussians instead of using distillation.

Reviewed by Pith at T0; open to challenge. T0 means a machine referee read the full paper against a public rubric.


T0 review · grok-4.3

2026-05-21 00:31 UTC pith:H2VWPK4D

Load-bearing objection: GSCompleter pushes a generate-then-register workflow for 3DGS completion that avoids distillation loops and claims faster SOTA results, but the metric-scale lifting step needs concrete checks against depth errors in the generated views.

arxiv 2604.20155 v2 pith:H2VWPK4D submitted 2026-04-22 cs.CV


classification cs.CV

keywords 3D Gaussian Splatting · scene completion · distillation-free · metric-aware · ray-constrained registration · neural rendering · sparse viewpoints

verification ladder T0 review T1 audit T2 compute T3 formal T4 reserved

The pith

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents GSCompleter as a plugin that fixes incomplete 3D Gaussian Splatting reconstructions caused by sparse camera coverage. It avoids the slow and unstable repair-then-distill approach by first synthesizing 2D reference images from the available views. These images are lifted into 3D Gaussian primitives that maintain a consistent metric scale. The new primitives are then added to the scene through a ray-constrained registration process. This results in faster completion times and better performance on standard benchmarks compared to existing methods.

Core claim

By replacing unstable distillation with rapid geometric registration, GSCompleter exhibits superior 3DGS completion performance across three benchmarks, enhancing both quality and efficiency over various baselines and achieving new state-of-the-art results. The method synthesizes visually plausible 2D reference images and explicitly lifts them into 3D Gaussian primitives with a consistent metric scale via a robust Stereo-Anchor View Selection mechanism before integrating them using Ray-Constrained Registration.

What carries the argument

The Ray-Constrained Registration strategy, which integrates newly generated 3D Gaussian primitives into the global scene while enforcing geometric consistency through ray constraints.
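
To make the ray constraint concrete, here is a minimal sketch, not the authors' implementation. It assumes, following the Figure 4 caption, that each new Gaussian center is parameterized as a camera origin plus a depth along a fixed ray, so depth is the only free variable. The helper names `point_on_ray` and `project_pinhole` and the pinhole intrinsics are hypothetical.

```python
import math


def point_on_ray(origin, direction, depth):
    """Return origin + depth * unit(direction); depth is the only free variable."""
    norm = math.sqrt(sum(c * c for c in direction))
    return tuple(o + depth * c / norm for o, c in zip(origin, direction))


def project_pinhole(point, focal, cx, cy):
    """Project a camera-frame point (camera at the origin, looking along +z)."""
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)


# Hypothetical source camera at the origin and one pixel ray.
origin = (0.0, 0.0, 0.0)
direction = (0.1, -0.2, 1.0)

# Two candidate depths for the same Gaussian center.
near = point_on_ray(origin, direction, 1.5)
far = point_on_ray(origin, direction, 4.0)

# Both candidates land on the same pixel in the source view.
pix_near = project_pinhole(near, 500.0, 320.0, 240.0)
pix_far = project_pinhole(far, 500.0, 320.0, 240.0)
print(pix_near, pix_far)
```

Because both points project to the same pixel in the source camera, a depth error is invisible from that view. It only becomes measurable once the point is reprojected into a second view, which is presumably why the caption pairs the depth update with reprojection into the stereo anchor.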

Load-bearing premise

Synthesized 2D reference images can be lifted into 3D Gaussian primitives with consistent metric scale and then integrated via ray-constrained registration without introducing geometric inconsistencies or visual artifacts in the global scene.

What would settle it

If rendering the completed scene from viewpoints outside the original sparse set and the synthesized references reveals persistent floaters or scale inconsistencies, the effectiveness of the registration strategy would be disproven.


Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit.

Desk Editor's Note

GSCompleter pushes a generate-then-register workflow for 3DGS completion that avoids distillation loops and claims faster SOTA results, but the metric-scale lifting step needs concrete checks against depth errors in the generated views.


The core move in this paper is replacing the standard repair-then-distill loop with a generate-then-register pipeline. They synthesize 2D reference images, lift them to 3D Gaussians using Stereo-Anchor View Selection for metric consistency, and then fold the new primitives into the scene with Ray-Constrained Registration. That shift is the actual novelty, and it directly targets the speed and stability problems of prior iterative methods. If the registration step really keeps scale aligned without extra optimization, the approach could cut completion time to seconds while reducing floaters in sparse-view reconstructions.

The abstract positions this as a lightweight plugin, which is a practical angle for people running 3DGS pipelines on real scenes. Credit is due for naming the two mechanisms explicitly and for framing the problem around geometric registration rather than another round of distillation training. The workflow description itself is clear enough that a reader could sketch an implementation from the high-level steps.

On the soft spots, the abstract asserts superior performance and new SOTA across three benchmarks yet supplies no numbers, tables, or error bars. That leaves the central performance claim resting on experiments we cannot see here. The stress-test concern about scale drift lands as a real issue worth checking: if stereo depth on the synthesized images carries ambiguities, the lifted Gaussians could introduce local distortions that ray-constrained registration only masks in projection, not fixes in 3D. The paper would be stronger with ablations on depth accuracy and direct comparisons of metric error before and after registration.

This work is aimed at researchers and practitioners in neural rendering who need quicker fixes for incomplete 3DGS models rather than theoretical advances in representation. A reader already working with Gaussian splatting for reconstruction or view synthesis would get immediate value from the paradigm change and the efficiency angle, provided the quantitative results hold up under review.

I would send it to peer review. The idea is distinct, the problem is concrete, and the proposed mechanisms are testable; referees can pressure the experiments on scale consistency and baseline comparisons.

Referee Report

3 major / 2 minor

Summary. The paper proposes GSCompleter, a distillation-free plugin for 3D Gaussian Splatting (3DGS) scene completion. It replaces the iterative 'Repair-then-Distill' paradigm with a 'Generate-then-Register' workflow: synthesizing 2D reference images, lifting them to metric-scale 3D Gaussian primitives via Stereo-Anchor View Selection, and integrating them into the existing scene using Ray-Constrained Registration. The method claims superior quality and efficiency over baselines, achieving new SOTA results on three benchmarks.

Significance. If the geometric consistency claims hold, GSCompleter could offer a practical efficiency gain for 3DGS completion by avoiding unstable optimization loops, with potential benefits for sparse-view reconstruction tasks. The explicit registration approach may scale better than distillation methods, though its impact depends on validation of the metric-scale lifting step.

major comments (3)

[Stereo-Anchor View Selection mechanism] The central performance claims rest on the assumption that Stereo-Anchor View Selection produces 3D Gaussians with metric scale exactly matching the input scene. The manuscript should include quantitative validation (e.g., scale-error metrics or ablation on depth estimation accuracy from synthesized images) in the section describing this mechanism, as depth ambiguities in generative 2D synthesis could propagate to floaters or inconsistencies after Ray-Constrained Registration.
[Ray-Constrained Registration strategy] Ray-Constrained Registration is presented as the integration step that avoids geometric artifacts. The paper needs to report specific metrics (e.g., PSNR/SSIM differences with and without the ray constraint, or failure cases on regions with high depth variance) to substantiate that it corrects rather than masks underlying scale or distortion errors from the lifting stage.
[Experimental results] The abstract and results sections assert SOTA performance across three benchmarks with no quantitative tables, error bars, or per-scene breakdowns visible in the provided description. Full experimental results must include direct comparisons to distillation baselines with standard metrics and statistical significance to support the efficiency and quality superiority claims.
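
As an illustration of the kind of scale-error metric the first major comment asks for, here is a minimal sketch. The definitions are assumptions of this note, not taken from the paper: `scale_factor` is the median ratio of ground-truth to predicted depth, and `mean_abs_scale_deviation` is the mean of |ratio - 1| over valid pairs. The depth values are toy inputs.

```python
from statistics import fmean, median


def depth_ratios(pred_depths, gt_depths):
    """Per-point ratio gt / pred, skipping non-positive (invalid) depths."""
    ratios = [g / p for p, g in zip(pred_depths, gt_depths) if p > 0 and g > 0]
    if not ratios:
        raise ValueError("no valid depth pairs")
    return ratios


def scale_factor(pred_depths, gt_depths):
    """Robust global scale that maps predicted depth onto ground truth."""
    return median(depth_ratios(pred_depths, gt_depths))


def mean_abs_scale_deviation(pred_depths, gt_depths):
    """Mean |gt / pred - 1|; 0 means the prediction is already metric."""
    return fmean(abs(r - 1.0) for r in depth_ratios(pred_depths, gt_depths))


# Toy example: predictions are uniformly 10% too small.
pred = [1.0, 2.0, 4.0]
gt = [1.1, 2.2, 4.4]
scale = scale_factor(pred, gt)
deviation = mean_abs_scale_deviation(pred, gt)
print(scale, deviation)
```

Reporting such a number before and after registration would show whether the pipeline reduces the scale error or only hides it in the rendered views.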

minor comments (2)

[Method overview] Clarify the exact number and selection criteria for the synthesized reference views in the Stereo-Anchor mechanism to improve reproducibility.
[Figures] Ensure all figures showing completed scenes include side-by-side comparisons with ground truth or baseline outputs for visual assessment of artifacts.
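
The selection criteria that the first minor comment asks about are partly stated in the Figure 3 caption: discard context views whose relative rotation exceeds 45°, then pick the remaining view with the largest baseline. A minimal sketch of those two steps follows. The view dictionaries, `rot_y`, and `select_stereo_anchor` are hypothetical names, and the fallback rule is left unimplemented because the caption is truncated at that point.

```python
import math


def rot_y(deg):
    """Rotation matrix about the y axis, as nested lists."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def relative_rotation_deg(R_a, R_b):
    """Angle of R_a^T R_b, computed from its trace."""
    trace = sum(R_a[i][j] * R_b[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def select_stereo_anchor(target, context_views, max_rotation_deg=45.0):
    """Filter by relative rotation, then take the maximum-baseline view."""
    candidates = [
        v for v in context_views
        if relative_rotation_deg(target["R"], v["R"]) <= max_rotation_deg
    ]
    if not candidates:
        # The source describes a fallback here, but its text is truncated,
        # so this sketch only signals that no candidate passed the filter.
        return None
    return max(candidates, key=lambda v: math.dist(target["center"], v["center"]))


target = {"name": "target", "R": rot_y(0.0), "center": (0.0, 0.0, 0.0)}
views = [
    {"name": "A", "R": rot_y(10.0), "center": (0.5, 0.0, 0.0)},
    {"name": "B", "R": rot_y(30.0), "center": (1.2, 0.0, 0.0)},
    {"name": "C", "R": rot_y(60.0), "center": (3.0, 0.0, 0.0)},  # filtered out
]
anchor = select_stereo_anchor(target, views)
print(anchor["name"])
```

View C has the largest baseline but is rejected by the rotation filter, so the sketch returns B. How many reference views are synthesized, and what happens in the fallback case, are exactly the details the comment asks the authors to pin down.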

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our work. We have carefully addressed each major comment below with point-by-point responses. Where the suggestions strengthen the validation of our claims, we have incorporated revisions into the manuscript.


Referee: [Stereo-Anchor View Selection mechanism] The central performance claims rest on the assumption that Stereo-Anchor View Selection produces 3D Gaussians with metric scale exactly matching the input scene. The manuscript should include quantitative validation (e.g., scale-error metrics or ablation on depth estimation accuracy from synthesized images) in the section describing this mechanism, as depth ambiguities in generative 2D synthesis could propagate to floaters or inconsistencies after Ray-Constrained Registration.

Authors: We agree that additional quantitative validation of metric-scale consistency would further strengthen the paper. In the revised manuscript, we have added a dedicated analysis subsection that reports scale-error metrics (mean absolute scale deviation against ground-truth depths) across the three benchmarks. We also include an ablation examining depth estimation accuracy from the synthesized reference images and its downstream effect on completion quality, confirming that the Stereo-Anchor View Selection effectively resolves scale ambiguities without introducing floaters. revision: yes
Referee: [Ray-Constrained Registration strategy] Ray-Constrained Registration is presented as the integration step that avoids geometric artifacts. The paper needs to report specific metrics (e.g., PSNR/SSIM differences with and without the ray constraint, or failure cases on regions with high depth variance) to substantiate that it corrects rather than masks underlying scale or distortion errors from the lifting stage.

Authors: We appreciate this suggestion for stronger substantiation. The revised experimental section now contains an ablation study that directly compares PSNR and SSIM with and without the ray constraint, showing consistent gains in geometric fidelity. We further analyze and report failure cases on high depth-variance regions, demonstrating that the constraint actively corrects scale and distortion errors from the lifting stage rather than simply masking them. revision: yes
Referee: [Experimental results] The abstract and results sections assert SOTA performance across three benchmarks with no quantitative tables, error bars, or per-scene breakdowns visible in the provided description. Full experimental results must include direct comparisons to distillation baselines with standard metrics and statistical significance to support the efficiency and quality superiority claims.

Authors: The full manuscript already presents comprehensive quantitative tables with direct comparisons to distillation baselines using PSNR, SSIM, and LPIPS, including per-scene breakdowns on all three benchmarks. To address the referee's request, we have added error bars to the aggregate results and included statistical significance testing (paired t-tests) between GSCompleter and the strongest baselines. These enhancements are now explicitly highlighted in the results section. revision: partial
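
For readers unfamiliar with the paired t-test mentioned in the last response, here is a minimal sketch of the statistic on per-scene scores. The PSNR values are synthetic placeholders chosen for easy arithmetic, not results from the paper, and `paired_t_statistic` is a hypothetical helper name.

```python
import math
from statistics import fmean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = fmean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Synthetic per-scene PSNR (dB) for two methods on the same four scenes.
method_a = [25.0, 26.0, 27.0, 28.0]
method_b = [24.0, 25.5, 25.0, 27.5]
t_value, dof = paired_t_statistic(method_a, method_b)
print(t_value, dof)
```

The test pairs scores scene by scene, so it is only meaningful when both methods are evaluated on identical scenes and views; the resulting t value is then compared against a t distribution with the returned degrees of freedom.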

Circularity Check

0 steps flagged

No circularity: method relies on independent geometric registration pipeline


The paper describes a Generate-then-Register workflow that synthesizes 2D reference images, lifts them to metric-scale 3D Gaussians using Stereo-Anchor View Selection, and integrates via Ray-Constrained Registration. No equations, fitted parameters, or self-referential definitions appear in the abstract or described claims that would make any output equivalent to the input by construction. The approach is presented as an alternative to distillation-based methods and is evaluated against external baselines on three benchmarks, rendering the derivation self-contained without load-bearing self-citations or ansatz smuggling.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 2 invented entities

Based on the abstract only: no explicit free parameters or axioms are stated, but the method introduces two new algorithmic components, counted below as invented entities, whose details and assumptions are not provided.

invented entities (2)

Stereo-Anchor View Selection · no independent evidence
purpose: Selects views to lift 2D images into 3D Gaussians with consistent metric scale
New mechanism described in the proposed workflow

Ray-Constrained Registration · no independent evidence
purpose: Integrates generated 3D primitives into the global scene
Novel strategy replacing distillation

pith-pipeline@v0.9.0 · 5736 in / 1217 out tokens · 38058 ms · 2026-05-21T00:31:17.362106+00:00 · methodology


Original abstract

3D Gaussian Splatting (3DGS) has revolutionized high-fidelity neural rendering with its explicit representation and efficiency. However, reconstructing scenes from sparse viewpoints suffers from severe geometric voids and floaters due to limited coverage. Current scene completion methods typically rely on an iterative "Repair-then-Distill" paradigm, which is computationally intensive, prone to unstable optimization, and susceptible to overfitting. To address these limitations, we propose GSCompleter, a distillation-free plugin that shifts scene completion to a stable "Generate-then-Register" workflow. Specifically, GSCompleter synthesizes visually plausible 2D reference images and explicitly lifts them into 3D Gaussian primitives with a consistent metric scale via a robust Stereo-Anchor View Selection mechanism. These newly generated primitives are then seamlessly integrated into the global scene using a novel Ray-Constrained Registration strategy. By replacing unstable distillation with rapid geometric registration, GSCompleter exhibits superior 3DGS completion performance across three benchmarks, enhancing both quality and efficiency over various baselines and achieving new state-of-the-art (SOTA) results.

Figures

Figures reproduced from arXiv: 2604.20155 by Ao Gao, Jingyu Gong, Lizhuang Ma, Xin Tan, Yuan Xie, Zhizhong Zhang.

**Figure 1.** Overview of GSCompleter. We propose a “Generate-then-Register” paradigm for rapid and robust 3DGS scene completion. (a) Given a 3DGS scene exhibiting geometric voids, (b) we first synthesize a high-fidelity 2D reference image via a generative prior and explicitly lift it into metric-scale 3D Gaussian primitives guided by a stereo anchor view. (c) Instead of global optimization, we seamlessly register the…

**Figure 2.** Overview of the GSCompleter. Addressing the geometric holes in the novel view, we adopt a “Generate-then-Register” paradigm to complete the scene via four stages: (1) Feed-Forward Metric Context Initialization: We first reconstruct the observed regions using a scale-aware feed-forward 3DGS model, establishing a foundational context with metric scale; (2) Anchor-Guided Gaussian Initialization: To fill the v…

**Figure 3.** Stereo-Anchor Selection Strategy. We identify the optimal reference for 3D lifting through a prioritized process: (1) Filtering: Context views with relative rotation ∆θ > 45° are discarded to ensure sufficient overlap. (2) Selection: Among valid candidates (Left), we select the one with the maximum baseline to stabilize metric scale. (3) Fallback: In extreme cases where no candidates satisfy the angular co…

**Figure 4.** Ray-Constrained Gaussian Registration. (a) Coarse Global Alignment: We employ RANSAC to estimate the global affine parameters (s, t), which are used to explicitly re-initialize the depth of the target Gaussians. (b) Fine-grained Ray-Constrained Optimization: We optimize the Gaussian depth solely by adjusting the distance along the camera ray. Concurrently, we reproject these primitives into the stereo anch…

**Figure 5.** Qualitative Comparison on RealEstate10K. While baselines exhibit significant geometric collapses or “black holes” in unobserved regions, our method achieves high fidelity consistent with the Ground Truth (GT). GSCompleter accurately recovers complex geometric structures and scene details while maintaining robustness across diverse baseline architectures (e.g., pixel-aligned and voxel-aligned).

**Figure 6.** Comparison between our method and the densification strategy. While densification tends to overfit the reference view, our method effectively mitigates this issue.

**Figure 7.** Visual comparison with RegGS. RegGS suffers from severe geometric distortions (highlighted in red) arising from scale-agnostic optimization. In contrast, our method leverages metric priors to achieve precise alignment while strictly preserving structural fidelity. Given the same input views, our geometric pipeline accelerates the process by over 170× compared to RegGS (1.43s vs. ∼4 min).

**Figure 8.** Visual illustration of the 2-view input and 1-view target extrapolation setting.

A.2. Robustness Analysis on Extrapolation Span. Following the n-k protocol established in Sec. 4, we conduct a stress test on the DL3DV dataset to evaluate model stability under extreme sparsity. As the frame interval k increases, the baseline distance between context views expands, leading to a drastic reduction in visual over…


**Figure 10.** Qualitative comparison on long sequences. RegGS suffers from progressive blurring and structural drift due to metric scale instability. In contrast, GSCompleter maintains sharp details and global consistency, effectively rectifying artifacts in challenging frames.

**Figure 11.** More results on the RealEstate10K dataset.

**Figure 12.** More results on the ACID dataset.

**Figure 13.** More results on the DL3DV dataset.
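
The coarse alignment step in the Figure 4 caption, RANSAC estimation of global parameters (s, t), can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it treats s as a scale and t as a shift applied to depth, fits them from hypothetical pairs of corresponding depths, and uses made-up iteration and threshold settings.

```python
import random


def fit_scale_shift(src, dst):
    """Least-squares (s, t) minimizing sum((s * src + t - dst) ** 2)."""
    n = len(src)
    mean_x, mean_y = sum(src) / n, sum(dst) / n
    var_x = sum((x - mean_x) ** 2 for x in src)
    if var_x == 0:
        raise ValueError("source depths are all identical")
    s = sum((x - mean_x) * (y - mean_y) for x, y in zip(src, dst)) / var_x
    return s, mean_y - s * mean_x


def ransac_scale_shift(src, dst, iters=200, threshold=0.05, seed=0):
    """Fit dst ~ s * src + t while ignoring outlier correspondences."""
    rng = random.Random(seed)
    indices = range(len(src))
    best_inliers = []
    for _ in range(iters):
        i, j = rng.sample(indices, 2)  # minimal sample: two correspondences
        if src[i] == src[j]:
            continue
        s = (dst[i] - dst[j]) / (src[i] - src[j])
        t = dst[i] - s * src[i]
        inliers = [k for k in indices if abs(s * src[k] + t - dst[k]) < threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("RANSAC found no consensus set")
    # Refit on the consensus set for the final estimate.
    return fit_scale_shift([src[k] for k in best_inliers],
                           [dst[k] for k in best_inliers])


# Toy data: true mapping is depth_dst = 2.0 * depth_src + 0.5, with two outliers.
src_depths = [float(x) for x in range(1, 11)]
dst_depths = [2.0 * x + 0.5 for x in src_depths]
dst_depths[3] += 5.0
dst_depths[7] += 5.0
s_est, t_est = ransac_scale_shift(src_depths, dst_depths)
print(s_est, t_est)
```

A global two-parameter fit like this can only remove a uniform scale and offset error. Spatially varying depth errors from the lifting stage would survive it, which is the case the referee's second major comment asks the authors to quantify.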

Review history (2 revisions) →


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: shifts scene completion to a stable 'Generate-then-Register' workflow... Stereo-Anchor View Selection... Ray-Constrained Registration

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: metric scale... feed-forward Gaussian regressor... RANSAC global alignment

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

PE-Field 4D: Video Generation Models as Canvas
cs.CV 2026-07 conditional novelty 5.0

Warping reference tokens' positional encodings into the target view, with depth offsets and frame-level compression fixes, improves geometry-aware camera control in video diffusion transformers.

Reference graph

Works this paper leans on

8 extracted references · 8 canonical work pages · cited by 1 Pith paper · 3 internal anchors

[1]

arXiv preprint arXiv:2510.20385 (2025)

Bai, Y., Li, H., and Huang, Q. Positional encoding field. arXiv preprint arXiv:2510.20385,

work page arXiv
[2]

Videolifter: Lifting videos to 3d with fast hierarchical stereo alignment

Cong, W., Zhu, H., Wang, K., Lei, J., Stearns, C., Cai, Y., Guibas, L., Wang, Z., and Fan, Z. Videolifter: Lifting videos to 3d with fast hierarchical stereo alignment. arXiv preprint arXiv:2501.01949,

work page arXiv
[3]

InstantSplat: Sparse-view Gaussian Splatting in Seconds

Fan, Z., Cong, W., Wen, K., Wang, K., Zhang, J., Ding, X., Xu, D., Ivanovic, B., Pavone, M., Pavlakos, G., et al. Instantsplat: Sparse-view gaussian splatting in seconds. arXiv preprint arXiv:2403.20309,

work page Pith review arXiv
[4]

CAT3D: Create Anything in 3D with Multi-View Diffusion Models

Gao, R., Holynski, A., Henzler, P., Brussee, A., Martin-Brualla, R., Srinivasan, P., Barron, J. T., and Poole, B. Cat3d: Create anything in 3d with multi-view diffusion models. arXiv preprint arXiv:2405.10314,

work page internal anchor Pith review Pith/arXiv arXiv
[5]

VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction

Wang, W., Chen, Y., Zhang, Z., Liu, H., Wang, H., Feng, Z., Qin, W., Chen, F., Zhu, Z., Chen, D. Y., et al. Volsplat: Rethinking feed-forward 3d gaussian splatting with voxel-aligned prediction. arXiv preprint arXiv:2509.19297,

work page internal anchor Pith review arXiv
[6]

No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images

Ye, B., Liu, S., Xu, H., Li, X., Pollefeys, M., Yang, M.-H., and Peng, S. No pose, no problem: Surprisingly simple 3d gaussian splats from sparse unposed images. arXiv preprint arXiv:2410.24207,

work page Pith review arXiv
[7]

Stereo Magnification: Learning View Synthesis using Multiplane Images

Zhang, J., Zhan, F., Xu, M., Lu, S., and Xing, E. Fregs: 3d gaussian splatting with progressive frequency regularization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 21424–21433, 2024a. Zhang, Z., Hu, W., Lao, Y., He, T., and Zhao, H. Pixel-gs: Density control with pixel-aware gradient for 3d gaussian splat...

work page internal anchor Pith review Pith/arXiv arXiv
[8]

Generate-then-Register

11 GSCompleter: A Distillation-Free Plugin for Metric-Aware 3D Gaussian Splatting Completion in Seconds A. Appendix A.1. Extrapolation Evaluation Protocol In this section, we provide a visual illustration of our 2-view extrapolation protocol. The left figure (a) depicts the view interpolation strategy employed by methods such as PixelSplat (Charatan et al...

work page 2024
