pith. machine review for the scientific record.

arxiv: 2604.27590 · v1 · submitted 2026-04-30 · 💻 cs.CV

Recognition: unknown

Fake3DGS: A Benchmark for 3D Manipulation Detection in Neural Rendering

Authors on Pith: no claims yet

Pith reviewed 2026-05-07 09:07 UTC · model grok-4.3

classification 💻 cs.CV
keywords 3D fake detection · Gaussian splatting · neural rendering · manipulation detection · benchmark dataset · multi-view coherence · 3D scene editing

The pith

3D manipulations in Gaussian splatting scenes defeat standard 2D fake detectors, while a new method using multi-view consistency and scene features performs substantially better.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a benchmark dataset of 3D Gaussian splatting scenes that include both original renders and versions altered through controlled changes to geometry, appearance, and spatial layout. It shows that leading single-image detectors struggle to identify the altered versions even though the images retain high visual quality. The authors then describe a detection approach that draws on consistency across multiple rendered views together with information taken directly from the underlying 3D representation. This combination yields clearly higher detection rates on the same test images. The result points to the need for authenticity methods that operate on 3D structure rather than on 2D pixel patterns alone.
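
The multi-view consistency cue at the heart of this argument can be sketched in a few lines. This is an illustrative reconstruction, not the paper's implementation: the 2D feature backbone is left abstract, and the specific score is a hypothetical stand-in for whatever coherence measure the authors use.

```python
import numpy as np

def multiview_coherence_score(view_features: np.ndarray) -> float:
    """Mean pairwise cosine similarity across per-view embeddings.

    `view_features` is an (n_views, dim) array of features extracted
    from each rendered view by any 2D backbone. Edits that break
    cross-view consistency should depress this score.
    """
    f = view_features / np.linalg.norm(view_features, axis=1, keepdims=True)
    sim = f @ f.T                      # pairwise cosine similarities
    n = len(f)
    # average over off-diagonal entries only (exclude self-similarity)
    return float((sim.sum() - np.trace(sim)) / (n * (n - 1)))
```

A perfectly view-consistent scene scores 1.0; mutually unrelated views score near 0.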

Core claim

Manipulations performed inside 3D Gaussian splatting models produce photorealistic images that current state-of-the-art 2D detectors cannot reliably separate from authentic renders. The authors supply a public dataset of paired original and manipulated scenes together with a 3D-aware detector that exploits multi-view coherence and features derived from the Gaussian representation, demonstrating that these additional cues raise detection performance markedly.

What carries the argument

The Fake3DGS benchmark dataset of original and manipulated 3D Gaussian splatting scenes, paired with a detector that uses multi-view coherence and Gaussian-derived features to identify edits.
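
As a rough illustration of what "Gaussian-derived features" might look like, the sketch below computes per-scene summary statistics over Gaussian primitives. The chosen statistics are hypothetical; the paper's actual feature groups may differ.

```python
import numpy as np

def gaussian_scene_features(means: np.ndarray,
                            scales: np.ndarray,
                            opacities: np.ndarray) -> np.ndarray:
    """Toy per-scene statistics over Gaussian primitives.

    means: (N, 3) centers, scales: (N, 3) per-axis scales,
    opacities: (N,) opacity values. Localized edits tend to shift
    the distribution of opacity and scale, which these summary
    statistics are meant to capture.
    """
    return np.array([
        opacities.mean(), opacities.std(),      # opacity distribution
        np.log(scales).mean(), np.log(scales).std(),  # log-scale spread
        means.std(axis=0).mean(),               # spatial extent
    ])
```

In a full detector, a vector like this would be concatenated with multi-view cues and fed to a classifier.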

If this is right

  • Standard 2D detectors will require supplementation whenever content originates from editable 3D neural representations.
  • Future authenticity systems for rendered media will need either multiple views or direct access to the 3D model parameters.
  • The three categories of controlled manipulation provide a concrete testbed for developing and comparing other 3D-aware detectors.
  • Detection accuracy rises when consistency checks across views are added to single-image analysis.
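
The last bullet can be made concrete with a toy score-fusion rule: blend a single-image detector's output with a cross-view inconsistency term. The mixing weight `alpha` is a hypothetical parameter, not a value from the paper.

```python
from statistics import fmean

def fused_fake_score(per_view_scores, coherence, alpha=0.5):
    """Blend the mean single-view fake probability with a
    view-inconsistency term (1 - coherence).

    per_view_scores: fake probabilities from a 2D detector, one per view.
    coherence: multi-view consistency in [0, 1]; low values are suspicious.
    """
    return (1 - alpha) * fmean(per_view_scores) + alpha * (1.0 - coherence)
```

Holding the 2D scores fixed, lower coherence pushes the fused score up, which is exactly the behavior the bullet describes.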

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same multi-view approach could be tested on other neural rendering pipelines such as NeRF variants.
  • The benchmark could be extended to include temporal sequences to address detection in 3D video or animated content.
  • A practical detector might combine the new 3D cues with existing 2D tools to handle mixed media pipelines.
  • Real captured scenes edited in commercial 3D software would serve as a stronger test of whether the coherence features survive domain shift.

Load-bearing premise

The controlled changes to geometry, appearance, and layout in the dataset reflect the properties of real-world 3D forgeries while preserving visual realism, and the multi-view coherence features work on scenes outside this particular collection.

What would settle it

Apply the proposed detector to a fresh set of 3D scenes that were edited independently by other researchers and then rendered with Gaussian splatting; if accuracy drops sharply compared with the benchmark, the method does not generalize.
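
Framed as a check, this test reduces to measuring the accuracy drop between the benchmark and an external evaluation set. The 10-point failure threshold below is an arbitrary illustration, not a criterion from the paper.

```python
def generalization_gap(benchmark_acc: float,
                       external_acc: float,
                       tol: float = 0.10) -> tuple[float, bool]:
    """Accuracy drop from Fake3DGS to independently edited scenes,
    and whether it exceeds a hypothetical failure threshold `tol`.
    """
    gap = benchmark_acc - external_acc
    return gap, gap > tol
```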

Figures

Figures reproduced from arXiv: 2604.27590 by Davide Di Nucci, Guido Borghi, Riccardo Catalini, Roberto Vezzani.

Figure 1: Sample renderings from the Fake3DGS dataset. Below each view of the scene, the corresponding prompt used to generate edited samples is reported.
Figure 2: Ablation over Gaussian feature groups. The change in accuracy (in percentage points) obtained by removing each feature group from the input, relative to the full-feature model.
Figure 3: Sample renderings of the data in the dataset. Below each image, the corresponding prompt used to generate edited samples is shown.
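
For reference, the quantity plotted in Figure 2 (accuracy change in percentage points after removing a feature group) could be computed as follows; the group names and accuracies in the example are illustrative, not the paper's numbers.

```python
def ablation_deltas(full_acc: float, acc_without: dict) -> dict:
    """Percentage-point accuracy change from removing each feature
    group, relative to the full-feature model.

    full_acc: accuracy of the full-feature model (fraction in [0, 1]).
    acc_without: {group_name: accuracy with that group removed}.
    """
    return {group: round((acc - full_acc) * 100, 1)
            for group, acc in acc_without.items()}
```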
read the original abstract

Recent advances in 3D reconstruction and neural rendering, particularly 3D Gaussian Splatting, make it feasible and simple to edit 3D scenes and re-render them as highly realistic images. Therefore, security concerns arise regarding the authenticity of 3D content. Despite this threat, 3D fake detection remains largely unexplored in the literature, and most existing work is limited to 2D space. Therefore, in this paper, we formalize the concept of 3D fake detection and introduce Fake3DGS, a dataset of 3D Gaussian splatting scenes and corresponding rendered views, where fake images are produced by controlled manipulations of geometry, appearance, and spatial layout, while preserving high visual realism. Using this benchmark, we demonstrate that current state-of-the-art 2D detectors struggle to distinguish between original and 3D manipulated images. To bridge this gap, we introduce a 3D-aware detection method that leverages multi-view coherence and features derived from the Gaussian splatting representation. Experimental results demonstrate a substantial improvement in recognizing modified 3D content, underscoring the validity of the new dataset and the necessity for authenticity assessment techniques that extend beyond 2D evidence. Code and data are publicly released for future investigations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper formalizes 3D fake detection for neural rendering and introduces the Fake3DGS benchmark: a collection of 3D Gaussian Splatting scenes together with rendered views in which fake images are generated by controlled manipulations of geometry, appearance, and spatial layout while preserving high visual realism. Using this benchmark the authors report that state-of-the-art 2D detectors fail to distinguish original from manipulated images and propose a 3D-aware detector that exploits multi-view coherence together with features derived from the underlying Gaussian splatting representation, claiming a substantial improvement in detection performance. Code and data are released publicly.

Significance. If the quantitative claims hold, the work is significant because it identifies an emerging authenticity threat created by editable 3D neural scenes, supplies the first public benchmark and code release for 3D manipulation detection, and demonstrates the limitations of purely 2D forensic methods. The public release of code and data is a clear strength that supports reproducibility and future research in multimedia forensics.

major comments (2)
  1. [Abstract] The central claim that 'current state-of-the-art 2D detectors struggle to distinguish between original and 3D manipulated images' and that the proposed method yields 'a substantial improvement' is unsupported by any numerical results, dataset sizes, error bars, ablation studies, or performance tables. Without these load-bearing details the magnitude and reliability of the reported improvement cannot be assessed.
  2. [Abstract] The assumption that the controlled geometry/appearance/spatial manipulations produce realistic 3D fakes that generalize beyond the synthetic benchmark is not validated by cross-benchmark testing, real-world captured data, or explicit ranges of manipulation severity. Consequently it remains possible that the multi-view coherence and GS-derived features succeed only because of artifacts specific to how Fake3DGS was constructed rather than because they capture general 3D manipulation cues.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the significance of introducing the first benchmark and detection approach for 3D manipulations in neural rendering. We address each major comment point by point below, indicating planned revisions to strengthen the manuscript.

read point-by-point responses
  1. Referee: [Abstract] The central claim that 'current state-of-the-art 2D detectors struggle to distinguish between original and 3D manipulated images' and that the proposed method yields 'a substantial improvement' is unsupported by any numerical results, dataset sizes, error bars, ablation studies, or performance tables. Without these load-bearing details the magnitude and reliability of the reported improvement cannot be assessed.

    Authors: We agree that the abstract would benefit from explicit quantitative support for the claims. The full manuscript (Section 4) presents detailed results on the Fake3DGS benchmark, including performance tables for multiple state-of-the-art 2D detectors versus our multi-view coherence method, dataset statistics (number of scenes, rendered views, and manipulation types), and ablation studies. We will revise the abstract to include key numerical highlights, such as the average detection performance drop for 2D methods and the improvement margin of the proposed 3D-aware detector, while keeping the abstract concise. revision: yes

  2. Referee: [Abstract] The assumption that the controlled geometry/appearance/spatial manipulations produce realistic 3D fakes that generalize beyond the synthetic benchmark is not validated by cross-benchmark testing, real-world captured data, or explicit ranges of manipulation severity. Consequently it remains possible that the multi-view coherence and GS-derived features succeed only because of artifacts specific to how Fake3DGS was constructed rather than because they capture general 3D manipulation cues.

    Authors: The benchmark employs controlled manipulations of geometry, appearance, and layout within 3D Gaussian Splatting scenes, with visual realism preserved and verified through qualitative renderings and perceptual metrics reported in the paper. We acknowledge that explicit cross-benchmark testing on real-world captured data or broader severity ranges is not included in the current experiments. We will add a dedicated limitations paragraph in the discussion section clarifying the synthetic yet realistic scope of Fake3DGS, the rationale for the controlled design to isolate 3D cues, and the public code/data release to enable community-driven extensions to real data. This addresses the concern without requiring new experiments at this stage. revision: partial

Circularity Check

0 steps flagged

No circularity: empirical benchmark and detector evaluation are independent of inputs

full rationale

The paper introduces a new dataset (Fake3DGS) via controlled manipulations of 3DGS scenes and reports experimental results showing 2D detectors fail while a multi-view + GS-feature detector improves performance. These are direct empirical measurements on held-out rendered views, not reductions by construction, fitted parameters renamed as predictions, or self-citation chains. No equations, uniqueness theorems, or ansatzes are invoked in a load-bearing way; the central claims rest on new data and standard detection metrics rather than re-deriving inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that controlled 3D manipulations produce realistic fakes and that multi-view features from Gaussian Splatting provide independent signal for detection.

axioms (1)
  • domain assumption: Manipulations of geometry, appearance, and spatial layout preserve high visual realism in rendered views.
    Invoked when creating the fake images in the benchmark to ensure they remain challenging for detectors.

pith-pipeline@v0.9.0 · 5529 in / 1249 out tokens · 53796 ms · 2026-05-07T09:07:19.634097+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

61 extracted references · 7 canonical work pages · 1 internal anchor
