Who Generated This 3D Asset? Learning Source Attribution for Generative 3D Models

Dacheng Tao; Sihan Ma; Siyuan Liang

arxiv: 2605.18132 · v1 · pith:WTDDUA6Anew · submitted 2026-05-18 · 💻 cs.CV · cs.AI

Pith reviewed 2026-05-20 11:50 UTC · model grok-4.3

keywords: 3D generation, source attribution, generative models, fingerprinting, multi-view fusion, benchmark, content provenance, geometric artifacts

The pith

Generative 3D models leave stable fingerprints that reveal which model created a given asset.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that generative 3D models leave two stable types of fingerprint: cross-view inconsistencies, and structural artifacts visible in geometric statistics and frequency-domain cues. It builds what the authors describe as the first passive source attribution benchmark, covering 22 representative generators, and tests attribution under full supervision, few-shot learning with 1 percent of the training data, and realistic mixed real/synthetic conditions. A hierarchical multi-view multi-modal Transformer fuses appearance, geometric, and frequency features inside each view while relating signals across views. Experiments reach 97.22 percent accuracy with full data and 77.17 percent with 1 percent of the training data, which is fewer than five samples per generator. This matters because it supplies a concrete way to trace the origin of 3D content used in games, robotics, and immersive environments.

Core claim

Modern generative 3D models leave two types of stable fingerprints: cross-view inconsistency and structural artifacts reflected in geometric statistics and frequency-domain cues. By constructing a benchmark with 22 generators under standard, few-shot, and realistic deployment protocols and training a hierarchical multi-view multi-modal Transformer that fuses appearance, geometric, and frequency-domain features within each view while modeling global relationships across views, the work demonstrates that these dispersed signals support source attribution at 97.22 percent accuracy under full supervision and 77.17 percent accuracy with only 1 percent training data.
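One way to make the frequency-domain and cross-view cues concrete is sketched below. This is an illustration only, not the paper's feature extractor: the function names `radial_power_spectrum` and `cross_view_spread`, the 16-bin resolution, and the random arrays standing in for grayscale renders are all assumptions introduced here. The first function computes an azimuthally averaged log-power spectrum of one view; the second measures how much that spectrum varies across views of the same asset, as a crude proxy for cross-view inconsistency.

```python
import numpy as np

def radial_power_spectrum(image: np.ndarray, n_bins: int = 16) -> np.ndarray:
    """Azimuthally averaged log-power spectrum of a 2D grayscale image."""
    h, w = image.shape
    # Remove the mean so the DC term does not dominate, then centre the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image - image.mean()))
    power = np.abs(spectrum) ** 2
    yy, xx = np.indices((h, w))
    radius = np.hypot(yy - h // 2, xx - w // 2)
    r_max = min(h, w) / 2
    # Bin index n_bins collects the corners beyond r_max; it is dropped below.
    bins = np.minimum((radius / r_max * n_bins).astype(int), n_bins)
    sums = np.bincount(bins.ravel(), weights=power.ravel(), minlength=n_bins + 1)
    counts = np.bincount(bins.ravel(), minlength=n_bins + 1)
    return np.log1p(sums[:n_bins] / np.maximum(counts[:n_bins], 1))

def cross_view_spread(views: np.ndarray) -> float:
    """Mean per-bin standard deviation of the spectra across views."""
    spectra = np.stack([radial_power_spectrum(v) for v in views])
    return float(spectra.std(axis=0).mean())

rng = np.random.default_rng(0)
views = rng.random((4, 64, 64))  # stand-in for four grayscale renders of one asset
print(radial_power_spectrum(views[0]).shape)
print(cross_view_spread(views))
```

A spread near zero means the views share almost the same spectral profile. Whether a larger spread is characteristic of a particular generator is the kind of question the benchmark is meant to answer; this sketch makes no claim either way.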

What carries the argument

Hierarchical multi-view multi-modal Transformer that fuses appearance, geometric, and frequency-domain features within each view and models global relationships across views to capture dispersed attribution signals.
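The two-level fusion described above can be sketched at the level of tensor shapes. The code below is a minimal NumPy toy, assuming one token per modality per view, single-head attention, mean pooling, and random untrained weights. It omits residual connections, normalization, multi-head attention, and any view or pose embedding, and none of the names (`attribute`, `init_params`, the 32-dimensional features, 8 views) come from the paper. It only shows the order of operations: fuse modalities inside each view, then relate views to each other, then classify over 22 generators.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over the token axis."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def init_params(dim, n_generators, rng):
    """Random, untrained projection matrices (illustration only)."""
    scale = dim ** -0.5
    return {
        "intra": [rng.normal(scale=scale, size=(dim, dim)) for _ in range(3)],
        "cross": [rng.normal(scale=scale, size=(dim, dim)) for _ in range(3)],
        "head": rng.normal(scale=scale, size=(dim, n_generators)),
    }

def attribute(appearance, geometry, frequency, params):
    """Each input has shape (n_views, dim); returns logits of shape (n_generators,)."""
    # Stack into (n_views, 3, dim): three modality tokens per view.
    tokens = np.stack([appearance, geometry, frequency], axis=1)
    # Level 1: fuse the modality tokens inside each view, then pool per view.
    fused = self_attention(tokens, *params["intra"]).mean(axis=1)
    # Level 2: relate the per-view tokens to each other, then pool over views.
    asset = self_attention(fused, *params["cross"]).mean(axis=0)
    return asset @ params["head"]

rng = np.random.default_rng(0)
params = init_params(dim=32, n_generators=22, rng=rng)
app, geo, freq = (rng.normal(size=(8, 32)) for _ in range(3))
logits = attribute(app, geo, freq, params)
print(logits.shape)  # (22,)
```

Because this sketch has no view-position embedding, reordering the views leaves the logits unchanged. Whether the paper's model encodes camera pose is not stated in the text above.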

Load-bearing premise

The fingerprints remain stable and detectable under realistic deployment constraints including scarce labels, degraded prompts, and mixed real or synthetic assets.

What would settle it

A substantial drop in attribution accuracy when evaluating on assets produced with heavily varied prompts, post-processed geometry, or combinations of generators absent from the original benchmark.

Figures

Figures reproduced from arXiv: 2605.18132 by Dacheng Tao, Sihan Ma, Siyuan Liang.

**Figure 1.** Overview of our attribution framework. Given multi-view renderings and structural priors of a 3D asset, our model learns discriminative fingerprints through intra-view fusion and cross-view reasoning, enabling accurate attribution across 22 generative 3D models.

**Figure 2.** Representative benchmark samples.

**Figure 3.** Dispersed attribution fingerprints in generated 3D assets.

**Figure 4.** Overview of our attribution model. Given multi-view renderings and structural priors from a 3D asset, our model learns dispersed attribution signals across viewpoints and modalities through multi-modal fusion within each single view and cross-view reasoning, achieving strong attribution accuracy and interpretable clustering across 22 generative 3D models.

**Figure 5.** Visualization of attribution fingerprints captured by our framework.

**Figure 6.** Overview of our attribution benchmark.

**Figure 7.** View number ablation.

**Figure 8.** Row-normalized confusion matrix heatmap.
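For readers unfamiliar with the plot type in Figure 8, the sketch below shows how a row-normalized confusion matrix is usually computed. It assumes the conventional meaning (rows index the true generator, columns the predicted one, and each non-empty row sums to one). The helper name `row_normalized_confusion` and the three-class toy labels are made up for illustration; the paper's actual values are not reproduced here.

```python
import numpy as np

def row_normalized_confusion(y_true, y_pred, n_classes):
    """Entry [i, j] is the share of class-i samples that were predicted as class j."""
    counts = np.zeros((n_classes, n_classes), dtype=float)
    # Accumulate one count per (true, predicted) pair; add.at handles repeated pairs.
    np.add.at(counts, (np.asarray(y_true), np.asarray(y_pred)), 1.0)
    row_totals = counts.sum(axis=1, keepdims=True)
    # Rows with no samples stay all-zero instead of producing NaN.
    return np.divide(counts, row_totals, out=np.zeros_like(counts), where=row_totals > 0)

y_true = [0, 0, 1, 1, 2, 2]  # toy labels for three hypothetical generators
y_pred = [0, 1, 1, 1, 2, 0]
matrix = row_normalized_confusion(y_true, y_pred, n_classes=3)
print(matrix)
print(matrix.diagonal().mean())  # mean per-class recall
```

The diagonal holds per-generator recall, so off-diagonal mass in a row shows which other generators a given source is mistaken for.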

Original abstract

Generative 3D models are deployed in gaming, robotics, and immersive creation, making source attribution critical: given a 3D asset, can we identify whether and which generative model created it? This problem faces two core challenges: dispersed attribution signals, where 3D fingerprints are distributed across multi-view, geometric, and frequency-domain cues; and realistic deployment constraints, where scarce labels, degraded prompts, and mixed real/synthetic assets undermine attribution reliability. To systematically study this problem, we construct, to the best of our knowledge, the first passive source attribution benchmark for modern generated assets, covering 22 representative 3D generators under standard, few-shot, and realistic deployment protocols. Based on this benchmark, we find that generative 3D models leave two types of stable fingerprints: cross-view inconsistency and structural artifacts reflected in geometric statistics and frequency-domain cues. To capture these dispersed signals, we propose a hierarchical multi-view multi-modal Transformer that fuses appearance, geometric, and frequency-domain features within each view and models global relationships across views. Extensive experiments demonstrate strong performance, achieving 97.22% accuracy under full supervision and 77.17% accuracy with only 1% training data, corresponding to fewer than five samples per generator. These results show that modern 3D generators leave stable and attributable fingerprints, establishing a new benchmark and methodological foundation for trustworthy 3D content provenance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper builds the first benchmark for source attribution across 22 3D generators and shows a hierarchical fusion transformer that reaches 97% full-supervision accuracy and 77% with 1% labels, but the stability of the claimed fingerprints under asset normalization or post-processing is not yet demonstrated.

The main point is that the authors have put together the first benchmark for passive attribution of modern 3D generative models and paired it with a hierarchical multi-view multi-modal Transformer that fuses appearance, geometric, and frequency features. They report 97.22% accuracy when fully supervised and 77.17% when trained on only 1% of the data, which is fewer than five examples per generator. That few-shot number is the most practically interesting result here.
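To make the few-shot setting concrete, the sketch below draws a class-stratified 1 percent subset. The page does not give the benchmark's size, so the 400 samples per generator used here is a hypothetical figure chosen only so that 1 percent lands below five. The function name `stratified_subset` and its parameters are likewise assumptions, not the authors' protocol.

```python
import random
from collections import Counter, defaultdict

def stratified_subset(labels, fraction, seed=0, min_per_class=1):
    """Return sorted indices of a per-class random subset of roughly `fraction` size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        # Keep at least min_per_class samples so no generator vanishes from training.
        k = max(min_per_class, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, min(k, len(indices))))
    return sorted(chosen)

# Hypothetical pool: 22 generators with 400 training assets each.
labels = [generator for generator in range(22) for _ in range(400)]
subset = stratified_subset(labels, fraction=0.01)
per_class = Counter(labels[i] for i in subset)
print(len(subset), set(per_class.values()))
```

Read the other way, "fewer than five samples per generator at 1 percent" implies a training pool of roughly fewer than 500 assets per generator on average, assuming the split is approximately balanced.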

Referee Report

2 major / 2 minor

Summary. The paper introduces the first passive source attribution benchmark for 3D generative models, spanning 22 generators under standard, few-shot, and realistic deployment protocols. It identifies two types of stable fingerprints—cross-view inconsistency and structural artifacts in geometric statistics and frequency-domain cues—and proposes a hierarchical multi-view multi-modal Transformer to fuse appearance, geometric, and frequency features across views. Experiments report 97.22% accuracy under full supervision and 77.17% accuracy with only 1% training data (fewer than five samples per generator), supporting the claim that modern 3D generators leave attributable fingerprints and establishing a foundation for 3D content provenance.

Significance. If the results hold, this work is significant for enabling trustworthy provenance in generative 3D content, an area of growing importance in gaming, robotics, and immersive applications. The construction of a new benchmark covering multiple generators and protocols, combined with strong empirical performance in the low-data regime, provides a concrete methodological foundation and falsifiable baseline for future attribution research.

major comments (2)

[Section 3] Benchmark Construction and Evaluation Protocols (Section 3): The central claim that fingerprints are 'stable' and remain detectable under realistic constraints is load-bearing for both the 77.17% few-shot result and the motivation for multi-modal fusion, yet the paper does not report experiments testing invariance to common post-processing operations such as mesh simplification, format conversion, coordinate normalization, or integration into mixed real/synthetic scenes. The benchmark treats collected assets as fixed representations, leaving open the possibility that reported accuracies partly reflect export-pipeline artifacts rather than intrinsic generator properties.
[Section 4] Hierarchical Fusion Method (Section 4): While the Transformer fuses multi-view and multi-modal cues, there is insufficient ablation or analysis quantifying the individual contribution of cross-view inconsistency versus geometric/frequency artifacts, particularly in the 1%-data regime; this weakens the justification for the hierarchical design over simpler baselines.
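The export-pipeline concern in the first major comment can be illustrated with a toy example. Below, the same point set is "exported" twice with different units and origin (both pipelines are hypothetical). A scale-dependent statistic then separates the two exports perfectly even though the underlying shape is identical, while coordinate normalization removes the difference. The helper names and the choice of statistics are assumptions made for this sketch, not features used in the paper.

```python
import numpy as np

def normalize_vertices(vertices):
    """Centre at the centroid and scale so the farthest vertex lies on the unit sphere."""
    centered = vertices - vertices.mean(axis=0)
    return centered / np.linalg.norm(centered, axis=1).max()

def bbox_diagonal(vertices):
    """Length of the axis-aligned bounding-box diagonal (depends on units)."""
    return float(np.linalg.norm(vertices.max(axis=0) - vertices.min(axis=0)))

def radial_profile(vertices, n_bins=8):
    """Histogram of centroid distances relative to the maximum (unit-free)."""
    r = np.linalg.norm(vertices - vertices.mean(axis=0), axis=1)
    hist, _ = np.histogram(r / r.max(), bins=n_bins, range=(0.0, 1.0))
    return hist / hist.sum()

rng = np.random.default_rng(0)
shape = rng.normal(size=(500, 3))   # stand-in for mesh vertices
export_a = shape                    # hypothetical pipeline A: unit scale, origin-centred
export_b = shape * 100.0 + 5.0      # hypothetical pipeline B: other units, shifted origin

print(bbox_diagonal(export_b) / bbox_diagonal(export_a))
print(np.allclose(normalize_vertices(export_a), normalize_vertices(export_b)))
```

An attribution model that can see raw coordinates could latch onto a cue like the bounding-box size. Comparing accuracy before and after a normalization step of this kind is one way to check how much of the signal survives.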

minor comments (2)

[Abstract] The abstract and introduction could more explicitly situate the new benchmark against prior 2D attribution or forensic work to strengthen the 'first' claim.
[Section 3] Clarify the precise definition and implementation of 'realistic deployment protocols' to make the generalization claims easier to evaluate.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help strengthen the robustness and analysis in our work on source attribution for generative 3D models. We address each major comment below and will incorporate revisions to improve the manuscript.

Referee: [Section 3] Benchmark Construction and Evaluation Protocols (Section 3): The central claim that fingerprints are 'stable' and remain detectable under realistic constraints is load-bearing for both the 77.17% few-shot result and the motivation for multi-modal fusion, yet the paper does not report experiments testing invariance to common post-processing operations such as mesh simplification, format conversion, coordinate normalization, or integration into mixed real/synthetic scenes. The benchmark treats collected assets as fixed representations, leaving open the possibility that reported accuracies partly reflect export-pipeline artifacts rather than intrinsic generator properties.

Authors: We agree that explicit tests for post-processing operations would further validate the stability of the identified fingerprints. Our benchmark already incorporates realistic deployment protocols, including few-shot settings and mixed real/synthetic scenes, but we did not include dedicated experiments on operations such as mesh simplification or format conversion. In the revised manuscript, we will add a new analysis subsection with preliminary results on a subset of generators evaluating attribution performance after mesh simplification and coordinate normalization. We will also explicitly discuss the possibility of export-pipeline artifacts as a limitation. This addresses the concern while preserving the core claim that the fingerprints are attributable to generator properties. Revision: yes.
Referee: [Section 4] Hierarchical Fusion Method (Section 4): While the Transformer fuses multi-view and multi-modal cues, there is insufficient ablation or analysis quantifying the individual contribution of cross-view inconsistency versus geometric/frequency artifacts, particularly in the 1%-data regime; this weakens the justification for the hierarchical design over simpler baselines.

Authors: We acknowledge that more detailed ablations would strengthen the justification for the hierarchical multi-view multi-modal design. The architecture is motivated by the dispersed signals observed in the benchmark (cross-view inconsistency and structural artifacts in geometric and frequency domains), but we did not provide component-wise breakdowns specifically for the 1% data regime. In the revision, we will expand Section 4 with additional ablation studies, including performance when removing the cross-view modeling module and when using only geometric or frequency features, reported under both full supervision and 1% training data. These results will better quantify contributions and compare against simpler baselines. Revision: yes.
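The ablations promised in this response form a small grid, which can be written out explicitly. The sketch below enumerates it under one reading of the response: three modality settings (all three, geometry only, frequency only), cross-view modeling on or off, and two training fractions. The dictionary keys and the name `ablation_grid` are placeholders invented here; the actual revised experiments may be organized differently.

```python
from itertools import product

ALL_MODALITIES = ("appearance", "geometry", "frequency")

def ablation_grid():
    """Enumerate every combination of modality setting, cross-view flag, and data regime."""
    modality_settings = [ALL_MODALITIES, ("geometry",), ("frequency",)]
    cross_view_flags = (True, False)   # False = cross-view module removed
    train_fractions = (1.0, 0.01)      # full supervision and the 1% regime
    return [
        {"modalities": m, "cross_view": c, "train_fraction": f}
        for m, c, f in product(modality_settings, cross_view_flags, train_fractions)
    ]

grid = ablation_grid()
print(len(grid))  # 12
for config in grid[:3]:
    print(config)
```

Running every cell under both data regimes is what lets the cross-view and structural contributions be compared in the few-shot setting, which is the gap the referee identified.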

Circularity Check

0 steps flagged

No circularity: empirical benchmark construction and model evaluation

The paper constructs a new benchmark covering 22 3D generators and reports empirical accuracies (97.22% full supervision, 77.17% with 1% data) for a proposed hierarchical multi-view multi-modal Transformer. No derivation chain, equations, or predictions are presented that reduce by construction to fitted inputs, self-citations, or renamed ansatzes. The fingerprints are observed from the benchmark rather than defined into existence, and the method is a standard fusion architecture trained and evaluated on held-out data. The work is self-contained against its external benchmark with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Review based on abstract only; specific model hyperparameters and exact benchmark construction details are not provided.

free parameters (1)

Transformer fusion and attention hyperparameters
The hierarchical multi-view multi-modal Transformer requires choices for layer counts, attention mechanisms, and feature fusion weights that are typically tuned on the training data.

axioms (1)

Domain assumption: Generative 3D models produce assets containing stable, detectable fingerprints in cross-view, geometric, and frequency domains.
This assumption underpins both the benchmark findings and the design of the attribution model.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · tag: unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "generative 3D models leave two types of stable fingerprints: cross-view inconsistency and structural artifacts reflected in geometric statistics and frequency-domain cues... hierarchical multi-view multi-modal Transformer that fuses appearance, geometric, and frequency-domain features"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "achieving 97.22% accuracy under full supervision and 77.17% accuracy with only 1% training data"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

62 extracted references · 62 canonical work pages · 7 internal anchors

[1]

Constraint is all you need: optimization-based 3d level generation with llms

Kaijie Xu and Clark Verbrugge. Constraint is all you need: optimization-based 3d level generation with llms. InProceedings of the 20th International Conference on the Foundations of Digital Games, pages 1–13, 2025

work page 2025
[2]

Sketch2play: Runtime generation of playable 3d game content from sketches

Tianpei Zang, Keyuan Chen, Haiyan Li, and Guoyu Sun. Sketch2play: Runtime generation of playable 3d game content from sketches. InAdjunct Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology, pages 1–3, 2025

work page 2025
[3]

Procthor: Large-scale embodied ai using procedural generation.Advances in Neural Information Processing Systems, 35:5982–5994, 2022

Matt Deitke, Eli VanderBilt, Alvaro Herrasti, Luca Weihs, Kiana Ehsani, Jordi Salvador, Winson Han, Eric Kolve, Aniruddha Kembhavi, and Roozbeh Mottaghi. Procthor: Large-scale embodied ai using procedural generation.Advances in Neural Information Processing Systems, 35:5982–5994, 2022

work page 2022
[4]

Holodeck: Language guided generation of 3d embodied ai environments

Yue Yang, Fan-Yun Sun, Luca Weihs, Eli VanderBilt, Alvaro Herrasti, Winson Han, Jiajun Wu, Nick Haber, Ranjay Krishna, Lingjie Liu, et al. Holodeck: Language guided generation of 3d embodied ai environments. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16227–16237, 2024

work page 2024
[5]

Syncity: Training-free generation of 3d worlds

Paul Engstler, Aleksandar Shtedritski, Iro Laina, Christian Rupprecht, and Andrea Vedaldi. Syncity: Training-free generation of 3d worlds. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 27585–27595, 2025

work page 2025
[6]

Text2nerf: Text-driven 3d scene generation with neural radiance fields.IEEE Transactions on Visualization and Computer Graphics, 30(12):7749–7762, 2024

Jingbo Zhang, Xiaoyu Li, Ziyu Wan, Can Wang, and Jing Liao. Text2nerf: Text-driven 3d scene generation with neural radiance fields.IEEE Transactions on Visualization and Computer Graphics, 30(12):7749–7762, 2024

work page 2024
[7]

Text2room: Extracting textured 3d meshes from 2d text-to-image models

Lukas Höllein, Ang Cao, Andrew Owens, Justin Johnson, and Matthias Nießner. Text2room: Extracting textured 3d meshes from 2d text-to-image models. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 7909–7920, 2023

work page 2023
[8]

DreamFusion: Text-to-3D using 2D Diffusion

Ben Poole, Ajay Jain, Jonathan T Barron, and Ben Mildenhall. Dreamfusion: Text-to-3d using 2d diffusion.arXiv preprint arXiv:2209.14988, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022
[9]

Magic3d: High-resolution text-to-3d content creation

Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, and Tsung-Yi Lin. Magic3d: High-resolution text-to-3d content creation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 300–309, 2023

work page 2023
[10]

Pro- lificdreamer: High-fidelity and diverse text-to-3d generation with variational score distillation

Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, and Jun Zhu. Pro- lificdreamer: High-fidelity and diverse text-to-3d generation with variational score distillation. Advances in Neural Information Processing Systems, 36, 2024

work page 2024
[11]

Objaverse: A universe of annotated 3d objects

Matt Deitke, Dustin Schwenk, Jordi Salvador, Luca Weihs, Oscar Michel, Eli VanderBilt, Ludwig Schmidt, Kiana Ehsani, Aniruddha Kembhavi, and Ali Farhadi. Objaverse: A universe of annotated 3d objects. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13142–13153, 2023

work page 2023
[12]

Objaverse-xl: A universe of 10m+ 3d objects.Advances in Neural Information Processing Systems, 36, 2024

Matt Deitke, Ruoshi Liu, Matthew Wallingford, Huong Ngo, Oscar Michel, Aditya Kusupati, Alan Fan, Christian Laforte, Vikram V oleti, Samir Yitzhak Gadre, et al. Objaverse-xl: A universe of 10m+ 3d objects.Advances in Neural Information Processing Systems, 36, 2024

work page 2024
[13]

2025.doi:10.48550/arXiv.2505.07747

Weiyu Li, Xuanyang Zhang, Zheng Sun, Di Qi, Hao Li, Wei Cheng, Weiwei Cai, Shihao Wu, Jiarui Liu, Zihao Wang, et al. Step1x-3d: Towards high-fidelity and controllable generation of textured 3d assets.arXiv preprint arXiv:2505.07747, 2025

work page arXiv 2025
[14]

ORCA: Open research content archive.https://developer.nvidia.com/orca

NVIDIA. ORCA: Open research content archive.https://developer.nvidia.com/orca

work page
[15]

Evading data provenance in deep neural networks

Hongyu Zhu, Sichu Liang, Wenwen Wang, Zhuomeng Zhang, Fangqi Li, and Shi-Lin Wang. Evading data provenance in deep neural networks. InProceedings of the IEEE/CVF Interna- tional Conference on Computer Vision, pages 1249–1260, 2025. 10

work page 2025
[16]

Data governance: Organizing data for trustworthy artificial intelligence.Government information quarterly, 37(3):101493, 2020

Marijn Janssen, Paul Brous, Elsa Estevez, Luis S Barbosa, and Tomasz Janowski. Data governance: Organizing data for trustworthy artificial intelligence.Government information quarterly, 37(3):101493, 2020

work page 2020
[17]

Mapping the evolution of data governance scientific research.Data & Policy, 7:e51, 2025

Hossein Hassani, Xu Huang, and Steve MacFeely. Mapping the evolution of data governance scientific research.Data & Policy, 7:e51, 2025

work page 2025
[18]

SAGA: Source Attribution of Generative AI Videos

Rohit Kundu, Vishal Mohanty, Hao Xiong, Shan Jia, Athula Balachandran, and Amit K Roy- Chowdhury. Saga: Source attribution of generative ai videos.arXiv preprint arXiv:2511.12834, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[19]

Where did i come from? origin attribution of ai-generated images.Advances in neural information processing systems, 36:74478–74500, 2023

Zhenting Wang, Chen Chen, Yi Zeng, Lingjuan Lyu, and Shiqing Ma. Where did i come from? origin attribution of ai-generated images.Advances in neural information processing systems, 36:74478–74500, 2023

work page 2023
[20]

Deepfake network architecture attribution

Tianyun Yang, Ziyao Huang, Juan Cao, Lei Li, and Xirong Li. Deepfake network architecture attribution. InProceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 4662–4670, 2022

work page 2022
[21]

3dgen-bench: comprehensive benchmark suite for 3d generative models.arXiv preprint arXiv:2503.21745, 2025

Yuhan Zhang, Mengchen Zhang, Tong Wu, Tengfei Wang, Gordon Wetzstein, Dahua Lin, and Ziwei Liu. 3dgen-bench: comprehensive benchmark suite for 3d generative models.arXiv preprint arXiv:2503.21745, 2025

work page arXiv 2025
[22]

Scalable 3d captioning with pretrained models.Advances in Neural Information Processing Systems, 36:75307–75337, 2023

Tiange Luo, Chris Rockwell, Honglak Lee, and Justin Johnson. Scalable 3d captioning with pretrained models.Advances in Neural Information Processing Systems, 36:75307–75337, 2023

work page 2023
[23]

Point-E: A System for Generating 3D Point Clouds from Complex Prompts

Alex Nichol, Heewoo Jun, Prafulla Dhariwal, Pamela Mishkin, and Mark Chen. Point-e: A system for generating 3d point clouds from complex prompts.arXiv preprint arXiv:2212.08751, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022
[24]

Shap-E: Generating Conditional 3D Implicit Functions

Heewoo Jun and Alex Nichol. Shap-e: Generating conditional 3d implicit functions.arXiv preprint arXiv:2305.02463, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[25]

Sheng-Yu Wang, Oliver Wang, Richard Zhang, Andrew Owens, and Alexei A. Efros. CNN- generated images are surprisingly easy to spot. . . for now. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 8695–8704, 2020

work page 2020
[26]

FaceForensics++: Learning to detect manipulated facial images

Andreas Rössler, Davide Cozzolino, Luisa Verdoliva, Christian Riess, Justus Thies, and Matthias Nießner. FaceForensics++: Learning to detect manipulated facial images. InProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 1–11, 2019

work page 2019
[27]

On the detection of synthetic images generated by diffusion models

Riccardo Corvi, Davide Cozzolino, Giada Zingarini, Giovanni Poggi, Koki Nagano, and Luisa Verdoliva. On the detection of synthetic images generated by diffusion models. InIEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5, 2023

work page 2023
[28]

Poisoned forgery face: Towards backdoor attacks on face forgery detection

Jiawei Liang, Siyuan Liang, Aishan Liu, Xiaojun Jia, Junhao Kuang, and Xiaochun Cao. Poisoned forgery face: Towards backdoor attacks on face forgery detection.arXiv preprint arXiv:2402.11473, 2024

work page arXiv 2024
[29]

Do GANs leave artificial fingerprints? In2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), pages 506–511, 2019

Francesco Marra, Diego Gragnaniello, Luisa Verdoliva, and Giovanni Poggi. Do GANs leave artificial fingerprints? In2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), pages 506–511, 2019

work page 2019
[30]

Attributing fake images to gans: Learning and analyzing gan fingerprints

Ning Yu, Larry S Davis, and Mario Fritz. Attributing fake images to gans: Learning and analyzing gan fingerprints. InProceedings of the IEEE/CVF international conference on computer vision, pages 7556–7566, 2019

work page 2019
[31]

Artificial fingerprinting for generative models: Rooting deepfake attribution in training data

Ning Yu, Vladislav Skripniuk, Sahar Abdelnabi, and Mario Fritz. Artificial fingerprinting for generative models: Rooting deepfake attribution in training data. InProceedings of the IEEE/CVF International conference on computer vision, pages 14448–14457, 2021. 11

work page 2021
[32]

Towards discovery and attribution of open-world gan generated images

Sharath Girish, Saksham Suri, Sai Saketh Rambhatla, and Abhinav Shrivastava. Towards discovery and attribution of open-world gan generated images. InProceedings of the IEEE/CVF international conference on computer vision, pages 14094–14103, 2021

work page 2021
[33]

Responsible disclosure of generative models using scalable fingerprinting

Ning Yu, Vladislav Skripniuk, Dingfan Chen, Larry Davis, and Mario Fritz. Responsible disclosure of generative models using scalable fingerprinting. InInternational Conference on Learning Representations (ICLR), 2022

work page 2022
[34]

Source generator attribution via inversion

Michael Albright and Scott McCloskey. Source generator attribution via inversion. InProceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Media Forensics, pages 96–103, 2019

work page 2019
[35]

Progres- sive open space expansion for open-set model attribution

Tianyun Yang, Danding Wang, Fan Tang, Xinying Zhao, Juan Cao, and Sheng Tang. Progres- sive open space expansion for open-set model attribution. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 15856–15865, 2023

work page 2023
[36]

ManiFPT: Defining and analyzing fingerprints of generative models

Hae Jin Song, Mahyar Khayatkhoei, and Wael AbdAlmageed. ManiFPT: Defining and analyzing fingerprints of generative models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

work page 2024
[37]

Fakepcd: Fake point cloud detection via source attribution

Yiting Qu, Zhikun Zhang, Yun Shen, Michael Backes, and Yang Zhang. Fakepcd: Fake point cloud detection via source attribution. InProceedings of the 19th ACM Asia Conference on Computer and Communications Security, pages 930–946, 2024

work page 2024
[38]

Universal watermark vaccine: Universal adversarial perturbations for watermark protection

Jianbo Chen, Xinwei Liu, Siyuan Liang, Xiaojun Jia, and Yuan Xun. Universal watermark vaccine: Universal adversarial perturbations for watermark protection. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023

work page 2023
[39]

arXiv preprint arXiv:2412.01528 , year=

Zhixiang Guo, Siyuan Liang, Aishan Liu, and Dacheng Tao. Copyrightshield: Spatial similarity guided backdoor defense against copyright infringement in diffusion models.arXiv preprint arXiv:2412.01528, 2024

work page arXiv 2024
[40]

Me: Trigger element combination backdoor attack on copyright infringement.arXiv preprint arXiv:2506.10776, 2025

Feiyu Yang, Siyuan Liang, Aishan Liu, and Dacheng Tao. Me: Trigger element combination backdoor attack on copyright infringement.arXiv preprint arXiv:2506.10776, 2025

work page arXiv 2025
[41]

Imitated detectors: Stealing knowledge of black-box object detectors

Siyuan Liang, Aishan Liu, Jiawei Liang, Longkang Li, Yang Bai, and Xiaochun Cao. Imitated detectors: Stealing knowledge of black-box object detectors. InProceedings of the 30th ACM International Conference on Multimedia, pages 4839–4847, 2022

work page 2022
[42]

A large- scale audit of dataset licensing and attribution in ai.Nature Machine Intelligence, 6(8):975–987, 2024

Shayne Longpre, Robert Mahari, Anthony Chen, Naana Obeng-Marnu, Damien Sileo, William Brannon, Niklas Muennighoff, Nathan Khazam, Jad Kabbara, Kartik Perisetla, et al. A large- scale audit of dataset licensing and attribution in ai.Nature Machine Intelligence, 6(8):975–987, 2024

work page 2024
[43]

AI models collapse when trained on recursively generated data.Nature, 631:755–759, 2024

Ilia Shumailov, Zakhar Shumaylov, Yiren Zhao, Nicolas Papernot, Ross Anderson, and Yarin Gal. AI models collapse when trained on recursively generated data.Nature, 631:755–759, 2024

work page 2024
[44]

Intrigu- ing properties of synthetic images: from generative adversarial networks to diffusion models

Riccardo Corvi, Davide Cozzolino, Giovanni Poggi, Koki Nagano, and Luisa Verdoliva. Intrigu- ing properties of synthetic images: from generative adversarial networks to diffusion models. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 973–982, 2023

work page 2023
[45]

Leveraging frequency analysis for deep fake image recognition

Joel Frank, Thorsten Eisenhofer, Lea Schönherr, Asja Fischer, Dorothea Kolossa, and Thorsten Holz. Leveraging frequency analysis for deep fake image recognition. InInternational confer- ence on machine learning, pages 3247–3258. PMLR, 2020

work page 2020
[46]

Watch your up-convolution: Cnn based generative deep neural networks are failing to reproduce spectral distributions

Ricard Durall, Margret Keuper, and Janis Keuper. Watch your up-convolution: Cnn based generative deep neural networks are failing to reproduce spectral distributions. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 7890–7899, 2020. 12

work page 2020
[47]

Vishal Asnani, Xi Yin, Tal Hassner, and Xiaoming Liu. Reverse engineering of generative models: Inferring model hyperparameters from generated images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(12):15477–15493, 2023

[48]

Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017

[49]

Yixun Liang, Xin Yang, Jiantao Lin, Haodong Li, Xiaogang Xu, and Yingcong Chen. Luciddreamer: Towards high-fidelity text-to-3d generation via interval score matching. arXiv preprint arXiv:2311.11284, 2023

[50]

Yinghao Xu, Zifan Shi, Wang Yifan, Sida Peng, Ceyuan Yang, Yujun Shen, and Gordon Wetzstein. Grm: Large gaussian reconstruction model for efficient 3d reconstruction and generation. arXiv preprint arXiv:2403.14621, 2024

[51]

Yichun Shi, Peng Wang, Jianglong Ye, Mai Long, Kejie Li, and Xiao Yang. Mvdream: Multi-view diffusion for 3d generation. arXiv preprint arXiv:2308.16512, 2023

[52]

Zi-Xin Zou, Zhipeng Yu, Yuan-Chen Guo, Yangguang Li, Ding Liang, Yan-Pei Cao, and Song-Hai Zhang. Triplane meets gaussian splatting: Fast and generalizable single-view 3d reconstruction with transformers. arXiv preprint arXiv:2312.09147, 2023

[53]

Zexin He and Tengfei Wang. Openlrm: Open-source large reconstruction models. https://github.com/3DTopia/OpenLRM, 2023

[54]

Guocheng Qian, Jinjie Mai, Abdullah Hamdi, Jian Ren, Aliaksandr Siarohin, Bing Li, Hsin-Ying Lee, Ivan Skorokhodov, Peter Wonka, Sergey Tulyakov, and Bernard Ghanem. Magic123: One image to high-quality 3d object generation using both 2d and 3d diffusion priors. In The Twelfth International Conference on Learning Representations (ICLR), 2024

[55]

Ruoshi Liu, Rundi Wu, Basile Van Hoorick, Pavel Tokmakov, Sergey Zakharov, and Carl Vondrick. Zero-1-to-3: Zero-shot one image to 3d object, 2023

[56]

Xiaoxiao Long, Yuan-Chen Guo, Cheng Lin, Yuan Liu, Zhiyang Dou, Lingjie Liu, Yuexin Ma, Song-Hai Zhang, Marc Habermann, Christian Theobalt, et al. Wonder3d: Single image to 3d using cross-domain diffusion. arXiv preprint arXiv:2310.15008, 2023

[57]

Chuanxia Zheng and Andrea Vedaldi. Free3d: Consistent novel view synthesis without 3d representation. arXiv, 2023

[58]

Xin Kong, Shikun Liu, Xiaoyang Lyu, Marwan Taher, Xiaojuan Qi, and Andrew J Davison. Eschernet: A generative model for scalable view synthesis. arXiv preprint arXiv:2402.03908, 2024

[59]

Yuan Liu, Cheng Lin, Zijiao Zeng, Xiaoxiao Long, Lingjie Liu, Taku Komura, and Wenping Wang. Syncdreamer: Generating multiview-consistent images from a single-view image. arXiv preprint arXiv:2309.03453, 2023

[60]

Jiaxiang Tang, Zhaoxi Chen, Xiaokang Chen, Tengfei Wang, Gang Zeng, and Ziwei Liu. Lgm: Large multi-view gaussian model for high-resolution 3d content creation. arXiv preprint arXiv:2402.05054, 2024

[61]

Gal Metzer, Elad Richardson, Or Patashnik, Raja Giryes, and Daniel Cohen-Or. Latent-nerf for shape-guided generation of 3d shapes and textures. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12663–12673, 2023

[62]

Haochen Wang, Xiaodan Du, Jiahao Li, Raymond A Yeh, and Greg Shakhnarovich. Score jacobian chaining: Lifting pretrained 2d diffusion models for 3d generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12619–12629, 2023

Figure 6: Overview of our attribution benchmark. Table 6: Overview of 3D Generation M...


