DiffusionPrint: Learning Generative Fingerprints for Diffusion-Based Inpainting Localization
Pith reviewed 2026-05-10 15:41 UTC · model grok-4.3
The pith
A patch-level contrastive learner extracts consistent generative fingerprints from diffusion-inpainted regions to aid forgery localization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DiffusionPrint trains a convolutional backbone with a MoCo-style contrastive objective and cross-category hard negative mining plus a generator-aware classification head; the resulting forensic feature map acts as a highly discriminative secondary modality that, when fused into existing IFL pipelines, improves localization of diffusion-based inpainting across multiple generators, with gains of up to 28 percent on mask types held out from fine-tuning and confirmed generalization to unseen generative architectures.
What carries the argument
Patch-level contrastive learning that treats inpainted regions from the same diffusion model as positives and uses cross-category hard negatives to produce a forensic feature map robust to latent-decoder spectral distortions.
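The mechanism above reduces, in its simplest form, to an InfoNCE step over patch embeddings with mined hard negatives. The sketch below is illustrative only: function names, shapes, and the plain NumPy form are assumptions, not the authors' implementation.

```python
import numpy as np

def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss for one patch embedding.

    query, positive: (d,) L2-normalized embeddings of two patches
    inpainted by the same diffusion model (the paper's self-supervisory
    positive pair); negatives: (k, d) embeddings from other models'
    patches, including mined hard negatives.
    """
    logits = np.concatenate([[query @ positive], negatives @ query]) / temperature
    logits -= logits.max()                      # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                    # positive sits at index 0

def mine_hard_negatives(query, queue, k):
    """Cross-category hard negative mining sketch: from a queue of
    embeddings labeled with a *different* generator, keep the k most
    similar to the query (hardest to distinguish)."""
    sims = queue @ query
    return queue[np.argsort(-sims)[:k]]
```

In a MoCo-style setup the queue would hold momentum-encoder embeddings; here it is just an array, which suffices to show where the cross-category constraint enters.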
If this is right
- The learned feature map can be fused into TruFor, MMFusion, or lightweight baselines to raise localization performance on diffusion inpainting.
- Gains reach up to 28 percent on mask types that were never shown during fine-tuning.
- The same feature map generalizes to inpainting pipelines from generative architectures not encountered in training.
- The method supplies a secondary forensic modality that works even when camera-level noise patterns have been erased by latent decoding.
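The first bullet's fusion idea, at its most basic, treats the forensic map as extra input channels alongside the RGB features. A minimal sketch, with shapes and the simple concatenation assumed for illustration; TruFor and MMFusion use their own cross-modal fusion modules:

```python
import numpy as np

def fuse_modalities(rgb_feat, forensic_map):
    """Late-fusion sketch: concatenate a DiffusionPrint-style feature
    map with backbone RGB features ahead of a localization head.

    rgb_feat:     (C1, H, W) backbone features
    forensic_map: (C2, H, W) forensic feature map
    returns       (C1 + C2, H, W) fused tensor
    """
    assert rgb_feat.shape[1:] == forensic_map.shape[1:], "spatial sizes must match"
    return np.concatenate([rgb_feat, forensic_map], axis=0)
```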
Where Pith is reading between the lines
- The same fingerprints could support model attribution tasks that identify which specific diffusion architecture created a given inpainted region.
- The contrastive training recipe might transfer to other latent-decoder pipelines such as those used in text-to-image generation or video synthesis.
- A lightweight version of the backbone could be deployed on-device for real-time screening of social-media uploads.
- Combining the new generative-fingerprint channel with residual noise or frequency-domain cues could produce still stronger hybrid detectors.
Load-bearing premise
Inpainted regions produced by one diffusion model share a consistent generative fingerprint that survives latent decoding and can be recovered by patch-level contrastive learning.
What would settle it
A controlled test in which adding the learned DiffusionPrint feature map to TruFor or MMFusion yields no measurable gain in localization IoU or F1 on a held-out set of diffusion models and mask shapes would falsify the central claim.
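The falsification test hinges on localization IoU and F1. For concreteness, the standard definitions on binary masks (a generic sketch, not the paper's evaluation code):

```python
import numpy as np

def iou_f1(pred_mask, gt_mask):
    """Localization IoU and F1 for binary forgery masks.
    The with/without-DiffusionPrint comparison described above would
    run this over held-out generators and mask shapes."""
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    inter = (pred & gt).sum()
    union = (pred | gt).sum()
    iou = inter / union if union else 1.0
    f1 = 2 * inter / (pred.sum() + gt.sum()) if (pred.sum() + gt.sum()) else 1.0
    return iou, f1
```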
Original abstract
Modern diffusion-based inpainting models pose significant challenges for image forgery localization (IFL), as their full regeneration pipelines reconstruct the entire image via a latent decoder, disrupting the camera-level noise patterns that existing forensic methods rely on. We propose DiffusionPrint, a patch-level contrastive learning framework that learns a forensic signal robust to the spectral distortions introduced by latent decoding. It exploits the fact that inpainted regions generated by the same model share a consistent generative fingerprint, using this as a self-supervisory signal. DiffusionPrint trains a convolutional backbone via a MoCo-style objective with cross-category hard negative mining and a generator-aware classification head, producing a forensic feature map that serves as a highly discriminative secondary modality in fusion-based IFL frameworks. Integrated into TruFor, MMFusion, and a lightweight fusion baseline, DiffusionPrint consistently improves localization across multiple generative models, with gains of up to +28% on mask types unseen during fine-tuning and confirmed generalization to unseen generative architectures. Code is available at https://github.com/mever-team/diffusionprint
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces DiffusionPrint, a patch-level contrastive learning framework (MoCo-style objective with cross-category hard negative mining and a generator-aware classification head) that extracts a forensic feature map from diffusion-based inpainted regions by exploiting consistent generative fingerprints across patches from the same model. This map is integrated as a secondary modality into fusion-based image forgery localization pipelines (TruFor, MMFusion, and a lightweight baseline), yielding reported localization improvements across multiple generative models, including gains of up to +28% on mask types unseen during fine-tuning and generalization to unseen diffusion architectures. Code is released at https://github.com/mever-team/diffusionprint.
Significance. If the empirical results hold under detailed validation, DiffusionPrint supplies a practical, self-supervised forensic signal that addresses the disruption of camera noise patterns by latent decoding in modern diffusion inpainting. The approach of learning generator-specific fingerprints via contrastive loss on the generative process itself, combined with public code for reproducibility, represents a constructive contribution to IFL in the diffusion era. The claimed robustness to unseen masks and architectures, if substantiated, would be a notable strength.
Major comments (2)
- Abstract: the reported quantitative gains (including +28% on unseen masks) and generalization claims lack accompanying details on dataset sizes, number of generative models, statistical tests, ablation studies on the contrastive components (temperature, queue size, augmentation), or exact baseline implementations; without these, it is difficult to assess whether the improvements are robust or sensitive to post-hoc choices.
- §4 (Experimental results, assumed from abstract claims): the central hypothesis that inpainted regions from the same diffusion model share a consistent fingerprint robust to latent-decoding distortions is load-bearing for the method; the manuscript should include a targeted analysis (e.g., feature clustering by generator or t-SNE visualizations) demonstrating that the learned embeddings separate by model rather than by image content or mask geometry.
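A cheap numeric stand-in for the clustering analysis the second comment requests: mean intra-generator minus inter-generator cosine similarity of the learned embeddings. A t-SNE plot would only visualize this; the gap below is our own construction, not from the paper.

```python
import numpy as np

def generator_separability(embeddings, labels):
    """Mean cosine similarity within a generator's embeddings minus
    the mean across generators. A clearly positive gap suggests the
    feature space separates by model rather than by image content."""
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = emb @ emb.T
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    intra = sims[same & off_diag].mean()
    inter = sims[~same].mean()
    return intra - inter
```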
Minor comments (3)
- Method section: the integration of the generator-aware classification head with the MoCo backbone would be clearer with a diagram or explicit pseudocode showing how the head is used only at training time.
- Figure captions and tables: several result tables would benefit from explicit reporting of standard deviations across multiple runs or seeds to support the generalization claims.
- Related work: the discussion of prior contrastive forensic methods could include a brief comparison table highlighting differences in negative mining strategy.
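The first minor comment asks for pseudocode showing that the generator-aware head is used only at training time. A plausible sketch, with sizes and the plain linear form assumed rather than taken from the paper:

```python
import numpy as np

class GeneratorAwareHead:
    """Training-time-only classification head: a linear layer that
    predicts which generator produced a patch, shaping the backbone's
    features via an auxiliary loss. It is discarded at inference,
    where only the backbone's forensic feature map is exported."""

    def __init__(self, dim, n_generators, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.01, size=(n_generators, dim))

    def logits(self, feat):
        return feat @ self.W.T

def forward(backbone_feat, head, training):
    # Training: return features plus generator logits for the aux loss.
    if training:
        return backbone_feat, head.logits(backbone_feat)
    # Inference: the head is dropped; only the feature map flows on.
    return backbone_feat, None
```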
Simulated Author's Rebuttal
We thank the referee for the thorough review and positive recommendation for minor revision. We address each major comment below and outline the changes we will make to the manuscript.
Point-by-point responses
Referee: Abstract: the reported quantitative gains (including +28% on unseen masks) and generalization claims lack accompanying details on dataset sizes, number of generative models, statistical tests, ablation studies on the contrastive components (temperature, queue size, augmentation), or exact baseline implementations; without these, it is difficult to assess whether the improvements are robust or sensitive to post-hoc choices.
Authors: We agree that the abstract would benefit from more context on the experimental scale. In the revised version, we will modify the abstract to mention the number of diffusion models and the overall dataset size used for training and evaluation. Additionally, we will expand Section 4 to include ablation studies on the contrastive learning hyperparameters (temperature, queue size, augmentations), statistical analyses of the improvements, and clearer descriptions of the baseline implementations. This will help demonstrate the robustness of the results. revision: yes
Referee: §4 (Experimental results, assumed from abstract claims): the central hypothesis that inpainted regions from the same diffusion model share a consistent fingerprint robust to latent-decoding distortions is load-bearing for the method; the manuscript should include a targeted analysis (e.g., feature clustering by generator or t-SNE visualizations) demonstrating that the learned embeddings separate by model rather than by image content or mask geometry.
Authors: We concur that a direct demonstration of the hypothesis would strengthen the paper. Although the generalization performance to unseen models and mask types provides supporting evidence, we will incorporate t-SNE visualizations of the feature embeddings in the revised Section 4. These plots will be colored according to the generating model to illustrate clustering by model rather than by content or mask geometry. We will also discuss how this supports the robustness to latent-decoding distortions. revision: yes
Circularity Check
No significant circularity in derivation chain
Full rationale
The paper's central contribution is a patch-level contrastive learning setup (MoCo-style with hard negatives and generator-aware head) that treats same-model inpainted patches as positive pairs to learn a forensic feature map. This is a standard self-supervised objective applied to the hypothesis that diffusion inpainting leaves model-specific traces robust to latent decoding. No equations reduce the output forensic map to an input parameter by construction, no fitted quantity is relabeled as a prediction, and no load-bearing step depends on a self-citation chain or imported uniqueness theorem. Reported gains are empirical improvements when the learned map is fused into external baselines (TruFor, MMFusion) on held-out masks and architectures. The derivation remains self-contained against external benchmarks and code release.
Axiom & Free-Parameter Ledger
Free parameters (1)
- MoCo training hyperparameters (temperature, queue size, augmentation strength)
Axioms (1)
- Domain assumption: inpainted regions from the same generative model share a consistent forensic fingerprint robust to latent-decoding distortions
Reference graph
Works this paper leans on
- [1] Adobe. Bringing generative AI into Creative Cloud with Adobe Firefly. https://blog.adobe.com/en/publish/2023/03/21/bringing-gen-ai-to-creative-cloud-adobe-firefly, 2023. Accessed: 2025-09-26.
- [2] Omri Avrahami, Ohad Fried, and Dani Lischinski. Blended latent diffusion. ACM Trans. Graph., 42(4), 2023.
- [3] Giulia Bertazzini, Daniele Baracchi, Dasara Shullani, Isao Echizen, and Alessandro Piva. DRAGON: A large-scale dataset of realistic images generated by diffusion models, 2025.
- [4] Tiziano Bianchi, Alessia De Rosa, and Alessandro Piva. Improved DCT coefficient analysis for forgery localization in JPEG images. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2444–2447, 2011.
- [5] Black Forest Labs. Flux. https://github.com/black-forest-labs/flux, 2024. Accessed: 2025-09-19.
- [6] Hessen Bougueffa, Mamadou Keita, Wassim Hamidouche, Abdelmalik Taleb-Ahmed, Helena Liz-López, Alejandro Martín, David Camacho, and Abdenour Hadid. Advances in AI-generated images and videos. International Journal of Interactive Multimedia and Artificial Intelligence, 9(1):173–208, 2024.
- [7] Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, and Armand Joulin. Emerging properties in self-supervised vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 9650–9660, 2021.
- [8] I-Cheng Chang, J. Cloud Yu, and Chih-Chuan Chang. A forgery detection algorithm for exemplar-based inpainting images using multi-region relation. Image and Vision Computing, 31(1):57–71, 2013.
- [9] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In ICML, pages 1597–1607. PMLR, 2020.
- [10] Giovanni Chierchia, Sara Parrilli, Giovanni Poggi, Luisa Verdoliva, and Carlo Sansone. PRNU-based detection of small-size image forgeries. In International Conference on Digital Signal Processing (DSP), pages 1–6. IEEE, 2011.
- [11] Riccardo Corvi, Davide Cozzolino, Giovanni Poggi, Koki Nagano, and Luisa Verdoliva. Intriguing properties of synthetic images: From generative adversarial networks to diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pages 973–982, 2023.
- [12] Davide Cozzolino and Luisa Verdoliva. Noiseprint: A CNN-based camera model fingerprint. IEEE Transactions on Information Forensics and Security, 15:144–159, 2020.
- [13] Davide Cozzolino, Giovanni Poggi, and Luisa Verdoliva. Splicebuster: A new blind image splicing detector. In IEEE International Workshop on Information Forensics and Security (WIFS), pages 1–6. IEEE, 2015.
- [14] Duc-Tien Dang-Nguyen, Cecilia Pasquini, Valentina Conotter, and Giulia Boato. RAISE: A raw images dataset for digital image forensics. In Proceedings of the 6th ACM Multimedia Systems Conference, pages 219–224, 2015.
- [15] Chengbo Dong, Xinru Chen, Ruohan Hu, Juan Cao, and Xirong Li. MVSS-Net: Multi-view multi-scale supervised networks for image manipulation detection. IEEE Trans. Pattern Analysis and Machine Intel., 45(3):3539–3553, 2022.
- [16] Hany Farid. Exposing digital forgeries from JPEG ghosts. IEEE Transactions on Information Forensics and Security, 4(1):154–160, 2009.
- [17] Pasquale Ferrara, Tiziano Bianchi, Alessia De Rosa, and Alessandro Piva. Image forgery localization via fine-grained analysis of CFA artifacts. IEEE Transactions on Information Forensics and Security, 7(5):1566–1577, 2012.
- [18] Nikolaos Giakoumoglou and Tania Stathaki. Relational Representation Distillation, 2024.
- [19] Nikolaos Giakoumoglou and Tania Stathaki. SynCo: Synthetic Hard Negatives for Contrastive Visual Representation Learning, 2025.
- [20] Nikolaos Giakoumoglou, Tania Stathaki, and Athanasios Gkelias. A Review on Discriminative Self-supervised Learning Methods in Computer Vision, 2025.
- [21] Paschalis Giakoumoglou, Dimitrios Karageorgiou, Symeon Papadopoulos, and Panagiotis C Petrantonakis. SAGI: Semantically aligned and uncertainty guided AI image inpainting. In Proc. IEEE/CVF Int. Conf. Computer Vision (ICCV).
- [22] Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Remi Munos, and Michal Valko. Bootstrap your own latent - a new approach to self-supervised learning. In Advances in Neural Information Processing Systems, 2020.
- [23] Fabrizio Guillaro, Davide Cozzolino, Avneesh Sud, Nicholas Dufour, and Luisa Verdoliva. TruFor: Leveraging all-round clues for trustworthy image forgery detection and localization. In Proc. IEEE/CVF Conf. Computer Vision Pattern Recogn. (CVPR), pages 20606–20615, 2023.
- [24] Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
- [25] Chryssanthi Iakovidou, Markos Zampoglou, Symeon Papadopoulos, and Yiannis Kompatsiaris. Content-aware detection of JPEG grid inconsistencies for intuitive image forensics. Journal of Visual Communication and Image Representation, 54:155–170, 2018.
- [26] Shan Jia, Mingzhen Huang, Zhou Zhou, Yan Ju, Jialing Cai, and Siwei Lyu. AutoSplice: A text-prompt manipulated image dataset for media forensics. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 893–903, 2023.
- [27] Xuan Ju, Xian Liu, Xintao Wang, Yuxuan Bian, Ying Shan, and Qiang Xu. BrushNet: A plug-and-play image inpainting model with decomposed dual-branch diffusion. In European Conference on Computer Vision, pages 150–168. Springer, 2024.
- [28] Yannis Kalantidis, Mert Bulent Sariyildiz, Noe Pion, Philippe Weinzaepfel, and Diane Larlus. Hard Negative Mixing for Contrastive Learning. In NeurIPS, 2020.
- [29] Dimitrios Karageorgiou, Giorgos Kordopatis-Zilos, and Symeon Papadopoulos. Fusion transformer with object mask guidance for image forgery analysis. In Proc. IEEE/CVF Conf. Computer Vision Pattern Recogn., pages 4345–4355, 2024.
- [30] Haodong Li, Weiqi Luo, and Jiwu Huang. Localization of diffusion-based inpainting in digital images. IEEE Trans. Inf. Forensics Security, 12(12):3050–3064, 2017.
- [31] Zaoshan Liang, Gaobo Yang, Xiangling Ding, and Leida Li. An efficient forgery detection algorithm for object removal by exemplar-based image inpainting. Journal of Visual Communication and Image Representation, 30:75–85, 2015.
- [32] Xiaohong Liu, Yaojie Liu, Jun Chen, and Xiaoming Liu. PSCC-Net: Progressive spatio-channel correlation network for image manipulation detection and localization. IEEE Trans. Circuits Systems Video Technol., 32(11):7505–7517, 2022.
- [33] Yaqi Liu, Binbin Lv, Xin Jin, Xiaoyu Chen, and Xiaokun Zhang. TBFormer: Two-branch transformer for image forgery localization. IEEE Signal Processing Letters, 2023.
- [34] Yaqi Liu, Shuhuan Chen, Haichao Shi, Xiao-Yu Zhang, Song Xiao, and Qiang Cai. MUN: Image forgery localization based on M3 encoder and UN decoder. Proceedings of the AAAI Conference on Artificial Intelligence, 39(6):5685–5693, 2025.
- [35] Jan Lukas, Jessica Fridrich, and Miroslav Goljan. Digital camera identification from sensor pattern noise. IEEE Transactions on Information Forensics and Security, 1(2):205–214, 2006.
- [36] Babak Mahdian and Stanislav Saic. Using noise inconsistencies for blind image forensics. Image and Vision Computing, 27(10):1497–1503, 2009.
- [37] Ara Manukyan. HD-Painter: High-resolution prompt-faithful text-guided image inpainting, 2024.
- [38] Hannes Mareen, Dimitrios Karageorgiou, Glenn Van Wallendael, Peter Lambert, and Symeon Papadopoulos. TGIF: Text-guided inpainting forgery dataset. In 2024 IEEE International Workshop on Information Forensics and Security (WIFS), 2024.
- [39] Hannes Mareen, Dimitrios Karageorgiou, Paschalis Giakoumoglou, Peter Lambert, Symeon Papadopoulos, and Glenn Van Wallendael. TGIF2: Extended text-guided inpainting forgery dataset and benchmark, 2026.
- [40] Elif Nebioglu, Emirhan Bilgiç, and Adrian Popescu. AI-generated image detectors overrely on global artifacts: Evidence from inpainting exchange, 2026.
- [41] Alex Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, and Mark Chen. GLIDE: Towards photorealistic image generation and editing with text-guided diffusion models. In Proceedings of the 39th International Conference on Machine Learning, pages 16784–16804. PMLR, 2022.
- [42] Tina Nikoukhah, Jérémy Anger, Miguel Colom, Jean-Michel Morel, and Rafael Grompone von Gioi. ZERO: A local JPEG grid origin detector based on the number of DCT zeros and its applications in image forensics. Image Processing On Line, 11:396–433, 2021.
- [43] Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, and Robin Rombach. SDXL: Improving latent diffusion models for high-resolution image synthesis. arXiv preprint arXiv:2307.01952, 2023.
- [44] N Hema Rajini. Image forgery identification using convolution neural network. International Journal of Recent Technology and Engineering, 8(1):311–320, 2019.
- [45] Joshua Robinson, Ching-Yao Chuang, Suvrit Sra, and Stefanie Jegelka. Contrastive learning with hard negative samples. arXiv preprint arXiv:2010.04592, 2020.
- [46] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Bjorn Ommer. High-resolution image synthesis with latent diffusion models. In Proc. IEEE/CVF Conf. Computer Vision Pattern Recogn., 2022.
- [47] Zhihao Sun, Haipeng Fang, Juan Cao, Xinying Zhao, and Danding Wang. Rethinking image editing detection in the era of generative AI revolution. In Proceedings of the 32nd ACM International Conference on Multimedia, pages 3538–3547, 2024.
- [48] Konstantinos Triaridis and Vasileios Mezaris. Exploring multi-modal fusion for image manipulation detection and localization. In Int. Conf. Multimedia Model., pages 198–211. Springer, 2024.
- [49] Luisa Verdoliva. Media forensics and deepfakes: an overview. IEEE J. Selected Topics Signal Process., 14(5):910–932, 2020.
- [50] Su Wang, Chitwan Saharia, Ceslee Montgomery, Jordi Pont-Tuset, Shai Noy, Stefano Pellegrini, Yasumasa Onoe, Sarah Laszlo, David J. Fleet, Radu Soricut, Jason Baldridge, Mohammad Norouzi, Peter Anderson, and William Chan. Imagen Editor and EditBench: Advancing and evaluating text-guided image inpainting. In Proc. IEEE/CVF Conf. Computer Vision Pattern Recogn., 2023.
- [51] Yabin Wang, Zhiwu Huang, and Xiaopeng Hong. OpenSDI: Spotting diffusion-generated images in the open world. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 4291–4301, 2025.
- [52] Zhendong Wang, Jianmin Bao, Wengang Zhou, Weilun Wang, Hezhen Hu, Hong Chen, and Houqiang Li. DIRE for diffusion-generated image detection. In Proc. IEEE/CVF Int. Conf. Computer Vision (ICCV), pages 22445–22455, 2023.
- [53] Haiwei Wu and Jiantao Zhou. IID-Net: Image inpainting detection network via neural architecture search and attention. IEEE Transactions on Circuits and Systems for Video Technology, 32(3):1172–1185, 2022.
- [54] Qiong Wu, Shao-Jie Sun, Wei Zhu, Guo-Hui Li, and Dan Tu. Detection of digital doctoring in exemplar-based inpainted images. In 2008 International Conference on Machine Learning and Cybernetics, pages 1222–1226, 2008.
- [55] Yue Wu, Wael AbdAlmageed, and Premkumar Natarajan. ManTra-Net: Manipulation tracing network for detection and localization of image forgeries with anomalous features. In Proc. IEEE/CVF Conf. Computer Vision Pattern Recogn., pages 9543–9552, 2019.
- [56] Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, and Ping Luo. SegFormer: Simple and efficient design for semantic segmentation with transformers. In Advances in Neural Information Processing Systems, pages 12077–12090. Curran Associates, Inc., 2021.
- [57] Tianze Yang, Tyson Jordan, Ninghao Liu, and Jin Sun. Common inpainted objects in-n-out of context. arXiv preprint arXiv:2506.00721, 2025.
- [58] Tao Yu, Runseng Feng, Ruoyu Feng, Jinming Liu, Xin Jin, Wenjun Zeng, and Zhibo Chen. Inpaint Anything: Segment Anything meets image inpainting. arXiv preprint arXiv:2304.06790, 2023.
- [59] Dengyong Zhang, Zaoshan Liang, Gaobo Yang, Qingguo Li, Leida Li, and Xingming Sun. A robust forgery detection algorithm for object removal by exemplar-based image inpainting. Multimedia Tools and Applications, 77(10):11823–11842, 2018.
- [60] Jiaming Zhang, Huayao Liu, Kailun Yang, Xinxin Hu, Ruiping Liu, and Rainer Stiefelhagen. CMX: Cross-modal fusion for RGB-X semantic segmentation with transformers. IEEE Transactions on Intelligent Transportation Systems, 2023.
- [61] Kai Zhang, Wangmeng Zuo, Yunjin Chen, Deyu Meng, and Lei Zhang. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, 26(7):3142–3155, 2017.
- [62] Ying Zhang, Jonathan Goh, Lei Lei Win, and Vrizlynn LL Thing. Image region forgery detection: A deep learning approach. In SG-CRC, pages 1–11, 2016.
- [63] Yushu Zhang, Qing Tan, Shuren Qi, and Mingfu Xue. PRNU-based image forgery localization with deep multi-scale fusion. ACM Transactions on Multimedia Computing, Communications and Applications, 19(2):1–20, 2023.
- [64] Mingkai Zheng, Shan You, Fei Wang, Chen Qian, Changshui Zhang, Xiaogang Wang, and Chang Xu. ReSSL: Relational self-supervised learning with weak augmentation. Advances in Neural Information Processing Systems, 34:2543–2555, 2021.
- [65] Peng Zhou, Xintong Han, Vlad I. Morariu, and Larry S. Davis. Two-stream neural networks for tampered face detection, 2018.
- [66] Junhao Zhuang, Yanhong Zeng, Wenran Liu, Chun Yuan, and Kai Chen. A task is worth one word: Learning with task prompts for high-quality versatile image inpainting. In European Conference on Computer Vision, pages 195–211. Springer, 2024.
[67]
However, in the context of image forensics, aug- mentations must be chosen carefully to avoid destroying the delicate traces left by the generative process
Augmentation Strategies Contrastive learning relies heavily on data augmentation to generate diverse, positive views of the same underlying in- stance. However, in the context of image forensics, aug- mentations must be chosen carefully to avoid destroying the delicate traces left by the generative process. In this section, we evaluate the impact of diffe...
-
[68]
Lite Baseline Architecture. In addition to the state-of-the-art frameworks, we evaluate a custom lightweight two-stream baseline (Lite Baseline) to isolate the effectiveness of the forensic modalities with a simpler fusion mechanism. The RGB stream utilizes an ImageNet-pretrained Mix Transformer encoder (MiT-B2) from the SegFormer architecture [56]. For t...
-
[69]
The lite baseline was adapted from the TruFor implementation
Training Details For the integration of DiffusionPrint into the TruFor [23] and MMFusion [48] frameworks, we retain their original ar- chitectural designs and training protocols, referring readers to the respective papers for exhaustive network details. The lite baseline was adapted from the TruFor implementation. Across all frameworks, input images are r...