Spectral Forensics of Diffusion Attention Graphs for Copy-Move Forgery Detection
Pith reviewed 2026-05-10 06:14 UTC · model grok-4.3
The pith
Copy-move forgeries create approximate subgraph duplications in diffusion model self-attention graphs, shifting the spectrum of the normalized graph Laplacian enough to detect via Wasserstein distance to authentic spectra.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Copy-move manipulation induces approximate subgraph duplication in the self-attention graph, leading to measurable spectral redistribution in the normalized graph Laplacian. This redistribution is formalized through perturbation arguments and turned into an anomaly score by comparing each image's Laplacian spectrum to a reference distribution of authentic spectra using the Wasserstein distance.
What carries the argument
The normalized graph Laplacian spectrum of the self-attention graph extracted from a pretrained Stable Diffusion U-Net, whose Wasserstein distance to an authentic reference distribution serves as the forgery anomaly score.
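A minimal sketch of the scoring step described above, assuming the attention map is available as a square NumPy array. The symmetrization by averaging, the pooling of reference spectra into one sample, and the function names `normalized_laplacian_spectrum` and `anomaly_score` are illustrative assumptions; the summary does not specify how the paper handles either step.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def normalized_laplacian_spectrum(attn):
    """Eigenvalues of L = I - D^{-1/2} A D^{-1/2} for one attention map."""
    attn = np.asarray(attn, dtype=float)
    # Assumption: attention is made undirected by averaging with its transpose.
    adj = 0.5 * (attn + attn.T)
    # Softmax rows are strictly positive, so every degree is > 0 here.
    inv_sqrt_deg = 1.0 / np.sqrt(adj.sum(axis=1))
    lap = np.eye(adj.shape[0]) - inv_sqrt_deg[:, None] * adj * inv_sqrt_deg[None, :]
    return np.linalg.eigvalsh(lap)  # real, ascending, within [0, 2]


def anomaly_score(spectrum, reference_spectra):
    """1-Wasserstein distance from one spectrum to the pooled authentic spectra."""
    # Assumption: the reference distribution is the pooled set of eigenvalues.
    pooled = np.concatenate([np.asarray(s, dtype=float) for s in reference_spectra])
    return float(wasserstein_distance(spectrum, pooled))
```

A higher score means the image's spectrum sits further from the authentic reference; where to place the decision threshold is left open here.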
If this is right
- The detector operates without any retraining on forged examples and can be applied directly to new images.
- The normalized Laplacian spectrum yields higher AUROC than raw attention spectra by 0.057 on the largest tested benchmark.
- On CoMoFoD the pipeline reaches AUPRC of 0.833 and TPR of 32.5 percent at 1 percent false-positive rate.
- Null-graph controls and manipulation-strength ablations show the signal tracks the presence and extent of duplication rather than trivial graph statistics.
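The metrics quoted in the list above (AUROC and TPR at a fixed false-positive rate) can be computed from per-image anomaly scores roughly as follows. This is a sketch under the assumption that higher scores indicate forgery; the helper names are hypothetical and the paper's own evaluation code is not shown in this summary.

```python
import numpy as np


def auroc(scores_authentic, scores_forged):
    """Probability that a forged image outscores an authentic one (ties count half)."""
    neg = np.asarray(scores_authentic, dtype=float)
    pos = np.asarray(scores_forged, dtype=float)
    diff = pos[:, None] - neg[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))


def tpr_at_fpr(scores_authentic, scores_forged, max_fpr=0.01):
    """TPR when the threshold flags at most max_fpr of authentic images."""
    neg = np.asarray(scores_authentic, dtype=float)
    pos = np.asarray(scores_forged, dtype=float)
    # "higher" picks an observed authentic score, so the realized FPR
    # (scores strictly above the threshold) never exceeds max_fpr.
    threshold = np.quantile(neg, 1.0 - max_fpr, method="higher")
    return float(np.mean(pos > threshold))
```

With few authentic images the threshold is conservative: the realized false-positive rate can fall well below the nominal `max_fpr`.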
Where Pith is reading between the lines
- The same spectral comparison could be tested on attention graphs from other pretrained generative models to check whether the duplication signal is model-specific.
- If the reference distribution is constructed from a broader and more diverse set of authentic images, the method's false-positive rate on natural content might decrease.
- The approach might be combined with pixel-level or frequency-based forensic tools to localize the duplicated region after the image-level flag is raised.
Load-bearing premise
Copy-move forgeries produce subgraph duplication and spectral redistribution in the attention graphs that is distinct enough from natural image variations or other edits to be isolated by Wasserstein distance on the Laplacian spectrum.
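A toy illustration of the premise, not a reproduction of the paper's pipeline: random vectors stand in for U-Net tokens, attention is a plain softmax of scaled dot products, and the "forgery" pastes one block of tokens over another exactly. All of these choices are assumptions made for the sketch. In this idealized case the duplicated rows make the normalized adjacency rank-deficient, so each overwritten token contributes a Laplacian eigenvalue of exactly 1.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def attention_spectrum(features):
    """Normalized-Laplacian eigenvalues of a softmax self-attention graph."""
    logits = features @ features.T / np.sqrt(features.shape[1])
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)
    adj = 0.5 * (attn + attn.T)  # assumption: symmetrize by averaging
    d = 1.0 / np.sqrt(adj.sum(axis=1))
    lap = np.eye(len(adj)) - d[:, None] * adj * d[None, :]
    return np.linalg.eigvalsh(lap)


rng = np.random.default_rng(0)
tokens = rng.normal(size=(64, 16))   # hypothetical stand-in for image tokens
forged = tokens.copy()
forged[40:48] = tokens[8:16]         # paste tokens 8..15 over tokens 40..47

spec_forged = attention_spectrum(forged)
n_at_one = int(np.sum(np.abs(spec_forged - 1.0) < 1e-8))
shift = wasserstein_distance(attention_spectrum(tokens), spec_forged)
print(f"eigenvalues at 1 after duplication: {n_at_one}, spectral shift: {shift:.4f}")
```

Real forgeries duplicate pixels rather than tokens, and post-processing makes the duplication approximate, so the eigenvalues would cluster near 1 rather than land on it. Whether that cluster stands out from natural self-similarity is exactly what the premise assumes.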
What would settle it
A large set of copy-move forgeries in which the duplicated regions produce no measurable shift in the normalized Laplacian spectrum relative to matched authentic images, causing Wasserstein distances to overlap completely with the authentic reference distribution.
Original abstract
Copy-move forgery, where a region within an image is duplicated to hide or fabricate content, remains a persistent threat to visual media integrity. We introduce GraphSpecForge, a training-free framework that detects copy-move forgery by analysing the spectral structure of attention graphs from a pretrained Stable Diffusion U-Net. Our central insight is that copy-move manipulation induces approximate subgraph duplication in the self-attention graph, leading to measurable spectral redistribution in the normalized graph Laplacian. We formalise this link with perturbation-based arguments and build an image-level anomaly detector using Wasserstein distances between per-image Laplacian spectra and an authentic reference distribution. We evaluate GraphSpecForge on four copy-move benchmarks without forgery-specific retraining. On RecodAI-LUC (5,128 images), our best configuration achieves AUROC = 0.606 (95% CI: 0.580-0.638; permutation p = 0.005), and the normalized Laplacian outperforms raw attention spectra by +0.057 AUROC. On MICC-F220, CoMoFoD, and COVERAGE, the same pipeline attains AUROCs of 0.752, 0.774, and 0.673, respectively; on CoMoFoD it also reaches AUPRC = 0.833, balanced accuracy = 0.712, MCC = 0.499, and TPR@1%FPR = 32.5%. Additional ablation and falsification experiments confirm the signal's specificity and sensitivity to manipulation strength, while null-graph controls rule out trivial-statistic explanations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces GraphSpecForge, a training-free framework for copy-move forgery detection that extracts self-attention graphs from a pretrained Stable Diffusion U-Net, computes spectra of the normalized graph Laplacian, and flags anomalies via Wasserstein distance to a reference distribution derived from authentic images. The central claim is that copy-move manipulations induce approximate subgraph duplication in these attention graphs, producing measurable spectral redistribution that the Wasserstein statistic can detect; this link is formalized via perturbation arguments. Evaluations on RecodAI-LUC, MICC-F220, CoMoFoD, and COVERAGE yield AUROCs of 0.606, 0.752, 0.774, and 0.673 respectively, supported by ablations, falsification tests, and null-graph controls.
Significance. If the hypothesized connection between copy-move forgeries and specific spectral shifts in diffusion attention graphs holds, the work would offer a genuinely novel, training-free forensic tool that exploits internal representations of large pretrained models rather than hand-crafted features or supervised fine-tuning. The provision of permutation tests, ablation studies, and controls against trivial statistics strengthens the empirical foundation and makes the contribution more falsifiable than many purely heuristic detectors in the field.
major comments (2)
- [§3.2] §3.2 (perturbation arguments formalizing the subgraph-duplication claim): the analysis invokes standard eigenvalue perturbation bounds for graph Laplacians, which assume small, additive edge-weight changes relative to the original graph. Copy-move forgeries, however, introduce large-scale, non-local duplications that add correlated blocks of nodes and edges; the manuscript does not demonstrate that the same bounds continue to guarantee a detectable, forgery-specific redistribution under these conditions, leaving the theoretical grounding for the Wasserstein detector load-bearing but incompletely justified.
- [§4.3] §4.3 and Table 2 (RecodAI-LUC results): while the reported AUROC of 0.606 (p=0.005) exceeds chance, the modest effect size and the fact that the normalized Laplacian only improves over raw attention spectra by +0.057 AUROC indicate that the spectral signal may be only weakly specific to copy-move; additional controls or larger effect sizes would be needed to substantiate that the detector is driven by the claimed duplication mechanism rather than generic image statistics.
minor comments (2)
- [§2.1] §2.1: the precise construction of the attention graph (node definition, edge weighting, layer selection within the U-Net) is described at a high level; a short pseudocode block or explicit equations for the adjacency matrix would improve reproducibility.
- [Figure 3] Figure 3: the spectral plots would benefit from error bands or overlaid reference distributions to make the Wasserstein separation visually clearer.
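For the §2.1 comment, the explicit equations requested might take a form like the following. This is a hedged guess at a standard construction, not the paper's confirmed definition: it assumes one attention head with query and key matrices Q and K of width d, and symmetrization by averaging.

```latex
\begin{aligned}
S &= \operatorname{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d}}\right), \\
A &= \tfrac{1}{2}\left(S + S^{\top}\right), \qquad D = \operatorname{diag}(A\mathbf{1}), \\
\mathcal{L} &= I - D^{-1/2} A D^{-1/2}, \qquad 0 = \lambda_1 \le \dots \le \lambda_n \le 2 .
\end{aligned}
```

Layer selection and any aggregation across heads or timesteps would still need to be stated separately.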
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments. We address each major point below with clarifications and indicate where revisions have been made to improve the manuscript.
- Referee: [§3.2] §3.2 (perturbation arguments formalizing the subgraph-duplication claim): the analysis invokes standard eigenvalue perturbation bounds for graph Laplacians, which assume small, additive edge-weight changes relative to the original graph. Copy-move forgeries, however, introduce large-scale, non-local duplications that add correlated blocks of nodes and edges; the manuscript does not demonstrate that the same bounds continue to guarantee a detectable, forgery-specific redistribution under these conditions, leaving the theoretical grounding for the Wasserstein detector load-bearing but incompletely justified.
Authors: We acknowledge that the standard eigenvalue perturbation bounds referenced in §3.2 assume small, additive perturbations and do not directly extend to the large-scale, structured duplications introduced by copy-move forgeries. These bounds were included to provide directional intuition for the expected spectral redistribution rather than a rigorous guarantee under arbitrary large changes. In the revised manuscript, we have updated §3.2 to state explicitly that the perturbation analysis is heuristic and illustrative, to note its limited scope, and to stress that the detector's justification relies primarily on the empirical results, including permutation tests, ablations, and null-graph controls. This revision makes the theoretical contribution more transparent.
Revision: partial
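For reference, the standard bound at issue in this exchange is Weyl's inequality. For symmetric matrices of the same size, with E denoting the change in the Laplacian:

```latex
\left| \lambda_i(\mathcal{L} + E) - \lambda_i(\mathcal{L}) \right| \le \lVert E \rVert_2 , \qquad i = 1, \dots, n .
```

Because the 1-Wasserstein distance between two equal-size spectra is the mean absolute difference of their sorted eigenvalues, this also gives W_1 ≤ ‖E‖_2. The inequality is an upper bound only: it limits how far the spectrum can move but cannot by itself guarantee a detectable shift, which is consistent with the authors' decision to treat the analysis as heuristic.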
- Referee: [§4.3] §4.3 and Table 2 (RecodAI-LUC results): while the reported AUROC of 0.606 (p=0.005) exceeds chance, the modest effect size and the fact that the normalized Laplacian only improves over raw attention spectra by +0.057 AUROC indicate that the spectral signal may be only weakly specific to copy-move; additional controls or larger effect sizes would be needed to substantiate that the detector is driven by the claimed duplication mechanism rather than generic image statistics.
Authors: We agree that the AUROC of 0.606 on RecodAI-LUC is modest and reflects the dataset's difficulty and diversity. The result remains statistically significant (permutation p=0.005), and the +0.057 AUROC gain from the normalized Laplacian over raw spectra indicates that graph structure adds value beyond generic attention statistics. The manuscript already reports ablations on graph construction, falsification tests that modulate manipulation strength, and null-graph controls that exclude trivial-statistic explanations. We have revised §4.3 to connect these controls more explicitly to the duplication mechanism and to discuss the modest effect size in the context of the dataset's challenges.
Revision: yes
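The permutation p-value cited in this exchange can be understood through a label-shuffling test such as the sketch below. The number of permutations, the one-sided direction, and the add-one correction are assumptions chosen for illustration; the summary does not state the paper's exact procedure.

```python
import numpy as np


def auroc(labels, scores):
    """AUROC via pairwise comparison; labels are 1 (forged) or 0 (authentic)."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))


def permutation_p_value(labels, scores, n_perm=1000, seed=0):
    """One-sided p-value for the observed AUROC under shuffled labels."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = auroc(labels, scores)
    null = np.array([auroc(rng.permutation(labels), scores) for _ in range(n_perm)])
    # Add-one correction keeps the estimate strictly positive.
    return float((1 + np.sum(null >= observed)) / (1 + n_perm))
```

A small p-value only says the AUROC is unlikely under exchangeable labels. It does not address the referee's concern about which image property drives the score.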
Circularity Check
No significant circularity in derivation chain
The paper's central step formalizes the effect of copy-move subgraph duplication on normalized Laplacian spectra via perturbation-based arguments, presented as an independent theoretical link rather than a reduction to fitted inputs or self-definition. The image-level detector computes Wasserstein distances from per-image spectra to an authentic reference distribution constructed externally, without evidence that this statistic reduces by construction to any parameter fitted on the target forgery data. No self-citations are load-bearing for the uniqueness or formalization claims, no ansatz is smuggled, and no renaming of known results occurs. The pipeline uses a fixed pretrained external U-Net and evaluates on separate benchmarks, keeping the chain self-contained.
Axiom & Free-Parameter Ledger
free parameters (1)
- authentic reference distribution
axioms (2)
- domain assumption: Copy-move manipulation induces approximate subgraph duplication in the self-attention graph of the pretrained Stable Diffusion U-Net
- domain assumption: Spectral redistribution in the normalized graph Laplacian is measurable and specific to copy-move forgeries
Reference graph
Works this paper leans on
- [1] J. Fridrich, D. Soukal, and J. Lukáš. Detection of copy-move forgery in digital images. In Proceedings of the Digital Forensic Research Workshop, 2003.
- [2] A. C. Popescu and H. Farid. Exposing digital forgeries by detecting duplicated image regions. Technical Report TR2004-515, Department of Computer Science, Dartmouth College, 2004.
- [3] I. Amerini, L. Ballan, R. Caldelli, A. Del Bimbo, and G. Serra. A SIFT-based forensic method for copy-move attack detection and transformation recovery. IEEE Transactions on Information Forensics and Security, 6(3):1099–1110, 2011.
- [4] S. Ryu, M. Lee, and H. Lee. Detection of copy-rotate-move forgery using Zernike moments. In Proceedings of Information Hiding, pages 51–65, 2010.
- [5] D. Cozzolino, G. Poggi, and L. Verdoliva. Efficient dense-field copy-move forgery detection. IEEE Transactions on Information Forensics and Security, 10(11):2284–2297, 2015.
- [6] V. Christlein, C. Riess, J. Jordan, C. Riess, and E. Angelopoulou. An evaluation of popular copy-move forgery detection approaches. IEEE Transactions on Information Forensics and Security, 7(6):1841–1854, 2012.
- [7] Y. Wu, W. Abd-Almageed, and P. Natarajan. BusterNet: Detecting copy-move image forgery with source/target localization. In Proceedings of the European Conference on Computer Vision (ECCV), pages 168–184, 2018.
- [8] B. Chen, W. Tan, G. Coatrieux, Y. Zheng, and Y. Q. Shi. A serial image copy-move forgery localization scheme with source/target distinguishment. IEEE Transactions on Multimedia, 23:3506–3517, 2021.
- [9] A. Islam, C. Long, A. Basharat, and A. Hoogs. DOA-GAN: Dual-order attentive generative adversarial network for image copy-move forgery detection and localization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4676–4685, 2020.
- [10] J. Ho, A. Jain, and P. Abbeel. Denoising diffusion probabilistic models. In Advances in Neural Information Processing Systems, 2020.
- [11] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10684–10695, 2022.
- [12] M. R. Uddin, T.-H. Nguyen, H. M. S. Tabib, K. Gandhi, and M. Xu. Unsupervised multi-scale segmentation of cellular cryo-electron tomograms with stable diffusion foundation model. bioRxiv, 2025. doi:10.1101/2025.06.25.661425
- [13] Z. Wang, J. Bao, W. Zhou, W. Wang, H. Hu, H. Chen, and H. Li. DIRE for diffusion-generated image detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
- [14] U. Ojha, Y. Li, and Y. J. Lee. Towards universal fake image detectors that generalize across generative models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 24480–24489, 2023.
- [15]
- [16] R. Tang, L. Liu, A. Pandey, Z. Jiang, G. Yang, K. Kumar, P. Stenetorp, J. Lin, and F. Ture. What the DAAM: Interpreting stable diffusion using cross attention. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5644–5659, 2023.
- [17]
- [18]
- [19] F. R. K. Chung. Spectral Graph Theory. CBMS Regional Conference Series in Mathematics, Vol. 92. American Mathematical Society, 1997.
- [20] U. von Luxburg. A tutorial on spectral clustering. Statistics and Computing, 17(4):395–416, 2007.
- [21] G. W. Stewart and J. Sun. Matrix Perturbation Theory. Academic Press, 1990.
- [22] W. H. Haemers. Interlacing eigenvalues and graphs. Linear Algebra and its Applications, 226–228:593–616, 1995.
- [23] D. K. Hammond, P. Vandergheynst, and R. Gribonval. Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis, 30(2):129–150, 2011.
- [24] J. Dong, W. Wang, and T. Tan. CASIA image tampering detection evaluation database. In 2013 IEEE China Summit and International Conference on Signal and Information Processing, pages 422–426, 2013.
- [25] D. Tralić, I. Zupancic, S. Grgic, and M. Grgic. CoMoFoD—new database for copy-move forgery detection. In Proceedings of the 55th International Symposium ELMAR, pages 49–54, 2013.
- [26] B. Wen, Y. Zhu, R. Subramanian, T.-T. Ng, X. Shen, and S. Winkler. COVERAGE—a novel database for copy-move forgery detection. In 2016 IEEE International Conference on Image Processing (ICIP), pages 161–165, 2016.
- [27] Kaggle. Recod.ai/LUC – Scientific Image Forgery Detection. Kaggle competition dataset, 2025.
A Reproducibility
Software. Python 3.10, PyTorch 2.1, diffusers 0.24, NumPy 1.24, SciPy 1.11, scikit-learn 1.3, matplotlib 3.8.
Hardware. All experiments on a single NVIDIA H100 (80 GB VRAM) via Kaggle. Full pipeline runtime: ∼4 hours for 5,128 images across 16 lay...