arxiv: 2604.17287 · v1 · submitted 2026-04-19 · 💻 cs.CV

Spectral Forensics of Diffusion Attention Graphs for Copy-Move Forgery Detection

Pith reviewed 2026-05-10 06:14 UTC · model grok-4.3

classification 💻 cs.CV
keywords: copy-move forgery · graph Laplacian · spectral analysis · self-attention graph · Wasserstein distance · anomaly detection · image forensics · diffusion models

The pith

Copy-move forgeries create approximate subgraph duplications in diffusion model self-attention graphs, shifting the spectrum of the normalized graph Laplacian enough to detect via Wasserstein distance to authentic spectra.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that duplicating a region inside an image produces repeated substructures in the self-attention graph extracted from a pretrained diffusion U-Net. These repetitions redistribute the eigenvalues of the normalized graph Laplacian in a way that can be measured as a distance from the distribution of spectra drawn from authentic images. An image-level detector is built by computing this Wasserstein distance for each new image and flagging outliers, all without any forgery-specific training. Experiments on four standard benchmarks show the approach yields AUROCs between 0.606 and 0.774, with the normalized Laplacian outperforming raw attention spectra. The core argument is that the induced spectral change is specific to the duplication operation rather than to natural image content or other edits.

Core claim

Copy-move manipulation induces approximate subgraph duplication in the self-attention graph, leading to measurable spectral redistribution in the normalized graph Laplacian. This redistribution is formalized through perturbation arguments and turned into an anomaly score by comparing each image's Laplacian spectrum to a reference distribution of authentic spectra using the Wasserstein distance.
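The object at the centre of this claim, the normalized Laplacian spectrum of an attention graph, can be sketched in a few lines. This is a minimal illustration and not the paper's implementation: the symmetrisation rule, the clipping of negative weights, and the random toy attention matrix are all assumptions made here, since the page does not reproduce the paper's exact graph construction.

```python
import numpy as np

def normalized_laplacian_spectrum(attn: np.ndarray) -> np.ndarray:
    """Sorted eigenvalues of L = I - D^{-1/2} W D^{-1/2} for an attention matrix."""
    w = 0.5 * (attn + attn.T)           # symmetrise (assumed convention)
    w = np.clip(w, 0.0, None)           # keep edge weights non-negative
    deg = w.sum(axis=1)                 # weighted degree of each token
    inv_sqrt = np.zeros_like(deg)
    mask = deg > 0
    inv_sqrt[mask] = deg[mask] ** -0.5  # guard against isolated tokens
    lap = np.eye(len(w)) - inv_sqrt[:, None] * w * inv_sqrt[None, :]
    return np.linalg.eigvalsh(lap)      # ascending order, real-valued

# Hypothetical toy input: a 16-token row-softmax attention matrix.
rng = np.random.default_rng(0)
logits = rng.normal(size=(16, 16))
attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
spec = normalized_laplacian_spectrum(attn)
```

For any non-negative symmetric weight matrix the eigenvalues lie in [0, 2] and the smallest is 0, which is a convenient sanity check on an implementation.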

What carries the argument

The normalized graph Laplacian spectrum of the self-attention graph extracted from a pretrained Stable Diffusion U-Net, whose Wasserstein distance to an authentic reference distribution serves as the forgery anomaly score.
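To make the scoring step concrete, the following sketch computes a one-dimensional Wasserstein-1 distance between an image's eigenvalues and a pooled authentic reference. The function name, the toy spectra, and the use of the raw distance as the score are assumptions for illustration; the paper's own pipeline fuses layers and normalizes scores in ways not detailed on this page. SciPy's `scipy.stats.wasserstein_distance` computes the same quantity; the pure-Python version below only avoids the dependency.

```python
def wasserstein_1d(a, b):
    """W1 between two empirical 1-D distributions, as the area between their CDFs."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    total, i, j = 0.0, 0, 0
    for left, right in zip(points, points[1:]):
        # Advance each pointer so that i/len(a) and j/len(b) are the CDF values at `left`.
        while i < len(a) and a[i] <= left:
            i += 1
        while j < len(b) and b[j] <= left:
            j += 1
        total += abs(i / len(a) - j / len(b)) * (right - left)
    return total

# Hypothetical toy data: pooled authentic eigenvalues and two test spectra.
reference = [0.0, 0.5, 1.0, 1.5]
same = [0.0, 0.5, 1.0, 1.5]
shifted = [0.25, 0.75, 1.25, 1.75]

score_same = wasserstein_1d(same, reference)        # identical spectra give distance 0
score_shifted = wasserstein_1d(shifted, reference)  # a uniform shift of 0.25 gives 0.25
```

Because the two inputs need not have the same length, a per-image spectrum can be compared against a reference pooled from many authentic images.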

If this is right

  • The detector operates without any retraining on forged examples and can be applied directly to new images.
  • The normalized Laplacian spectrum yields higher AUROC than raw attention spectra by 0.057 on the largest tested benchmark.
  • On CoMoFoD the pipeline reaches AUPRC of 0.833 and TPR of 32.5 percent at 1 percent false-positive rate.
  • Null-graph controls and manipulation-strength ablations show the signal tracks the presence and extent of duplication rather than trivial graph statistics.
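The TPR at 1 percent false-positive rate quoted above is an operating-point metric. A minimal sketch of how such a figure can be computed from anomaly scores follows; the function name, the strict-inequality flagging rule, and the toy scores are assumptions, not the paper's evaluation code.

```python
import math

def tpr_at_fpr(scores_authentic, scores_forged, max_fpr=0.01):
    """Return (TPR, threshold) where at most max_fpr of authentic scores exceed the threshold."""
    neg = sorted(scores_authentic, reverse=True)
    # Number of authentic images allowed to be flagged (small epsilon guards float rounding).
    allowed = min(math.floor(max_fpr * len(neg) + 1e-12), len(neg) - 1)
    threshold = neg[allowed]  # flag an image only if its score is strictly above this value
    true_pos = sum(s > threshold for s in scores_forged)
    return true_pos / len(scores_forged), threshold

# Hypothetical toy scores: 100 authentic images and 4 forgeries.
authentic = [float(i) for i in range(100)]
forged = [90.0, 98.5, 120.0, 150.0]
tpr, thr = tpr_at_fpr(authentic, forged, max_fpr=0.01)
```

In this toy case exactly one authentic score lies above the threshold, so the false-positive rate is 1 percent, and three of the four forgeries are flagged.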

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same spectral comparison could be tested on attention graphs from other pretrained generative models to check whether the duplication signal is model-specific.
  • If the reference distribution is constructed from a broader and more diverse set of authentic images, the method's false-positive rate on natural content might decrease.
  • The approach might be combined with pixel-level or frequency-based forensic tools to localize the duplicated region after the image-level flag is raised.

Load-bearing premise

Copy-move forgeries produce subgraph duplication and spectral redistribution in the attention graphs that are distinct enough from natural image variations or other edits to be isolated by Wasserstein distance on the Laplacian spectrum.

What would settle it

A large set of copy-move forgeries in which the duplicated regions produce no measurable shift in the normalized Laplacian spectrum relative to matched authentic images, causing Wasserstein distances to overlap completely with the authentic reference distribution.

Figures

Figures reproduced from arXiv: 2604.17287 by H. M. Shadman Tabib, Nafis Tahmid, Tasriad Ahmed Tias.

Figure 1. Sample images from the RecodAI-LUC dataset. Top row: authentic images. Bottom row: corresponding forged copies with copy-move manipulations. The duplicated regions are visually subtle, making detection challenging even for human observers. For anonymity and to avoid cluttering the figure, only the dataset identifier is stated here; individual source filenames are omitted.
Figure 2. The duplicated-subgraph hypothesis. (a) An authentic image’s self-attention induces a token graph with subgraph S. (b) Copy-move forgery approximately duplicates S to create S′, perturbing the Laplacian L̃ = L + Δ under real-world transforms and blending. (c) Representative illustration of the resulting spectral redistribution: eigenvalue crowding/compression and a shift in the empirical spectral densit…
Figure 3. GraphSpecForge pipeline overview. (a) Attention extraction: a single denoising forward pass through a pretrained Stable Diffusion v1.5 U-Net captures self-attention affinity matrices A(ℓ) at all 16 attn1 layers, spanning resolutions from 64×64 down to 8×8 tokens across the encoder (down_blocks), bottleneck (mid_block), and decoder (up_blocks). (b) Graph-spectral scoring: each matrix is symmetrised and non-…
Figure 4. Layer-distributed spectral clusters from an attention-derived graph Laplacian on a sample authentic image. Colours indicate cluster assignments from spectral clustering of the normalized Laplacian, demonstrating that the attention graph encodes meaningful spatial structure.
Figure 5. Spectral feature extraction and FSEL fusion. (a) Spectral pipelines: the symmetrised attention matrix A¯ is processed through two parallel paths, raw eigendecomposition (Pipeline A) and normalized Laplacian eigendecomposition (Pipeline B), isolating structural connectivity via degree normalisation. (b) Feature map Φ(xi, ℓ): four feature families are extracted per layer: transport (W1, energy, MMD), spectra…
Figure 6. Final detector performance curves for the best configuration (Laplacian, top-k fusion, plain z-score). (a) The ROC curve shows consistent separation above the diagonal across all operating points. (b) The PR curve maintains high precision at low recall, with AUPRC well above the class-prior baseline. The signal is statistically significant (permutation p = 0.005 across all Laplacian configurations) but mo…
Figure 7. Qualitative one-vs-one case study (RecodAI-LUC pair 013) on the top decoder attention layer up_blocks.2.attentions.0.transformer_blocks.0.attn1. Left: authentic image. Middle: corresponding copy-move forged image; the duplicated region is visually imperceptible. Right: per-image Laplacian empirical spectral density for the two images. Despite the near-identical visual content, the forged ESD (red) is sys…
Figure 8. Falsification and specificity experiments (400-image subset). (A) Natural self-similarity: score distributions for low-repetition authentics, high-repetition authentics, and copy-move forgeries; the detector partially but not fully separates natural repetition from forgery. (B) Non-copy-move negatives: six alternative corruption types (splicing, random patch, JPEG, blur, noise, inpainting) all score below g…
Original abstract

Copy-move forgery, where a region within an image is duplicated to hide or fabricate content, remains a persistent threat to visual media integrity. We introduce GraphSpecForge, a training-free framework that detects copy-move forgery by analysing the spectral structure of attention graphs from a pretrained Stable Diffusion U-Net. Our central insight is that copy-move manipulation induces approximate subgraph duplication in the self-attention graph, leading to measurable spectral redistribution in the normalized graph Laplacian. We formalise this link with perturbation-based arguments and build an image-level anomaly detector using Wasserstein distances between per-image Laplacian spectra and an authentic reference distribution. We evaluate GraphSpecForge on four copy-move benchmarks without forgery-specific retraining. On RecodAI-LUC (5,128 images), our best configuration achieves AUROC = 0.606 (95% CI: 0.580-0.638; permutation p = 0.005), and the normalized Laplacian outperforms raw attention spectra by +0.057 AUROC. On MICC-F220, CoMoFoD, and COVERAGE, the same pipeline attains AUROCs of 0.752, 0.774, and 0.673, respectively; on CoMoFoD it also reaches AUPRC = 0.833, balanced accuracy = 0.712, MCC = 0.499, and TPR@1%FPR = 32.5%. Additional ablation and falsification experiments confirm the signal's specificity and sensitivity to manipulation strength, while null-graph controls rule out trivial-statistic explanations.
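For readers less familiar with the metrics named in the abstract, the sketch below computes AUROC, balanced accuracy, and MCC from anomaly scores. The function names, the toy scores, and the threshold of 0.35 are hypothetical; nothing here reproduces the paper's numbers.

```python
import math

def auroc(neg, pos):
    """Probability that a random forged score exceeds a random authentic one (ties count half)."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def threshold_metrics(neg, pos, threshold):
    """Balanced accuracy and Matthews correlation coefficient at a fixed score threshold."""
    tp = sum(s > threshold for s in pos)
    fn = len(pos) - tp
    fp = sum(s > threshold for s in neg)
    tn = len(neg) - fp
    balanced_acc = 0.5 * (tp / (tp + fn) + tn / (tn + fp))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return balanced_acc, mcc

# Hypothetical anomaly scores: authentic (negative) and forged (positive) images.
authentic = [0.1, 0.2, 0.3, 0.4]
forged = [0.25, 0.5, 0.6, 0.7]

area = auroc(authentic, forged)                           # 14 of 16 pairs ranked correctly
bal_acc, mcc = threshold_metrics(authentic, forged, 0.35)
```

AUROC is threshold-free, whereas balanced accuracy and MCC depend on the chosen operating point, which is why the abstract reports them separately.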

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces GraphSpecForge, a training-free framework for copy-move forgery detection that extracts self-attention graphs from a pretrained Stable Diffusion U-Net, computes spectra of the normalized graph Laplacian, and flags anomalies via Wasserstein distance to a reference distribution derived from authentic images. The central claim is that copy-move manipulations induce approximate subgraph duplication in these attention graphs, producing measurable spectral redistribution that the Wasserstein statistic can detect; this link is formalized via perturbation arguments. Evaluations on RecodAI-LUC, MICC-F220, CoMoFoD, and COVERAGE yield AUROCs of 0.606, 0.752, 0.774, and 0.673 respectively, supported by ablations, falsification tests, and null-graph controls.

Significance. If the hypothesized connection between copy-move forgeries and specific spectral shifts in diffusion attention graphs holds, the work would offer a genuinely novel, training-free forensic tool that exploits internal representations of large pretrained models rather than hand-crafted features or supervised fine-tuning. The provision of permutation tests, ablation studies, and controls against trivial statistics strengthens the empirical foundation and makes the contribution more falsifiable than many purely heuristic detectors in the field.

major comments (2)
  1. [§3.2] §3.2 (perturbation arguments formalizing the subgraph-duplication claim): the analysis invokes standard eigenvalue perturbation bounds for graph Laplacians, which assume small, additive edge-weight changes relative to the original graph. Copy-move forgeries, however, introduce large-scale, non-local duplications that add correlated blocks of nodes and edges; the manuscript does not demonstrate that the same bounds continue to guarantee a detectable, forgery-specific redistribution under these conditions, leaving the theoretical grounding for the Wasserstein detector load-bearing but incompletely justified.
  2. [§4.3] §4.3 and Table 2 (RecodAI-LUC results): while the reported AUROC of 0.606 (p=0.005) exceeds chance, the modest effect size and the fact that the normalized Laplacian only improves over raw attention spectra by +0.057 AUROC indicate that the spectral signal may be only weakly specific to copy-move; additional controls or larger effect sizes would be needed to substantiate that the detector is driven by the claimed duplication mechanism rather than generic image statistics.
minor comments (2)
  1. [§2.1] §2.1: the precise construction of the attention graph (node definition, edge weighting, layer selection within the U-Net) is described at a high level; a short pseudocode block or explicit equations for the adjacency matrix would improve reproducibility.
  2. [Figure 3] Figure 3: the spectral plots would benefit from error bands or overlaid reference distributions to make the Wasserstein separation visually clearer.
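As background to major comment 1, the kind of bound at issue can be written out. The following is a sketch under the assumption that the paper's §3.2 relies on a Weyl-type inequality (this page does not reproduce that section); the notation L̃ = L + Δ follows the Figure 2 caption, and n denotes the number of tokens, which a copy-move within one image leaves unchanged.

```latex
% Assumed setting: L and \tilde{L} = L + \Delta are symmetric n x n matrices
% with eigenvalues sorted in ascending order.
\[
  \bigl|\lambda_i(\tilde{L}) - \lambda_i(L)\bigr| \;\le\; \lVert \Delta \rVert_2,
  \qquad i = 1, \dots, n.
\]
% For two empirical spectral measures with n atoms each, the 1-Wasserstein
% distance equals the mean gap between sorted eigenvalues, hence
\[
  W_1\bigl(\mu_{\tilde{L}}, \mu_{L}\bigr)
  \;=\; \frac{1}{n}\sum_{i=1}^{n} \bigl|\lambda_i(\tilde{L}) - \lambda_i(L)\bigr|
  \;\le\; \lVert \Delta \rVert_2 .
\]
```

Weyl's inequality holds for any symmetric Δ, but both lines are upper bounds: they cap how far the spectrum can move and become uninformative when the perturbation norm is large. They do not, on their own, guarantee that a forgery shifts the spectrum by a detectable amount, which is the gap the referee identifies.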

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments. We address each major point below with clarifications and indicate where revisions have been made to improve the manuscript.

Point-by-point responses
  1. Referee: [§3.2] §3.2 (perturbation arguments formalizing the subgraph-duplication claim): the analysis invokes standard eigenvalue perturbation bounds for graph Laplacians, which assume small, additive edge-weight changes relative to the original graph. Copy-move forgeries, however, introduce large-scale, non-local duplications that add correlated blocks of nodes and edges; the manuscript does not demonstrate that the same bounds continue to guarantee a detectable, forgery-specific redistribution under these conditions, leaving the theoretical grounding for the Wasserstein detector load-bearing but incompletely justified.

    Authors: We acknowledge that the standard eigenvalue perturbation bounds referenced in §3.2 assume small, additive perturbations and do not directly extend to the large-scale, structured duplications introduced by copy-move forgeries. These bounds were included to provide directional intuition for the expected spectral redistribution rather than a rigorous guarantee under arbitrary large changes. In the revised manuscript, we have updated §3.2 to state explicitly that the perturbation analysis is heuristic and illustrative, to note its limited scope, and to stress that the detector's justification rests primarily on the empirical results, including permutation tests, ablations, and null-graph controls. This revision makes the theoretical contribution more transparent.
    revision: partial

  2. Referee: [§4.3] §4.3 and Table 2 (RecodAI-LUC results): while the reported AUROC of 0.606 (p=0.005) exceeds chance, the modest effect size and the fact that the normalized Laplacian only improves over raw attention spectra by +0.057 AUROC indicate that the spectral signal may be only weakly specific to copy-move; additional controls or larger effect sizes would be needed to substantiate that the detector is driven by the claimed duplication mechanism rather than generic image statistics.

    Authors: We agree that the AUROC of 0.606 on RecodAI-LUC is modest and reflects the dataset's difficulty and diversity. The result remains statistically significant (permutation p=0.005), and the +0.057 AUROC gain from the normalized Laplacian over raw spectra indicates that graph structure adds value beyond generic attention statistics. The manuscript already reports ablations on graph construction, falsification tests that modulate manipulation strength, and null-graph controls that exclude trivial-statistic explanations. We have revised §4.3 to connect these controls more explicitly to the duplication mechanism and to discuss the modest effect size in the context of the dataset's challenges.
    revision: yes
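Both the referee and the authors lean on the permutation p-value, so a sketch of how such a value is typically obtained may help. This is an illustration under assumptions: the label-shuffling scheme, the 199 permutations, the add-one correction, and the toy scores are choices made here, not details confirmed by the paper.

```python
import random

def auroc(neg, pos):
    """Fraction of (forged, authentic) pairs ranked correctly, with ties counted as half."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_p_value(neg, pos, n_perm=199, seed=0):
    """One-sided p-value for the observed AUROC under random relabelling of the scores."""
    rng = random.Random(seed)
    observed = auroc(neg, pos)
    pooled = list(neg) + list(pos)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # break any link between score and label
        if auroc(pooled[:len(neg)], pooled[len(neg):]) >= observed:
            hits += 1
    # Add-one correction: the observed labelling counts as one permutation.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical, perfectly separated toy scores.
authentic = [float(i) for i in range(10)]
forged = [float(i) for i in range(10, 20)]
observed, p_value = permutation_p_value(authentic, forged)
```

With 199 permutations and the add-one correction, the smallest attainable p-value is 1/200 = 0.005. Whether the paper used that permutation count is not stated on this page, but a reported p-value at such a floor indicates only that no shuffled labelling matched the observed AUROC, not the size of the effect.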

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper's central step formalizes the effect of copy-move subgraph duplication on normalized Laplacian spectra via perturbation-based arguments, presented as an independent theoretical link rather than a reduction to fitted inputs or self-definition. The image-level detector computes Wasserstein distances from per-image spectra to an authentic reference distribution constructed externally, without evidence that this statistic reduces by construction to any parameter fitted on the target forgery data. No self-citations are load-bearing for the uniqueness or formalization claims, no ansatz is smuggled, and no renaming of known results occurs. The pipeline uses a fixed pretrained external U-Net and evaluates on separate benchmarks, keeping the chain self-contained.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on domain assumptions about how forgeries affect attention graphs in pretrained diffusion models and the detectability of resulting spectral changes, without introducing new entities or many explicit free parameters beyond the reference distribution.

free parameters (1)
  • authentic reference distribution
    Used as the baseline for Wasserstein distances; it is constructed from authentic images and is therefore data-dependent, but the abstract does not specify the details.
axioms (2)
  • domain assumption: Copy-move manipulation induces approximate subgraph duplication in the self-attention graph of the pretrained Stable Diffusion U-Net
    Central insight formalized via perturbation-based arguments in the abstract.
  • domain assumption: Spectral redistribution in the normalized graph Laplacian is measurable and specific to copy-move forgeries
    Links the manipulation to the anomaly signal detected by Wasserstein distance.

pith-pipeline@v0.9.0 · 5593 in / 1608 out tokens · 50956 ms · 2026-05-10T06:14:17.634556+00:00 · methodology

Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages

  1. [1] J. Fridrich, D. Soukal, and J. Lukáš. Detection of copy-move forgery in digital images. In Proceedings of the Digital Forensic Research Workshop, 2003.

  2. [2] A. C. Popescu and H. Farid. Exposing digital forgeries by detecting duplicated image regions. Technical Report TR2004-515, Department of Computer Science, Dartmouth College, 2004.

  3. [3] I. Amerini, L. Ballan, R. Caldelli, A. Del Bimbo, and G. Serra. A SIFT-based forensic method for copy-move attack detection and transformation recovery. IEEE Transactions on Information Forensics and Security, 6(3):1099–1110, 2011.

  4. [4] S. Ryu, M. Lee, and H. Lee. Detection of copy-rotate-move forgery using Zernike moments. In Proceedings of Information Hiding, pages 51–65, 2010.

  5. [5] D. Cozzolino, G. Poggi, and L. Verdoliva. Efficient dense-field copy-move forgery detection. IEEE Transactions on Information Forensics and Security, 10(11):2284–2297, 2015.

  6. [6] V. Christlein, C. Riess, J. Jordan, C. Riess, and E. Angelopoulou. An evaluation of popular copy-move forgery detection approaches. IEEE Transactions on Information Forensics and Security, 7(6):1841–1854, 2012.

  7. [7] Y. Wu, W. Abd-Almageed, and P. Natarajan. BusterNet: Detecting copy-move image forgery with source/target localization. In Proceedings of the European Conference on Computer Vision (ECCV), pages 168–184, 2018.

  8. [8] B. Chen, W. Tan, G. Coatrieux, Y. Zheng, and Y. Q. Shi. A serial image copy-move forgery localization scheme with source/target distinguishment. IEEE Transactions on Multimedia, 23:3506–3517, 2021.

  9. [9] A. Islam, C. Long, A. Basharat, and A. Hoogs. DOA-GAN: Dual-order attentive generative adversarial network for image copy-move forgery detection and localization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4676–4685, 2020.

  10. [10] J. Ho, A. Jain, and P. Abbeel. Denoising diffusion probabilistic models. In Advances in Neural Information Processing Systems, 2020.

  11. [11] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10684–10695, 2022.

  12. [12] M. R. Uddin, T.-H. Nguyen, H. M. S. Tabib, K. Gandhi, and M. Xu. Unsupervised multi-scale segmentation of cellular cryo-electron tomograms with stable diffusion foundation model. bioRxiv, 2025. doi:10.1101/2025.06.25.661425

  13. [13] Z. Wang, J. Bao, W. Zhou, W. Wang, H. Hu, H. Chen, and H. Li. DIRE for diffusion-generated image detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023.

  14. [14] U. Ojha, Y. Li, and Y. J. Lee. Towards universal fake image detectors that generalize across generative models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 24480–24489, 2023.

  15. [15] R. Corvi, D. Cozzolino, G. Zingarini, G. Poggi, K. Nagano, and L. Verdoliva. On the detection of synthetic images generated by diffusion models. In ICASSP 2023 – 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 1–5, 2023.

  16. [16] R. Tang, L. Liu, A. Pandey, Z. Jiang, G. Yang, K. Kumar, P. Stenetorp, J. Lin, and F. Ture. What the DAAM: Interpreting stable diffusion using cross attention. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5644–5659, 2023.

  17. [17] A. Hertz, R. Mokady, J. Tenenbaum, K. Aberman, Y. Pritch, and D. Cohen-Or. Prompt-to-prompt image editing with cross-attention control. In International Conference on Learning Representations (ICLR), 2023.

  18. [18] C. Kim, H. Shin, E. Hong, H. Yoon, A. Arnab, P. H. Seo, S. Hong, and S. Kim. Seg4Diff: Unveiling open-vocabulary segmentation in text-to-image diffusion transformers. arXiv preprint arXiv:2509.18096, 2025.

  19. [19] F. R. K. Chung. Spectral Graph Theory. CBMS Regional Conference Series in Mathematics, Vol. 92. American Mathematical Society, 1997.

  20. [20] U. von Luxburg. A tutorial on spectral clustering. Statistics and Computing, 17(4):395–416, 2007.

  21. [21] G. W. Stewart and J. Sun. Matrix Perturbation Theory. Academic Press, 1990.

  22. [22] W. H. Haemers. Interlacing eigenvalues and graphs. Linear Algebra and its Applications, 226–228:593–616, 1995.

  23. [23] D. K. Hammond, P. Vandergheynst, and R. Gribonval. Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis, 30(2):129–150, 2011.

  24. [24] J. Dong, W. Wang, and T. Tan. CASIA image tampering detection evaluation database. In 2013 IEEE China Summit and International Conference on Signal and Information Processing, pages 422–426, 2013.

  25. [25] D. Tralić, I. Zupancic, S. Grgic, and M. Grgic. CoMoFoD—new database for copy-move forgery detection. In Proceedings of the 55th International Symposium ELMAR, pages 49–54, 2013.

  26. [26] B. Wen, Y. Zhu, R. Subramanian, T.-T. Ng, X. Shen, and S. Winkler. COVERAGE—a novel database for copy-move forgery detection. In 2016 IEEE International Conference on Image Processing (ICIP), pages 161–165, 2016.

  27. [27] Kaggle. Recod.ai/LUC – Scientific Image Forgery Detection. Kaggle competition dataset, 2025.

  A Reproducibility. Software: Python 3.10, PyTorch 2.1, diffusers 0.24, NumPy 1.24, SciPy 1.11, scikit-learn 1.3, matplotlib 3.8. Hardware: all experiments on a single NVIDIA H100 (80 GB VRAM) via Kaggle. Full pipeline runtime: ∼4 hours for 5,128 images across 16 lay...