
arxiv: 2605.22751 · v1 · pith:RMP537M7new · submitted 2026-05-21 · 💻 cs.CV

Spectral Tail Auxiliary Learning for AI-Generated Image Detection

Pith reviewed 2026-05-22 06:10 UTC · model grok-4.3

classification 💻 cs.CV
keywords AI-generated image detection · frequency domain · spectral analysis · power-law decay · auxiliary learning · generalization · high-frequency artifacts · nonlinear harmonics

The pith

Generated images deviate from power-law spectral decay by showing an anomalous uplift in the ultra-high-frequency tail, which transfers via auxiliary training to improve spatial detectors without added inference cost.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper analyzes one-dimensional radial log-power spectra and finds that generated images do not simply have more or less energy in high frequencies overall. Instead they break from the expected power-law decay with a distinct uplift in the ultra-high-frequency tail. This uplift is traced to nonlinear harmonic accumulation inside trained generative models and is presented as a recurring structural cue. The authors build Spectral Tail Auxiliary Learning to let a frequency-domain teacher pass these tail cues to a conventional spatial detector only during training. At inference the frequency components are removed entirely, leaving a detector that runs at normal speed yet generalizes better across generators and datasets.
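As a rough illustration of the measurement described above, the sketch below computes a one-dimensional radial log-power spectrum with NumPy. The paper's exact procedure is not given on this page, so the Hann window, mean removal, uniform radial bins, bin count, and the function name `radial_log_power_spectrum` are all assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

def radial_log_power_spectrum(image, n_bins=64, eps=1e-12):
    """Return (rho, log10_power) for a 2-D grayscale image.

    rho is the bin-center radial frequency, scaled so that rho = 1 is the
    per-axis Nyquist frequency. Windowing and binning choices are assumptions.
    """
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    img = img - img.mean()                            # remove the DC offset
    window = np.outer(np.hanning(h), np.hanning(w))   # assumed Hann window
    power = np.abs(np.fft.fftshift(np.fft.fft2(img * window))) ** 2

    # Frequency coordinates in cycles/sample, rescaled so Nyquist maps to 1.
    fy = np.fft.fftshift(np.fft.fftfreq(h)) * 2.0
    fx = np.fft.fftshift(np.fft.fftfreq(w)) * 2.0
    r = np.hypot(fy[:, None], fx[None, :]).ravel()

    valid = r <= 1.0                                  # drop corners beyond rho = 1
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(r[valid], edges) - 1, 0, n_bins - 1)
    sums = np.bincount(idx, weights=power.ravel()[valid], minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    mean_power = sums / np.maximum(counts, 1)         # radial average per bin
    rho = 0.5 * (edges[:-1] + edges[1:])
    return rho, np.log10(mean_power + eps)

# Usage: stripes that alternate every pixel put their energy at rho = 1.
stripes = np.tile([1.0, -1.0], (64, 32))
rho, log_p = radial_log_power_spectrum(stripes, n_bins=32)
```

On a curve like this, the figures in the paper locate the spectral tail in the range ρ ∈ [0.7, 1].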

Core claim

Generated images deviate from power-law decay in their one-dimensional radial log-power spectra and exhibit an anomalous uplift in the ultra-high-frequency tail; this uplift arises from nonlinear harmonic accumulation and functions as a structural cue that can be transferred from a tail-aware frequency teacher to a spatial detector during training, with all frequency modules discarded at inference time.

What carries the argument

Spectral Tail Auxiliary Learning (STAL), a training-time auxiliary supervision framework that transfers ultra-high-frequency tail cues from a frequency-domain teacher to a spatial detector.
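The training-only role of the teacher can be sketched as a two-term objective. The loss form is not stated on this page, so everything here is hypothetical: the binary cross-entropy detection term, the mean-squared-error alignment term, the weight `aux_weight`, and all function names are illustrative choices, not the authors' formulation.

```python
import math

def binary_cross_entropy(logit, label):
    """Numerically stable BCE for one logit and a 0/1 label."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def mse(pred, target):
    """Mean squared error between two equal-length feature vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def training_loss(student_logit, student_tail_pred, teacher_tail_target,
                  label, aux_weight=0.5):
    """Detection loss plus an assumed auxiliary term pulling the student's
    tail prediction toward the frequency teacher's output (training only)."""
    detection = binary_cross_entropy(student_logit, label)
    auxiliary = mse(student_tail_pred, teacher_tail_target)
    return detection + aux_weight * auxiliary

def inference_score(student_logit):
    """At inference only the spatial detector's logit is used; no teacher,
    no FFT, and no auxiliary head are evaluated."""
    return 1.0 / (1.0 + math.exp(-student_logit))

# Usage: the auxiliary term vanishes when student and teacher agree.
matched = training_loss(0.0, [1.0, 2.0], [1.0, 2.0], label=1)
mismatched = training_loss(0.0, [0.0, 0.0], [1.0, 1.0], label=1)
```

The design point the sketch tries to capture is that `inference_score` never touches the teacher, which is why the frequency branch can be dropped without changing deployment cost.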

If this is right

  • The detector achieves stronger generalization across generators and data distributions while introducing zero inference overhead.
  • The same auxiliary supervision can be applied in real-world scenarios with mixed real and generated images.
  • Frequency-domain analysis is used only for training and does not affect deployment speed or memory.
  • The structural cue is claimed to hold across multiple public generative architectures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the tail uplift proves architecture-agnostic, the same teacher signal could be tested on emerging diffusion or autoregressive models not seen in the current experiments.
  • The approach separates training-time frequency supervision from inference-time spatial processing, suggesting a template for other detection tasks where heavy analysis is acceptable only during learning.
  • Connecting the observed harmonic accumulation to known nonlinearities in neural network activations offers a possible route to predict the uplift strength from model architecture details alone.
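The link between activation nonlinearity and new high-frequency content can be seen on a toy signal. The sketch below is not the paper's experiment: it passes a single cosine through one SiLU and compares the spectrum with the untouched (linear) signal. The signal length, the tone frequency `f0`, and the helper names are assumptions made for illustration.

```python
import cmath
import math

def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def dft_power(signal):
    """Normalized power |X_k|^2 / n^2 for bins k = 0 .. n/2 (naive DFT)."""
    n = len(signal)
    out = []
    for k in range(n // 2 + 1):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(signal))
        out.append(abs(coeff) ** 2 / n ** 2)
    return out

n, f0 = 64, 5
tone = [math.cos(2 * math.pi * f0 * t / n) for t in range(n)]

linear_power = dft_power(tone)                       # energy only at f0
silu_power = dft_power([silu(x) for x in tone])      # energy also at 2 * f0
```

A pure tone has no energy at `2 * f0`, while the SiLU output does. Stacking many such layers is one plausible way harmonics could accumulate toward the top of the band, although this toy example does not establish that for any real generator.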

Load-bearing premise

The spectral tail uplift must be consistent enough across generative architectures for frequency-teacher signals to reliably improve a spatial detector's generalization.

What would settle it

A new generative model whose images follow clean power-law decay in the one-dimensional radial log-power spectrum with no uplift in the ultra-high-frequency tail would falsify the core spectral observation.
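One way to operationalize this check is to fit a power law on the mid-band and measure how far the tail rises above its own minimum, following the definition that appears in the extracted caption text (the rise from the tail's minimum to ρ = 1). The fitting band `[0.1, 0.7]`, the function names, and the synthetic curves are assumptions of this sketch.

```python
import math

def fit_power_law(rho, log_power, lo=0.1, hi=0.7):
    """Least-squares fit of log10 P = intercept + slope * log10(rho)
    over lo <= rho <= hi. Returns (intercept, slope)."""
    pts = [(math.log10(r), p) for r, p in zip(rho, log_power) if lo <= r <= hi]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in pts) / sxx
    return mean_y - slope * mean_x, slope

def tail_uplift(rho, log_power, tail_start=0.7):
    """Rise in log10 P from the tail's minimum to the last bin (rho nearest 1).
    Assumes rho is sorted in increasing order."""
    tail = [p for r, p in zip(rho, log_power) if r >= tail_start]
    return tail[-1] - min(tail)

# Synthetic curves: a clean power law and the same curve with a lifted tail.
rho = [i / 100 for i in range(1, 101)]
clean = [-2.0 * math.log10(r) for r in rho]
uplifted = [p + 5.0 * max(0.0, r - 0.9) for r, p in zip(rho, clean)]

intercept, slope = fit_power_law(rho, clean)
clean_uplift = tail_uplift(rho, clean)
fake_uplift = tail_uplift(rho, uplifted)
```

Under this reading, a generator whose images consistently give an uplift near zero, like the `clean` curve, would be the counterexample the paragraph above describes.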

Figures

Figures reproduced from arXiv: 2605.22751 by Jiahui Zhang, Wenhao Wang, Xingyi Li, Yiheng Li, Yun Cao.

Figure 1: Radial FFT [10] power spectra of real images and fakes from BigGAN [4], SD-v1.5 [34], SDXL [31], Midjourney [29], FLUX [22], and SD-VAE [34] reconstructions. Left: spectra over the full radial frequency range. Middle: spectral tail over the local frequency range ρ ∈ [0.7, 1]. Right: normalized tail curves anchored at ρ = 0.7 to expose shape differences. Across generators, fakes show consistent spectral-tai…
Figure 2: Activation nonlinearity drives the spectral tail uplift. We replace every SiLU in SD-VAE …
Figure 3: Overview of STAL. A tail-aware frequency teacher extracts spectral-tail cues from a …
Figure 4: Robustness analysis on GenImage. We evaluate STAL and competing methods under JPEG …
Figure 5: Effect of JPEG compression on the spectral tail. We apply JPEG compression with different …
Figure 6: Comparison of spectral-tail under trained and random VAE weights. We keep the SD-VAE …
Figure 7: Visualization of model attention. From top to bottom, the three rows show the original …
Original abstract

As generative image models evolve rapidly, the perceptual gap between generated and real images continues to narrow, making AI-generated image detection increasingly challenging. Many existing methods exploit frequency-domain cues for detection, typically described as frequency-domain artifacts or high-frequency discrepancies. However, the specific and recurring spectral regularities remain insufficiently understood and characterized. In this paper, we systematically analyze the one-dimensional radial log-power spectra of real and generated images. We find that generated images do not necessarily exhibit higher or lower energy across the entire spectrum or high-band range. Instead, their spectra deviate from the power-law decay and show an anomalous uplift in the ultra-high-frequency tail. We term this phenomenon spectral tail uplift. We further attribute this phenomenon to nonlinear harmonic accumulation in trained generative models, suggesting that it can serve as a structural cue across generative architectures. Based on this observation, we propose Spectral Tail Auxiliary Learning (STAL), a frequency-domain auxiliary supervision framework for generalizable AI-generated image detection. STAL transfers spectral-tail cues from a tail-aware frequency teacher to a spatial detector during training, while all frequency-domain modules are discarded at inference time. Consequently, STAL introduces no inference overhead. Extensive experiments on 9 public datasets show that STAL achieves strong generalization and stability across generators, data distributions, and real-world scenarios.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript analyzes the one-dimensional radial log-power spectra of real and AI-generated images, identifying that generated images deviate from power-law decay via an anomalous uplift in the ultra-high-frequency tail, which the authors attribute to nonlinear harmonic accumulation and treat as an architecture-agnostic structural cue. Building on this, they propose Spectral Tail Auxiliary Learning (STAL), a training framework that transfers spectral-tail cues from a frequency-domain teacher network to a spatial-domain detector via auxiliary supervision; frequency modules are discarded at inference, yielding zero overhead. Experiments across 9 public datasets spanning multiple generators and real-world scenarios are reported to demonstrate improved generalization and stability.

Significance. If the spectral tail uplift observation and the auxiliary transfer mechanism hold under the reported conditions, the work offers a concrete, low-cost route to strengthening generalization in AI-generated image detectors without altering inference latency. The multi-generator, multi-dataset empirical backing and explicit validation of inference-time module removal are strengths that could make the approach practically relevant for forensic and content-authenticity applications.

major comments (1)
  1. [Methods / Spectral Analysis] The central claim that the spectral tail uplift serves as a reliable, transferable cue across generative architectures rests on the consistency of the one-dimensional radial log-power spectrum computation; the manuscript should supply the precise radial-averaging procedure, frequency binning, and normalization steps (including any windowing or log-transform details) in the methods section so that the uplift can be independently reproduced and its statistical significance quantified.
minor comments (2)
  1. [Figures] Figure captions for the spectral plots should explicitly list the exact datasets, generators, and number of images used in each panel to facilitate direct comparison with the quantitative tables.
  2. [STAL Framework] The auxiliary loss formulation would benefit from an explicit equation showing how the frequency-teacher output is aligned with the spatial detector's intermediate features (e.g., via MSE or KL divergence on the tail region).

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comment and the recommendation for minor revision. We address the point below.

Point-by-point responses
  1. Referee: [Methods / Spectral Analysis] The central claim that the spectral tail uplift serves as a reliable, transferable cue across generative architectures rests on the consistency of the one-dimensional radial log-power spectrum computation; the manuscript should supply the precise radial-averaging procedure, frequency binning, and normalization steps (including any windowing or log-transform details) in the methods section so that the uplift can be independently reproduced and its statistical significance quantified.

    Authors: We agree that explicit implementation details are necessary for independent reproduction and statistical assessment. The manuscript describes the computation of one-dimensional radial log-power spectra and the observed uplift but does not enumerate every procedural step. In the revised manuscript we will add a dedicated subsection in Methods that specifies the radial-averaging procedure (including the exact definition of radial bins and averaging kernel), the frequency binning scheme, the normalization (e.g., per-image or global), any windowing function applied prior to the FFT, and the precise log-transform formulation. These additions will allow readers to reproduce the spectra and quantify the statistical significance of the tail uplift across generators.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is empirical and self-contained

full rationale

The paper's chain begins with direct empirical measurement of one-dimensional radial log-power spectra on real and generated images, identifies the spectral tail uplift as an observed deviation from power-law decay, attributes it to nonlinear harmonic accumulation based on that observation, and then introduces STAL as an auxiliary training technique that transfers the cue without retaining frequency modules at inference. No equations, fitted parameters, or self-citations are shown to reduce the central claim or final detection metric to the inputs by construction. The approach is validated through experiments across multiple datasets and generators, remaining independent of any internal redefinition or load-bearing self-reference.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the empirical observation that real-image spectra follow power-law decay while generated images exhibit tail uplift due to nonlinear harmonic accumulation; no explicit free parameters or invented entities are stated in the abstract.

axioms (1)
  • Domain assumption: real images exhibit power-law decay in their one-dimensional radial log-power spectra.
    Used as the baseline against which generated-image deviations are identified.
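Written out, the baseline and the deviation measured against it take roughly the following form. The symbols C and α are illustrative and do not come from the paper; the tail range [0.7, 1] and the definition of the uplift follow the figure captions.

```latex
% Power-law baseline assumed for real images (C, \alpha are illustrative symbols)
P(\rho) \approx C\,\rho^{-\alpha}
\quad\Longleftrightarrow\quad
\log_{10} P(\rho) \approx \log_{10} C - \alpha \log_{10} \rho

% Tail uplift: rise from the tail's minimum to \rho = 1
\Delta \log_{10} P \;=\; \log_{10} P(1) \;-\; \min_{\rho \in [0.7,\,1]} \log_{10} P(\rho)
```

On log-log axes the baseline is a straight line with slope −α, and a spectrum that keeps decaying through the tail attains its minimum at ρ = 1 and gives Δ log₁₀ P = 0.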



Reference graph

Works this paper leans on

47 extracted references · 47 canonical work pages · 2 internal anchors

  1. N. Ahmed, T. Natarajan, and K.R. Rao. Discrete cosine transform. IEEE Transactions on Computers, C-23(1):90–93, 1974. doi: 10.1109/T-C.1974.223784
  2. Quentin Bammey. Synthbuster: Towards detection of diffusion model generated images. IEEE Open Journal of Signal Processing, 5:1–9, 2024. doi: 10.1109/OJSP.2023.3337714
  3. Black Forest Labs. Flux.1 [dev]. Hugging Face model card, 2024. URL https://huggingface.co/black-forest-labs/FLUX.1-dev. Model card, accessed 2026-03-30
  4. Andrew Brock, Jeff Donahue, and Karen Simonyan. Large scale GAN training for high fidelity natural image synthesis. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=B1xsqj09Fm
  5. Bar Cavia, Eliahu Horwitz, Tal Reiss, and Yedid Hoshen. Real-time deepfake detection in the real world. URL https://openreview.net/forum?id=kkE7jlqKae
  6. Baoying Chen, Jishen Zeng, Jianquan Yang, and Rui Yang. DRCT: Diffusion reconstruction contrastive training towards universal detection of diffusion generated images. In Proceedings of the 41st International Conference on Machine Learning, volume 235 of Proceedings of Machine Learning Research, pages 7621–7639. PMLR, 21–27 Jul 2024
  7. Ruoxin Chen, Junwei Xi, Zhiyuan Yan, Ke-Yue Zhang, Shuang Wu, Jingyi Xie, Xu Chen, Lei Xu, Isabel Guan, Taiping Yao, and Shouhong Ding. Dual data alignment makes AI-generated image detector easier generalizable. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2026. URL https://openreview.net/forum?id=C39ShJwtD5
  8. Yunjey Choi, Minje Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, and Jaegul Choo. Stargan: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 8789–8797, 2018
  9. Beilin Chu, Xuan Xu, Xin Wang, Yufei Zhang, Weike You, and Linna Zhou. Fire: Robust detection of diffusion-generated images via frequency-guided reconstruction error. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12830–12839, 2025
  10. James W. Cooley and John W. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19(90):297–301, 1965. doi: 10.1090/S0025-5718-1965-0178586-1
  11. Davide Cozzolino, Giovanni Poggi, Riccardo Corvi, Matthias Nießner, and Luisa Verdoliva. Raising the bar of ai-generated image detection with clip. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pages 4356–4366, June 2024
  12. Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, Dustin Podell, Tim Dockhorn, Zion English, and Robin Rombach. Scaling rectified flow transformers for high-resolution image synthesis. In Forty-first International Conference on Machine Learning, 2024
  13. David J. Field. Relations between the statistics of natural images and the response properties of cortical cells. J. Opt. Soc. Am. A, 4(12):2379–2394, Dec 1987. doi: 10.1364/JOSAA.4.002379. URL https://opg.optica.org/josaa/abstract.cfm?URI=josaa-4-12-2379
  14. Joel Frank, Thorsten Eisenhofer, Lea Schönherr, Asja Fischer, Dorothea Kolossa, and Thorsten Holz. Leveraging frequency analysis for deep fake image recognition. In Hal Daumé III and Aarti Singh, editors, Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 3247–3258. PMLR, 13–1...
  15. Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, volume 27, 2014
  16. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), December 2015
  17. Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. In Advances in Neural Information Processing Systems, volume 33, pages 6840–6851, 2020
  18. Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2022. URL https://openreview.net/forum?id=nZeVKeeFYf9
  19. ITU-R. Studio encoding parameters of digital television for standard 4:3 and wide-screen 16:9 aspect ratios. Recommendation ITU-R BT.601, 2011. Formerly CCIR Recommendation 601
  20. Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4401–4410, 2019
  21. Diederik P. Kingma and Max Welling. Auto-encoding variational bayes. In 2nd International Conference on Learning Representations (ICLR), 2014. URL http://arxiv.org/abs/1312.6114
  22. Black Forest Labs. Flux. https://github.com/black-forest-labs/flux, 2024
  23. Ouxiang Li, Jiayin Cai, Yanbin Hao, Xiaolong Jiang, Yao Hu, and Fuli Feng. Improving synthetic image detection towards generalization: An image transformation perspective. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.1, KDD '25, pages 2405–2414. Association for Computing Machinery, 2025. ISBN 9798400712456
  24. Yanhao Li, Quentin Bammey, Marina Gardella, Tina Nikoukhah, Jean-Michel Morel, Miguel Colom, and Rafael Grompone von Gioi. Masksim: Detection of synthetic images by masked spectrum similarity analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pages 3855–3865, June 2024
  25. Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft coco: Common objects in context. In David Fleet, Tomas Pajdla, Bernt Schiele, and Tinne Tuytelaars, editors, Computer Vision – ECCV 2014, pages 740–755, Cham, 2014. Springer International Publishing
  26. Huan Liu, Zichang Tan, Chuangchuang Tan, Yunchao Wei, Jingdong Wang, and Yao Zhao. Forgery-aware adaptive transformer for generalizable synthetic image detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10770–10780, June 2024
  27. Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=Bkg6RiCqY7
  28. Yunpeng Luo, Junlong Du, Ke Yan, and Shouhong Ding. Lare^2: Latent reconstruction error based method for diffusion-generated image detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 17006–17015, 2024
  29. Midjourney, Inc. Midjourney. https://www.midjourney.com/home
  30. Utkarsh Ojha, Yuheng Li, and Yong Jae Lee. Towards universal fake image detectors that generalize across generative models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 24480–24489, 2023
  31. Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, and Robin Rombach. Sdxl: Improving latent diffusion models for high-resolution image synthesis. In B. Kim, Y. Yue, S. Chaudhuri, K. Fragkiadaki, M. Khan, and Y. Sun, editors, International Conference on Learning Representations, volume 2024, pages 1862–1874...
  32. Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. In Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of ...
  33. Anirudh Sundara Rajan, Utkarsh Ojha, Jedidiah Schloesser, and Yong Jae Lee. Aligned datasets improve detection of latent diffusion-generated images. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=doBkiqESYq
  34. Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10684–10695, 2022
  35. Oriane Siméoni, Huy V. Vo, Maximilian Seitzer, Federico Baldassarre, Maxime Oquab, Cijo Jose, Vasil Khalidov, Marc Szafraniec, Seungeun Yi, Michaël Ramamonjisoa, Francisco Massa, Daniel Haziza, Luca Wehrstedt, Jianyuan Wang, Timothée Darcet, Théo Moutakanni, Leonel Sentana, Claire Roberts, Andrea Vedaldi, Jamie Tolan, John Brandt, Camille Couprie, Julie...
  36. Eero P Simoncelli and Bruno A Olshausen. Natural image statistics and neural representation. Annual review of neuroscience, 24(1):1193–1216, 2001
  37. Chuangchuang Tan, Yao Zhao, Shikui Wei, Guanghua Gu, Ping Liu, and Yunchao Wei. Frequency-aware deepfake detection: Improving generalizability through frequency space domain learning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(5):5052–5060, Mar. 2024
  38. Chuangchuang Tan, Yao Zhao, Shikui Wei, Guanghua Gu, Ping Liu, and Yunchao Wei. Rethinking the up-sampling operations in cnn-based generative network for generalizable deepfake detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 28130–28139, June 2024
  39. Chuangchuang Tan, Renshuai Tao, Huan Liu, Guanghua Gu, Baoyuan Wu, Yao Zhao, and Yunchao Wei. C2p-clip: Injecting category common prompt in clip to enhance generalization in deepfake detection. Proceedings of the AAAI Conference on Artificial Intelligence, 39(7):7184–7192, Apr. 2025. doi: 10.1609/aaai.v39i7.32772
  40. Sheng-Yu Wang, Oliver Wang, Richard Zhang, Andrew Owens, and Alexei A. Efros. Cnn-generated images are surprisingly easy to spot... for now. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020
  41. Shilin Yan, Ouxiang Li, Jiayin Cai, Yanbin Hao, Xiaolong Jiang, Yao Hu, and Weidi Xie. A sanity check for AI-generated image detection. In The Thirteenth International Conference on Learning Representations, 2025
  42. Xu Zhang, Svebor Karaman, and Shih-Fu Chang. Detecting and simulating artifacts in gan fake images. In 2019 IEEE International Workshop on Information Forensics and Security (WIFS), pages 1–6, 2019
  43. Nan Zhong, Yiran Xu, Zhenxing Qian, and Xinpeng Zhang. Patchcraft: Exploring texture patch for efficient ai-generated image detection. arXiv preprint arXiv:2311.12397, 2023
  44. Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision, pages 2223–2232, 2017
  45. Mingjian Zhu, Hanting Chen, Qiangyu Yan, Xudong Huang, Guanyu Lin, Wei Li, Zhijun Tu, Hailin Hu, Jie Hu, and Yunhe Wang. Genimage: A million-scale benchmark for detecting ai-generated image. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, editors, Advances in Neural Information Processing Systems, volume 36, pages 77771–77782. Curra...

    … architecture fixed and compare trained weights with random-initialized weights using pink noise (left) and real images (middle) as inputs. Normalized curves show spectra on ρ ∈ [0.7, 1]. Right: tail uplift Δ log₁₀ P, the rise from the tail's minimum to ρ = 1. A.1 Spectral Tail Uplift under JPEG Compression. Due to the loss of high-frequency information caused b...