Findings of the Counter Turing Test: AI-Generated Image Detection

Aishwarya Naresh Reganti; Aman Chadha; Amitava Das; Amit Sheth; Ashhar Aziz; Gurpreet Singh; Kapil Wanaskar; Nasrin Imanpour; Nilesh Ranjan Pal; Parth Patwa

arxiv: 2605.20787 · v2 · pith:A7R72KRSnew · submitted 2026-05-20 · 💻 cs.CV

Rajarshi Roy, Nasrin Imanpour, Ashhar Aziz, Shashwat Bajpai, Gurpreet Singh, Shwetangshu Biswas, Kapil Wanaskar, Parth Patwa, Subhankar Ghosh, Shreyas Dixit, Nilesh Ranjan Pal, Vipula Rawte, Ritvik Garimella, Amitava Das, Amit Sheth, Vasu Sharma, Aishwarya Naresh Reganti, Vinija Jain, Aman Chadha

Pith reviewed 2026-05-22 09:59 UTC · model grok-4.3

classification 💻 cs.CV

keywords AI-generated image detection · generative model attribution · image classification · computer vision · deep learning detectors · synthetic media · model fingerprinting

The pith

AI-generated images can be detected with high accuracy but identifying the exact generative model remains difficult.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reports results from testing whether images can be classified as real or AI-generated and, for synthetic images, which model created them. Participants applied convolutional networks, vision transformers, frequency analysis and other methods to a dataset pairing real photographs with synthetic images from several generators. Strong results on the binary task indicate that current tools can reliably flag synthetic content, which would help address misinformation if the performance holds. Weaker results on model identification show that different generators produce overlapping traces that are hard to separate.

Core claim

Participants achieved F1-scores above 0.83 for classifying images as real or AI-generated using strategies such as convolutional neural networks, vision transformers, frequency-based analysis, contrastive learning, and multimodal techniques. In contrast, the highest F1-score for identifying which particular generative model produced a given image reached only 0.4986. The evaluation relied on a dataset combining real images with 50,000 synthetic images produced by multiple generative models.
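To make the two headline numbers concrete, the following minimal sketch computes an F1-score for a binary task and for a multi-class attribution task. It assumes macro-averaging over classes, which the source does not specify; the labels `gen_a`, `gen_b`, `gen_c` and all toy predictions are made up for illustration and are not competition data.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1}, scoring each label one-vs-rest."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label gains a false positive
            fn[t] += 1  # true label gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (an assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Task 1 style: real vs. AI-generated (toy labels).
binary_true = ["real", "real", "fake", "fake", "fake", "real"]
binary_pred = ["real", "fake", "fake", "fake", "fake", "real"]

# Task 2 style: which generator produced a synthetic image (toy labels).
attr_true = ["gen_a", "gen_a", "gen_b", "gen_b", "gen_c", "gen_c"]
attr_pred = ["gen_a", "gen_b", "gen_b", "gen_a", "gen_c", "gen_a"]

print("binary macro-F1:", round(macro_f1(binary_true, binary_pred), 4))
print("attribution macro-F1:", round(macro_f1(attr_true, attr_pred), 4))
```

With more classes, a single confusion between two generators lowers the F1 of both classes involved, which is one reason attribution scores tend to sit well below binary scores.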

What carries the argument

A dual-task benchmark requiring first binary classification of images as real or synthetic and second attribution of synthetic images to their source generative model.

Load-bearing premise

The collected set of synthetic images from several current generative models paired with real images captures the range of visual properties that detectors will encounter in everyday use.

What would settle it

Testing the top binary and model-identification systems on images produced by generative models absent from the original dataset and measuring whether F1 scores fall below 0.7 would determine whether the reported performance generalizes.
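The check described above could be sketched as follows. This is an illustrative outline only: the record layout, the generator names `gen_a` and `gen_new`, and the toy predictions are hypothetical, and the 0.7 threshold is the one named in the passage. Real images are included in both splits so that false positives are counted the same way for seen and unseen generators.

```python
def binary_f1(y_true, y_pred, positive="fake"):
    """F1 for the positive class of a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def seen_vs_unseen_f1(records, seen_generators):
    """records: list of (true_label, generator_or_None, predicted_label).

    Returns (F1 on fakes from seen generators, F1 on fakes from unseen ones),
    each evaluated together with all real images.
    """
    reals = [(t, p) for t, g, p in records if t == "real"]
    seen = [(t, p) for t, g, p in records if t == "fake" and g in seen_generators]
    unseen = [(t, p) for t, g, p in records if t == "fake" and g not in seen_generators]

    def score(fakes):
        pairs = fakes + reals
        return binary_f1([t for t, _ in pairs], [p for _, p in pairs])

    return score(seen), score(unseen)


# Hypothetical detector outputs; "gen_new" stands for a generator absent from training.
records = [
    ("real", None, "real"),
    ("real", None, "real"),
    ("real", None, "fake"),
    ("fake", "gen_a", "fake"),
    ("fake", "gen_a", "fake"),
    ("fake", "gen_a", "fake"),
    ("fake", "gen_new", "fake"),
    ("fake", "gen_new", "real"),
    ("fake", "gen_new", "real"),
]

seen_f1, unseen_f1 = seen_vs_unseen_f1(records, seen_generators={"gen_a"})
print("seen:", round(seen_f1, 3), "unseen:", round(unseen_f1, 3))
print("generalizes (unseen F1 >= 0.7):", unseen_f1 >= 0.7)
```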

Figures

Figures reproduced from arXiv: 2605.20787 by Aishwarya Naresh Reganti, Aman Chadha, Amitava Das, Amit Sheth, Ashhar Aziz, Gurpreet Singh, Kapil Wanaskar, Nasrin Imanpour, Nilesh Ranjan Pal, Parth Patwa, Rajarshi Roy, Ritvik Garimella, Shashwat Bajpai, Shreyas Dixit, Shwetangshu Biswas, Subhankar Ghosh, Vasu Sharma, Vinija Jain, Vipula Rawte.

**Figure 1.** Baseline workflow. The input image is first transformed into its frequency-domain representation and then passed through a ResNet-50 CNN classifier to predict whether it is real or fake.

4. Participating Systems. The challenge utilized the MS COCOAI dataset, an extension of the MS COCO dataset, comprising 50,000 images generated by models such as DALL-E 3, Stable Diffusion, and Midjourney. Participants aim…
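The first stage of the Figure 1 baseline, the frequency-domain transform, can be illustrated with a small sketch. The source does not say which transform or normalization the organizers used, so this assumes a centred log-magnitude spectrum of a 2D discrete Fourier transform; the ResNet-50 classifier stage is omitted. A naive DFT in pure Python is used here only to keep the example dependency-free. For real images an FFT routine such as `numpy.fft.fft2` followed by `numpy.fft.fftshift` would be the practical choice.

```python
import cmath
import math


def log_magnitude_spectrum(image):
    """Naive 2D DFT of a small grayscale image given as a list of rows.

    Returns log(1 + |F(u, v)|) with the zero-frequency term moved to the
    centre of the output grid. Cost is O((h*w)^2), so use tiny inputs only.
    """
    h, w = len(image), len(image[0])
    spectrum = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            total = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2 * math.pi * (u * y / h + v * x / w)
                    total += image[y][x] * cmath.exp(1j * angle)
            # Shift so that frequency (0, 0) lands at (h // 2, w // 2).
            spectrum[(u + h // 2) % h][(v + w // 2) % w] = math.log1p(abs(total))
    return spectrum


# Toy 4x4 image with vertical stripes of period 2 (values 0, 1, 0, 1 per row).
stripes = [[x % 2 for x in range(4)] for _ in range(4)]
for row in log_magnitude_spectrum(stripes):
    print([round(value, 3) for value in row])
```

In this toy case the energy concentrates in two bins: the centre (the mean brightness) and the bin for the highest horizontal frequency. Periodic patterns of this kind are the sort of signal a frequency-based detector can pick up more easily than a pixel-space one.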

Original abstract

The rapid advancements in generative AI technologies, such as Stable Diffusion, DALL-E, and Midjourney, have significantly transformed the creation of synthetic visual content. While these models enable innovation across industries, they also pose serious challenges, including misinformation, disinformation, and biased content generation. The increasing realism of AI-generated images makes their detection a pressing concern for researchers, policymakers, and industry stakeholders. In this paper, we present the findings of the Defactify 4.0 workshop, which introduced the Counter Turing Test (CT2) for AI-Generated Image Detection. The competition consisted of two key tasks: (1) binary classification of images as either AI-generated or real and (2) identification of the specific generative model responsible for an AI-generated image. To facilitate this, we developed the MS COCOAI dataset, consisting of 50,000 synthetic images from multiple generative models alongside real-world images from the MS COCO dataset. Participants employed diverse detection strategies, including convolutional neural networks (CNNs), Vision Transformers (ViTs), frequency-based analysis, contrastive learning, and multimodal techniques. The results demonstrated that while AI-generated images can be detected with high accuracy (F1-score > 0.83), identifying the exact model used remains significantly more challenging (highest F1-score: 0.4986). These findings highlight the need for improved model fingerprinting, adversarial robustness, and real-time detection mechanisms.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript reports the findings of the Defactify 4.0 workshop's Counter Turing Test (CT2) competition on AI-generated image detection. It introduces the MS COCOAI dataset (50,000 synthetic images from multiple generative models paired with real MS COCO images) and describes participant submissions using CNNs, ViTs, frequency analysis, contrastive learning, and multimodal methods. The central empirical results are F1 > 0.83 on binary AI-vs-real classification and a best F1 of 0.4986 on identifying the specific generative model.

Significance. If the benchmark proves robust, the results establish that binary detection is practically feasible with existing architectures while model attribution remains substantially harder, providing a concrete empirical baseline that can guide future work on fingerprinting and adversarial robustness.

major comments (1)

[§3] §3 (MS COCOAI dataset construction): The dataset is described only at a high level (50k synthetic images from 'multiple generative models' plus MS COCO reals). No information is given on exact model versions, generation hyperparameters, prompt sampling, resizing/upsampling kernels, or compression steps. This detail is load-bearing for the headline claim of F1 > 0.83, because without it the performance cannot be distinguished from exploitation of dataset-specific artifacts (fixed kernels, prompt biases, or train-test leakage) as noted in the stress-test concern.

minor comments (2)

[Abstract] Abstract: The statement 'F1-score > 0.83' should specify whether this is the single best submission, the mean across teams, or a threshold; the same clarification is needed for the model-identification F1 of 0.4986.
[Results] Results section: Add per-team breakdowns, number of submissions, and any statistical significance or variance measures for the reported F1 scores to allow readers to assess stability.
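One way to supply the variance measure requested in the second minor comment is a percentile bootstrap over test images. The sketch below is an assumption about how that could be done, not a description of the organizers' evaluation; the function names, the resample count, and the toy labels are all illustrative.

```python
import random


def binary_f1(y_true, y_pred, positive="fake"):
    """F1 for the positive class of a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_f1_interval(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for F1, resampling test items with replacement."""
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(binary_f1([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Toy labels for illustration only.
y_true = ["fake"] * 20 + ["real"] * 20
y_pred = ["fake"] * 17 + ["real"] * 3 + ["real"] * 16 + ["fake"] * 4

point = binary_f1(y_true, y_pred)
low, high = bootstrap_f1_interval(y_true, y_pred)
print(f"F1 = {point:.3f}, 95% bootstrap interval = [{low:.3f}, {high:.3f}]")
```

Reporting such an interval per team would show whether gaps between leaderboard entries exceed resampling noise.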

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript describing the findings of the Defactify 4.0 Counter Turing Test competition. We address the single major comment below and will incorporate the requested details to strengthen the paper's reproducibility and address concerns about potential dataset artifacts.

Point-by-point responses

Referee: [§3] §3 (MS COCOAI dataset construction): The dataset is described only at a high level (50k synthetic images from 'multiple generative models' plus MS COCO reals). No information is given on exact model versions, generation hyperparameters, prompt sampling, resizing/upsampling kernels, or compression steps. This detail is load-bearing for the headline claim of F1 > 0.83, because without it the performance cannot be distinguished from exploitation of dataset-specific artifacts (fixed kernels, prompt biases, or train-test leakage) as noted in the stress-test concern.

Authors: We agree that the current high-level description of the MS COCOAI dataset in Section 3 is insufficient for full reproducibility and does not adequately address potential concerns about dataset-specific artifacts. In the revised manuscript we will expand Section 3 with a dedicated subsection that specifies the exact generative models and versions employed, the generation hyperparameters, the prompt sampling procedure (including how MS COCO captions were selected and diversified), and all post-processing steps such as resizing kernels, upsampling methods, and compression. We will also add a brief discussion of steps taken during dataset construction to mitigate common artifacts, such as prompt diversity and standardized pipelines. These additions will allow readers to better evaluate the robustness of the reported F1 scores (>0.83 for binary detection) and will clarify that the competition dataset was designed as a standardized benchmark rather than an artifact-prone test set. revision: yes

Circularity Check

0 steps flagged

No circularity: purely empirical competition report with no derivation chain

full rationale

The paper is a report on competition results for binary AI-vs-real classification and model identification using the MS COCOAI dataset. It contains no equations, mathematical derivations, fitted parameters, or self-citation chains that reduce any claimed performance metric to the input data by construction. All reported F1 scores are direct empirical outcomes from participant submissions evaluated on the held-out test split; the analysis is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

No free parameters, invented entities, or non-standard axioms; the work rests on standard machine-learning evaluation assumptions such as representative train/test splits and i.i.d. sampling.

axioms (1)

Domain assumption: Standard machine-learning assumptions of i.i.d. data and representative sampling hold for the MS COCOAI dataset. This is implicit in any competition-based performance claim.

pith-pipeline@v0.9.0 · 5878 in / 1201 out tokens · 50420 ms · 2026-05-22T09:59:07.595063+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem.

Participants employed diverse detection strategies, including convolutional neural networks (CNNs), Vision Transformers (ViTs), frequency-based analysis, contrastive learning, and multimodal techniques. The results demonstrated that while AI-generated images can be detected with high accuracy (F1-score > 0.83)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

28 extracted references · 28 canonical work pages · 1 internal anchor

[1] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Bjorn Ommer. High-resolution image synthesis with latent diffusion models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022

[2] Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, and Ilya Sutskever. Zero-shot text-to-image generation. Proceedings of the International Conference on Machine Learning (ICML), 2021

[3] Yisroel Mirsky and Wenke Lee. The creation and detection of deepfakes: A survey. ACM Computing Surveys (CSUR), 54(1):1–41, 2021

[4] Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, and Robin Rombach. SDXL: Improving latent diffusion models for high-resolution image synthesis, 2023. URL https://arxiv.org/abs/2307.01952

[5] James Betker, Gabriel Goh, Li Jing, Tim Brooks, Jianfeng Wang, Linjie Li, Long Ouyang, Juntang Zhuang, Joyce Lee, Yufei Guo, Wesam Manassra, Prafulla Dhariwal, Casey Chu, Yunxin Jiao, and Aditya Ramesh. Improving image generation with better captions. URL https://api.semanticscholar.org/CorpusID:264403242

[6] Defactify. Defactify 4.0. https://www.defactify.com/, 2025

[7] Zhendong Wang, Jianmin Bao, Wengang Zhou, Weilun Wang, Hezhen Hu, Hong Chen, and Houqiang Li. Dire for diffusion-generated image detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 22445–22455, October 2023
[8] Jonas Ricker, Denis Lukovnikov, and Asja Fischer. Aeroblade: Training-free detection of latent diffusion images using autoencoder reconstruction error, 2024. URL https://arxiv.org/abs/2401.17879

[9] Yunpeng Luo, Junlong Du, Ke Yan, and Shouhong Ding. Lare^2: Latent reconstruction error based method for diffusion-generated image detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 17006–17015, June 2024

[10] Baoying Chen, Jishen Zeng, Jianquan Yang, and Rui Yang. DRCT: Diffusion reconstruction contrastive training towards universal detection of diffusion generated images. In Ruslan Salakhutdinov, Zico Kolter, Katherine Heller, Adrian Weller, Nuria Oliver, Jonathan Scarlett, and Felix Berkenkamp, editors, Proceedings of the 41st International Conference on Ma...

[11] Utkarsh Ojha, Yuheng Li, and Yong Jae Lee. Towards universal fake image detectors that generalize across generative models, 2024. URL https://arxiv.org/abs/2302.10174

[12] Zhiyuan He, Pin-Yu Chen, and Tsung-Yi Ho. Rigid: A training-free and model-agnostic framework for robust ai-generated image detection, 2024. URL https://arxiv.org/abs/2405.20112

[13] Chuangchuang Tan, Renshuai Tao, Huan Liu, Guanghua Gu, Baoyuan Wu, Yao Zhao, and Yunchao Wei. C2p-clip: Injecting category common prompt in clip to enhance generalization in deepfake detection, 2024. URL https://arxiv.org/abs/2408.09647

[14] Haiwei Wu, Jiantao Zhou, and Shile Zhang. Generalizable synthetic image detection via language-guided contrastive learning, 2025. URL https://arxiv.org/abs/2305.13800
[15] Yichi Zhang and Xiaogang Xu. Diffusion noise feature: Accurate and fast generated image detection, 2025. URL https://arxiv.org/abs/2312.02625

[16] Weinan Guan, Wei Wang, Bo Peng, Ziwen He, Jing Dong, and Haonan Cheng. Noise-informed diffusion-generated image detection with anomaly attention, 2025. URL https://arxiv.org/abs/2506.16743

[17] Jiaxuan Chen, Jieteng Yao, and Li Niu. A single simple patch is all you need for ai-generated image detection, 2024. URL https://arxiv.org/abs/2402.01123

[18] Chuangchuang Tan, Yao Zhao, Shikui Wei, Guanghua Gu, and Yunchao Wei. Learning on gradients: Generalized artifacts representation for gan-generated images detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12105–12114, June 2023

[19] Lei Tan, Shuwei Li, Mohan Kankanhalli, and Robby T. Tan. Aggregating diverse cue experts for ai-generated image detection, 2026. URL https://arxiv.org/abs/2601.08790

[20] Mingjian Zhu, Hanting Chen, Mouxiao Huang, Wei Li, Hailin Hu, Jie Hu, and Yunhe Wang. Gendet: Towards good generalizations for ai-generated image detection, 2023. URL https://arxiv.org/abs/2312.08880

[21] Rajarshi Roy, Nasrin Imanpour, Ashhar Aziz, Shashwat Bajpai, Gurpreet Singh, Shwetangshu Biswas, Kapil Wanaskar, Parth Patwa, Subhankar Ghosh, Shreyas Dixit, Nilesh Ranjan Pal, Vipula Rawte, Ritvik Garimella, Gaytri Jena, Vasu Sharma, Vinija Jain, Aman Chadha, Aishwarya Naresh Reganti, and Amitava Das. A comprehensive dataset for human vs. ai generated im...

[22] URL https://arxiv.org/abs/2601.00553
[23] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In Proceedings of the IEEE European Conference on Computer Vision (ECCV), 2014

[24] Riccardo Corvi, Davide Cozzolino, Giovanni Poggi, Koki Nagano, and Luisa Verdoliva. Intriguing properties of synthetic images: From generative adversarial networks to diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pages 973–982, June 2023

[25] Tsan-Tsung Yang, I-Wei Chen, Kuan-Ting Chen, Shang-Hsuan Chiang, and Wen-Chih Peng. Team nycu at defactify4: Robust detection and source identification of ai-generated images using cnn and clip-based models. arXiv preprint arXiv:2503.10718, 2025

[26] Shrikant Malviya, Neelanjan Bhowmik, and Stamos Katsigiannis. Skdu at de-factify 4.0: Vision transformer with data augmentation for ai-generated image detection, 2025. URL https://arxiv.org/abs/2503.18812

[27] Xiaoyu Guo and Arkaitz Zubiaga. Nau-qmul: Utilizing bert and clip for multi-modal ai-generated image detection. arXiv preprint arXiv:2602.23863, 2026

[28] Anh-Kiet Duong and Petra Gomez-Krämer. Scalable framework for classifying ai-generated content across modalities, 2025. URL https://arxiv.org/abs/2502.00375
