pith. machine review for the scientific record.

arxiv: 2605.02567 · v1 · submitted 2026-05-04 · 💻 cs.CV

Recognition: 2 Lean theorem links

Automated In-the-Wild Data Collection for Continual AI Generated Image Detection

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 18:56 UTC · model grok-4.3

classification 💻 cs.CV
keywords AI-generated image detection · continual learning · weakly supervised learning · in-the-wild data collection · fact-check retrieval · distribution shift · catastrophic forgetting · generative models

The pith

Both in-the-wild data from automated fact-check retrieval and generator-driven data are essential for continually adapting AI-generated image detectors to new models without catastrophic forgetting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a data-centric framework that automatically gathers real-world AI-generated images by retrieving fact-check articles and pairs them with synthetic images from known generators. This combination is fed into a continual learning process so detectors can update as new generative models appear while retaining accuracy on earlier distributions. Experiments on two leading detectors report average accuracy gains of roughly nine and eight percent. A reader would care because detectors today degrade quickly under distribution shifts, and this offers a scalable way to refresh them without constant manual labeling.
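To make the update cycle described above concrete, here is a minimal sketch of one adaptation round, assuming a replay-style buffer that samples a fixed proportion of previously seen data. The names (`Detector.fine_tune`, `wild_images`, `gen_images`, `rho`) are illustrative stand-ins, not the paper's API, and the exact sampling and training procedure may differ.

```python
import random

def adaptation_round(detector, wild_images, gen_images, replay_buffer, rho=0.1):
    """One continual-adaptation round: mix weakly labeled in-the-wild images
    and generator-driven fakes with a replayed sample of earlier data, then
    fine-tune the detector. All objects here are illustrative placeholders."""
    # Replay a fixed proportion of accumulated past data to limit forgetting.
    k = int(rho * len(replay_buffer))
    replay = random.sample(replay_buffer, k) if k > 0 else []

    # In-the-wild images carry weak "AI-generated" labels from fact-checks;
    # generator-driven images come from known models; real images (label 0)
    # would be mixed in analogously.
    train_set = wild_images + gen_images + replay

    detector.fine_tune(train_set)          # hypothetical training call
    replay_buffer.extend(wild_images + gen_images)
    return detector, replay_buffer
```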

Core claim

We propose a data-centric continual adaptation framework for updating detectors in evolving environments. Both in-the-wild data and generator-driven data are essential for adapting detectors. We introduce an automated, weakly supervised pipeline for constructing in-the-wild datasets through fact-check article retrieval. Incorporating even a small amount of generator-driven data during training enables effective adaptation to newly emerging models, while combining it with in-the-wild data within a continual learning framework enables robust adaptation and mitigates catastrophic forgetting.

What carries the argument

The automated weakly supervised pipeline that constructs in-the-wild datasets by retrieving fact-check articles, combined with a continual learning framework that mixes these data with generator-driven samples.
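As a rough illustration of how such a weakly supervised collection step might look, the sketch below queries a fact-check source, keeps articles whose verdict concerns AI-generated imagery, and records the referenced images with a weak "fake" label and their provenance. The callables `search_fact_checks` and `extract_image_urls`, the `verdict` field, and the filtering keyword are assumptions for illustration, not the paper's pipeline.

```python
from dataclasses import dataclass

@dataclass
class WeaklyLabeledImage:
    url: str
    label: int           # 1 = claimed AI-generated, per the fact-check article
    source_article: str  # provenance kept for later auditing of label noise

def collect_in_the_wild(query, search_fact_checks, extract_image_urls):
    """Sketch of weakly supervised in-the-wild collection via fact-check
    retrieval. `search_fact_checks` and `extract_image_urls` are placeholder
    callables standing in for whatever retrieval backend is used."""
    dataset = []
    for article in search_fact_checks(query):
        # Keep only articles whose verdict concerns AI-generated imagery.
        if "ai-generated" not in article.verdict.lower():
            continue
        for url in extract_image_urls(article):
            dataset.append(WeaklyLabeledImage(url=url, label=1,
                                              source_article=article.url))
    return dataset
```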

If this is right

  • Detectors achieve measurable accuracy improvements on state-of-the-art models when updated with the combined data sources.
  • Small quantities of generator-driven data suffice to adapt to newly emerging generative models.
  • Catastrophic forgetting is reduced when in-the-wild and generator-driven data are used together inside the continual learning setup.
  • The framework supports ongoing detector maintenance in environments where generative models continue to evolve.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar automated retrieval from verification sources could be tested for maintaining detectors on AI-generated video or text.
  • The approach creates a feedback loop where public fact-checking activity directly supplies training data for detection systems.
  • If retrieval noise proves higher than expected, adding a lightweight verification step could be explored as an extension.
  • The method suggests that continual adaptation pipelines might become a standard component rather than one-time training procedures.

Load-bearing premise

Fact-check articles supply sufficiently clean and representative weakly supervised labels for AI-generated images, and the automated retrieval pipeline extracts them with low noise and little selection bias.

What would settle it

If replacing the automatically retrieved in-the-wild data with either randomly labeled images or purely generator-driven data still produced the reported accuracy gains on new models, the claim that both data sources are essential would be refuted; if the gains vanish, the essentiality claim holds up.
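Read as an experiment, this test amounts to training detector variants on each data mixture and comparing average accuracy on held-out images from new generators. A minimal sketch, assuming hypothetical `train_detector` and `evaluate` helpers:

```python
def ablation(train_detector, evaluate, wild, gen, random_labeled, new_gen_testset):
    """Compare data mixtures on a new-generator test set; names and helpers
    are illustrative, not the paper's protocol."""
    variants = {
        "wild + generator":  wild + gen,
        "generator only":    gen,
        "random-label wild": random_labeled + gen,
    }
    scores = {name: evaluate(train_detector(data), new_gen_testset)
              for name, data in variants.items()}
    # If the ablated variants match the combined mixture, the "both sources
    # are essential" claim is undermined; a clear gap supports it.
    return scores
```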

Figures

Figures reproduced from arXiv: 2605.02567 by Christos Koutlis, Dimitrios Karageorgiou, George Karantaidis, Olga Papadopoulou, Symeon Papadopoulos, Thanasis Pantsios.

Figure 1: The proposed framework combines regular data col…
Figure 2: Fact-check retrieval pipeline: Given an article…
Figure 4: Examples of semantically aligned real–fake pairs.
Figure 3: Image segmentation: original (left) and segmented…
Figure 6: Instruction prompt p₂ template for VLM V.
Figure 7: AUC performance under continual learning. Results…
Figure 8: ACC on AIGenImages2026 across generators un…
original abstract

The rapid advancement of generative Artificial Intelligence (AI) has introduced significant challenges for reliable AI-generated image detection. Existing detectors often suffer from performance degradation under distribution shifts and when encountering newly emerging generative models. In this work, we propose a data-centric continual adaptation framework for updating detectors in evolving environments. We show that both in-the-wild data and generator-driven data are essential for adapting detectors. We introduce an automated, weakly supervised pipeline for constructing in-the-wild datasets through fact-check article retrieval. Additionally, we demonstrate that incorporating even a small amount of generator-driven data during training enables effective adaptation to newly emerging models, while combining it with in-the-wild data within a continual learning framework enables robust adaptation and mitigates catastrophic forgetting. Extensive experiments on two state-of-the-art detectors show significant improvements of +9.14% and +8% in average accuracy, respectively.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper claims to introduce an automated weakly supervised pipeline for collecting in-the-wild AI-generated image data from fact-check articles. It argues that both in-the-wild and generator-driven data are necessary for continual adaptation of detectors to new generative models. The approach uses a continual learning framework to combine these data sources, mitigating catastrophic forgetting, and reports average accuracy improvements of +9.14% and +8% on two state-of-the-art detectors.

Significance. If the weak labels from fact-check articles prove reliable and the experimental results are robustly validated, this work could be significant for the field of AI-generated content detection. It offers a scalable, data-centric solution to the problem of detector degradation under distribution shifts and emerging generators, which is a pressing issue as generative AI advances rapidly. The emphasis on combining data types and continual learning provides a practical path forward for maintaining detector performance in real-world settings.

major comments (2)
  1. The abstract states clear accuracy gains from experiments on two detectors, yet provides no details on baselines, data splits, statistical significance, or controls for data quality, leaving the central claim only partially supported.
  2. The automated retrieval of fact-check articles for weak supervision is central to the in-the-wild data collection claim, but the manuscript does not include any assessment of label noise, selection bias, or verification of the quality of these labels, which is load-bearing for the assertion that this data enables robust adaptation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below with specific responses and indicate where revisions will be made to strengthen the paper.

point-by-point responses
  1. Referee: The abstract states clear accuracy gains from experiments on two detectors, yet provides no details on baselines, data splits, statistical significance, or controls for data quality, leaving the central claim only partially supported.

    Authors: We agree that the abstract is concise and could more explicitly reference supporting details to bolster the central claim. The full manuscript describes the experimental setup in detail, including baselines (standard fine-tuning and non-continual variants), data splits for training/validation/testing, multiple random seeds for statistical reliability, and data quality controls in the pipeline description. To address this directly, we will revise the abstract to include a brief clause noting these elements and the robustness of the reported gains. revision: yes

  2. Referee: The automated retrieval of fact-check articles for weak supervision is central to the in-the-wild data collection claim, but the manuscript does not include any assessment of label noise, selection bias, or verification of the quality of these labels, which is load-bearing for the assertion that this data enables robust adaptation.

    Authors: This is a fair observation on the reliance on weak labels. The pipeline uses fact-check articles from reputable sources as a form of weak supervision, which we argue provides a practical and scalable signal. However, to strengthen the claim, we will add a dedicated analysis subsection that reports results from manual verification of a random sample of collected images (quantifying agreement with human labels) and discusses potential selection biases in article retrieval. This will provide empirical support for the data's utility in continual adaptation. revision: yes
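To illustrate the kind of verification this response proposes, a minimal sketch of quantifying agreement between the pipeline's weak labels and human annotations on a random sample follows. The sample size, the `label` attribute, and the `human_label` callable are assumptions for illustration, not the authors' planned protocol.

```python
import random

def weak_label_agreement(collected, human_label, sample_size=200, seed=0):
    """Estimate weak-label quality as the fraction of a random sample whose
    weak label matches a human annotator's judgment. `human_label` is a
    placeholder callable returning 0 (real) or 1 (AI-generated)."""
    rng = random.Random(seed)
    sample = rng.sample(collected, min(sample_size, len(collected)))
    if not sample:
        return float("nan")
    agree = sum(1 for item in sample if human_label(item) == item.label)
    return agree / len(sample)
```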

Circularity Check

0 steps flagged

No circularity: empirical pipeline and measured improvements are self-contained

full rationale

The paper advances a data-collection pipeline and continual-learning framework whose central claims rest on experimental accuracy gains (+9.14% and +8%) obtained by training and evaluating detectors on constructed datasets. No equations, fitted parameters, or self-referential definitions appear in the provided text; the reported improvements are direct empirical measurements against external benchmarks rather than quantities defined by the method itself. Standard continual-learning citations, if present, supply independent prior techniques and do not substitute for the paper's own data-construction and evaluation steps. The derivation chain therefore remains non-circular.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on standard machine-learning assumptions about data distributions and the ability of continual learning to prevent forgetting, plus the domain assumption that fact-check articles yield usable weakly supervised labels.

axioms (2)
  • domain assumption Fact-check articles provide reliable weakly supervised labels for AI-generated images with acceptable noise levels
    Invoked as the basis for the automated data collection pipeline.
  • domain assumption Continual learning methods can incorporate new generator data without catastrophic forgetting when mixed with in-the-wild examples
    Central premise of the adaptation framework.

pith-pipeline@v0.9.0 · 5467 in / 1310 out tokens · 83455 ms · 2026-05-08T18:56:47.823736+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

48 extracted references · 10 canonical work pages · 1 internal anchor
