
arxiv: 2512.13915 · v1 · submitted 2025-12-15 · 💻 cs.SI

Deepfakes in the 2025 Canadian Election: Prevalence, Partisanship, and Platform Dynamics

Pith reviewed 2026-05-16 21:30 UTC · model grok-4.3

classification 💻 cs.SI
keywords deepfakes · Canadian election · social media · partisanship · synthetic media · image detection · misinformation · X platform

The pith

Deepfakes appeared in 5.86 percent of election-related images during the 2025 Canadian federal election, shared more often by right-leaning accounts but reaching few viewers overall.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper measures how often AI-generated deepfake images appear and spread in online discussions of the 2025 Canadian federal election. The authors analyzed 187,778 posts from X, Bluesky, and Reddit using a detection framework trained on modern generative models. They report that deepfakes accounted for 5.86 percent of election images, with right-leaning accounts posting them at 8.66 percent versus 4.42 percent for left-leaning accounts, frequently for defamatory or conspiratorial purposes. Most deepfakes turned out to be benign or non-political, and the harmful ones received only 0.12 percent of total views on X. The work shows that deepfakes have entered election talk but currently command limited reach.

Core claim

By analyzing 187,778 posts from X, Bluesky, and Reddit with a high-accuracy detection framework trained on a diverse set of modern generative models, we find that 5.86% of election-related images were deepfakes. Right-leaning accounts shared them more frequently, with 8.66% of their posted images flagged compared to 4.42% for left-leaning users, often with defamatory or conspiratorial intent. Yet, most detected deepfakes were benign or non-political, and harmful ones drew little attention, accounting for only 0.12% of all views on X. Overall, deepfakes were present in the election conversation, but their reach was modest, and realistic fabricated images, although less common, drew higher engagement.

What carries the argument

A high-accuracy detection framework trained on a diverse set of modern generative models, used to classify election-related images from social media posts as deepfakes or authentic.

Load-bearing premise

The detection framework correctly identifies deepfakes at scale with few errors, and the partisan labeling of accounts is accurate without major misclassification.

What would settle it

Manual review of a representative sample of flagged and unflagged images that yields a deepfake rate significantly different from 5.86 percent, or reclassification of account partisanship that changes the 8.66 percent versus 4.42 percent sharing gap.

Figures

Figures reproduced from arXiv: 2512.13915 by Andreea Musulan, Jean-François Godbout, Reihaneh Rabbany, Victor Livernoche, Zachary Yang.

Figure 1. Examples of the seven intent categories used in our analysis, illustrating the range of political and non-political uses of AI-generated imagery, from defamatory and conspiratorial content to benign and artistic creations. Outside explicit political contexts, synthetic imagery has become commonplace in online spaces optimized for engagement. DiResta and Goldstein [3] describe Facebook pages that mass-prod…
Figure 2. Distribution of deepfake intents by political lean
Figure 3. Prevalence of deepfake intents across platforms.
Figure 4. View counts of intents. Each curve shows the ECDF:
Figure 5. View counts by political leaning and deepfakes
Figure 6. Number of deepfakes vs. author follower count
Original abstract

Concerns about AI-generated political content are growing, yet there is limited empirical evidence on how deepfakes actually appear and circulate across social platforms during major events in democratic countries. In this study, we present one of the first in-depth analyses of how these realistic synthetic media shape the political landscape online, focusing specifically on the 2025 Canadian federal election. By analyzing 187,778 posts from X, Bluesky, and Reddit with a high-accuracy detection framework trained on a diverse set of modern generative models, we find that 5.86% of election-related images were deepfakes. Right-leaning accounts shared them more frequently, with 8.66% of their posted images flagged compared to 4.42% for left-leaning users, often with defamatory or conspiratorial intent. Yet, most detected deepfakes were benign or non-political, and harmful ones drew little attention, accounting for only 0.12% of all views on X. Overall, deepfakes were present in the election conversation, but their reach was modest, and realistic fabricated images, although less common, drew higher engagement, highlighting growing concerns about their potential misuse.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper analyzes 187,778 election-related posts containing images from X, Bluesky, and Reddit during the 2025 Canadian federal election. Using a deepfake detection model trained on modern generative models, it reports that 5.86% of images were classified as deepfakes, with right-leaning accounts showing a higher rate (8.66%) than left-leaning ones (4.42%). Most detected deepfakes were benign or non-political, harmful content had minimal reach (0.12% of views on X), and realistic fakes drew higher engagement.

Significance. If the detection framework proves reliable, the study supplies one of the first large-scale empirical measurements of deepfake prevalence and circulation in a real democratic election. The direct measurement approach (no fitted models or self-referential definitions) and the finding of modest overall reach despite partisan differences would be a useful data point for platform policy and misinformation research.

major comments (2)
  1. [Methods / Detection Framework] The central prevalence claim (5.86% deepfakes) and the partisan split (8.66% vs 4.42%) rest entirely on the binary outputs of the detection framework. The abstract and methods description label the model 'high-accuracy' and trained on diverse generators, yet no precision, recall, false-positive rate, or human-annotated validation metrics are reported on the actual 187,778 election images (or on any stratified subset by platform, compression, or image quality). Without these, both the headline rate and the right-leaning excess cannot be assessed for systematic bias.
  2. [Methods / Account Classification] Partisan labeling of accounts is used to compute the 8.66%/4.42% split and to interpret intent (defamatory or conspiratorial). No details are given on the labeling procedure, inter-rater reliability, or error rate for this classification step, which directly affects the partisanship findings.
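The first major comment can be made concrete with a small sketch. If the detector's sensitivity and specificity on election images were known, the observed flag rate could be corrected with the Rogan-Gladen estimator. The 5.86% figure comes from the paper; the sensitivity and specificity values below are hypothetical placeholders, since the paper reports none, and the function name `rogan_gladen` is ours.

```python
def rogan_gladen(apparent_rate: float, sensitivity: float, specificity: float) -> float:
    """Correct an observed positive rate for known classifier error rates.

    Rogan-Gladen estimator:
        true_rate = (apparent_rate + specificity - 1) / (sensitivity + specificity - 1)
    The result is clipped to [0, 1].
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("detector must beat chance: sensitivity + specificity > 1")
    corrected = (apparent_rate + specificity - 1.0) / denom
    return min(1.0, max(0.0, corrected))


# Observed deepfake rate reported in the paper.
observed = 0.0586

# HYPOTHETICAL detector error rates, for illustration only.
scenarios = [(0.95, 0.99), (0.95, 0.97), (0.90, 0.95)]

for sens, spec in scenarios:
    corrected = rogan_gladen(observed, sens, spec)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} -> corrected rate={corrected:.4f}")
```

Under these assumed values, even a few percentage points of false positives would account for a large share of a 5.86% observed rate, which is why the referee asks for error rates measured on the election corpus itself.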
minor comments (2)
  1. [Abstract] The abstract states clear percentages but supplies no error bars, confidence intervals, or sample-size context for the 5.86% figure.
  2. [Results] The claim that 'most detected deepfakes were benign' and the 0.12% view share for harmful content are presented without the underlying counts or definitions of 'benign' vs 'harmful' used in the annotation.
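As an illustration of the error bars requested in the first minor comment, the following sketch computes a Wilson score interval for a proportion. The counts are hypothetical (59 deepfakes in an audited sample of 1,000 images), because the paper's abstract does not state the number of images behind the 5.86% figure; `wilson_interval` is our own helper, not something taken from the paper.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie in [0, n]")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# HYPOTHETICAL audit: 59 of 1,000 manually reviewed images judged to be deepfakes.
low, high = wilson_interval(59, 1000)
print(f"95% Wilson interval: [{low:.4f}, {high:.4f}]")
```

The same helper could be applied separately to right-leaning and left-leaning accounts once the underlying counts are available, which would show whether the 8.66% and 4.42% intervals overlap.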

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback on our manuscript. We have reviewed the major comments carefully and provide point-by-point responses below, indicating where revisions will be made to strengthen the methods section.

Point-by-point responses
  1. Referee: [Methods / Detection Framework] The central prevalence claim (5.86% deepfakes) and the partisan split (8.66% vs 4.42%) rest entirely on the binary outputs of the detection framework. The abstract and methods description label the model 'high-accuracy' and trained on diverse generators, yet no precision, recall, false-positive rate, or human-annotated validation metrics are reported on the actual 187,778 election images (or on any stratified subset by platform, compression, or image quality). Without these, both the headline rate and the right-leaning excess cannot be assessed for systematic bias.

    Authors: We agree that explicit performance metrics on the election images themselves are essential for evaluating potential biases from platform-specific factors such as compression or image quality. The detection model was trained and internally validated on a diverse collection of modern generative models, but the current manuscript does not report precision, recall, or false-positive rates on the 187,778-image corpus or on stratified subsets. In the revised version we will add these metrics, obtained via human annotation of a representative sample stratified by platform and image characteristics, to allow readers to assess the reliability of the prevalence and partisanship results.

    Revision: yes

  2. Referee: [Methods / Account Classification] Partisan labeling of accounts is used to compute the 8.66%/4.42% split and to interpret intent (defamatory or conspiratorial). No details are given on the labeling procedure, inter-rater reliability, or error rate for this classification step, which directly affects the partisanship findings.

    Authors: We acknowledge that the current manuscript provides insufficient detail on the partisan labeling procedure. The accounts were classified through a multi-coder review of profiles, bios, and posting histories using predefined criteria. In the revision we will expand the methods section to fully describe the labeling protocol, the number of coders, inter-rater reliability (e.g., Cohen’s kappa), and any estimated error rates or limitations. This will improve transparency and allow better evaluation of the reported partisan differences.

    Revision: yes
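For readers unfamiliar with the inter-rater statistic named in the second response, here is a minimal sketch of Cohen's kappa for two coders. The label set ("left", "right", "center") and the example annotations are hypothetical and are not drawn from the paper; `cohens_kappa` is our own helper.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: agreement between two coders, corrected for chance."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each coder's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a.keys() | freq_b.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label for every item
    return (observed - expected) / (1.0 - expected)


# HYPOTHETICAL annotations of eight accounts by two coders.
coder_a = ["left", "left", "right", "right", "center", "left", "right", "center"]
coder_b = ["left", "right", "right", "right", "center", "left", "left", "center"]
print(f"kappa = {cohens_kappa(coder_a, coder_b):.3f}")
```

With more than two coders, a multi-rater statistic such as Fleiss' kappa would be needed instead; the response does not say how many coders were involved.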

Circularity Check

0 steps flagged

No circularity: empirical measurements from applied classifier on collected posts

Full rationale

The paper reports direct observational statistics (5.86% deepfake rate, partisan splits, engagement figures) obtained by applying a pre-trained detection framework to a fixed corpus of 187,778 posts. No equations, first-principles derivations, or predictions are present that reduce by construction to fitted parameters, self-definitions, or self-citation chains. The central claims rest on measurement and classification of external data rather than any loop where outputs are renamed as inputs or where a uniqueness theorem is imported from the authors' prior work to force the result. Absence of reported validation metrics on the target images is a potential correctness concern but does not constitute circularity under the specified patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Empirical measurement study; no free parameters, axioms, or invented entities are introduced or required by the central claims in the abstract.

pith-pipeline@v0.9.0 · 5535 in / 1213 out tokens · 36908 ms · 2026-05-16T21:30:42.801219+00:00 · methodology


Reference graph

Works this paper leans on

11 extracted references · 11 canonical work pages

  1. [1]

    Herbert Chang, Benjamin Shaman, Yung-Chun Chen, Mingyue Zha, Sean Noh, Chiyu Wei, Tracy Weener, and Maya Magee. 2024. Generative Memesis: AI Mediates Political Information in the 2024 United States Presidential Election. Available at SSRN (2024)

  2. [2]

    Zhiyi Chen, Jinyi Ye, Beverlyn Tsai, Emilio Ferrara, and Luca Luceri. 2025. Synthetic politics: Prevalence, spreaders, and emotional reception of AI-generated political images on X. In Proceedings of the 36th ACM Conference on Hypertext and Social Media. 11–21

  3. [3]

    Renee DiResta and Josh A. Goldstein. 2024. How Spammers and Scammers Leverage AI-Generated Images on Facebook for Audience Growth. arXiv:2403.12838 [cs.CY] https://arxiv.org/abs/2403.12838

  4. [5]

    Chiara Drolsbach and Nicolas Pröllochs. 2025. Characterizing AI-Generated Misinformation on Social Media. arXiv:2505.10266 [cs.SI]

  5. [6]

    Victor Livernoche, Akshatha Arodi, Andreea Musulan, Zachary Yang, Adam Salvail, Gaétan Marceau Caron, Jean-François Godbout, and Reihaneh Rabbany

  6. [7]

    OpenFake: An Open Dataset and Platform Toward Real-World Deepfake Detection. arXiv:2509.09495 [cs.CV] https://arxiv.org/abs/2509.09495

  7. [8]

    Hana Matatov, Marianne Aubin Le Quéré, Ofra Amir, and Mor Naaman. 2025. Examining the Prevalence and Dynamics of AI-Generated Media in Art Subreddits. arXiv:2410.07302 [cs.AI] https://arxiv.org/abs/2410.07302

  8. [9]

    Jonas Ricker, Dennis Assenmacher, Thorsten Holz, Asja Fischer, and Erwin Quiring. 2024. AI-generated faces in the real world: A large-scale case study of Twitter profile images. In Proceedings of the 27th International Symposium on Research in Attacks, Intrusions and Defenses. 513–530

  9. [10]

    Zhiyuan Yan, Taiping Yao, Shen Chen, Yandan Zhao, Xinghe Fu, Junwei Zhu, Donghao Luo, Chengjie Wang, Shouhong Ding, Yunsheng Wu, et al. 2024. Df40: Toward next-generation deepfake detection. Advances in Neural Information Processing Systems 37 (2024), 29387–29434

  10. [11]

    Kaicheng Yang, Danishjeet Singh, and Filippo Menczer. 2024. Characteristics and Prevalence of Fake Social Media Profiles with AI-generated Faces. Journal of Online Trust and Safety 2, 4 (Sept. 2024). doi:10.54501/jots.v2i4.197

  11. [12]

    Mingjian Zhu, Hanting Chen, Qiangyu Yan, Xudong Huang, Guanyu Lin, Wei Li, Zhijun Tu, Hailin Hu, Jie Hu, and Yunhe Wang. 2023. GenImage: A Million-Scale Benchmark for Detecting AI-Generated Image. arXiv:2306.08571 [cs.CV]