pith. machine review for the scientific record.

arxiv: 2601.01041 · v3 · submitted 2026-01-03 · 💻 cs.CV · cs.MM

Recognition: no theorem link

Generalizable Deepfake Detection Based on Forgery-aware Layer Masking and Multi-artifact Subspace Decomposition

Authors on Pith: no claims yet

Pith reviewed 2026-05-16 18:29 UTC · model grok-4.3

classification 💻 cs.CV cs.MM
keywords deepfake detection · forgery-aware layer masking · multi-artifact subspace decomposition · generalization · pretrained models · singular value decomposition · forgery artifacts · cross-dataset

The pith

The FMSD framework identifies forgery-sensitive layers via gradient bias-variance analysis and decomposes their weights into one semantic subspace plus multiple artifact subspaces to detect varied deepfakes while keeping pretrained semantic representations intact.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Deepfake detectors often lose general image understanding when retrained on specific forgery cues. This paper proposes FMSD, which first measures bias and variance in each layer's gradients to select only forgery-sensitive layers for updating. It then applies singular value decomposition to split the weights of those layers into a preserved semantic subspace and several learnable artifact subspaces. Orthogonality and spectral consistency constraints limit overlap among the artifact subspaces. The result is a model that captures many different forgery patterns without overhauling the original pretrained representations.

Core claim

The framework first evaluates the bias-variance characteristics of layer-wise gradients to identify forgery-sensitive layers for selective updating, then decomposes the selected layer weights via singular value decomposition into a semantic subspace and multiple learnable artifact subspaces regularized by orthogonality and spectral consistency constraints. In this way it models heterogeneous, complementary forgery artifacts while preserving pretrained semantic representations.
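
In symbols, one plausible reading of that decomposition and its constraints (a sketch; the exact rank split, number of subspaces, and constraint forms are not stated in the abstract):

    W' = \underbrace{U_{1:r}\,\Sigma_{1:r}\,V_{1:r}^{\top}}_{\text{semantic subspace, frozen}} \;+\; \sum_{k=1}^{K} \Delta W_k,
    \qquad
    \mathcal{L}_{\mathrm{orth}} = \sum_{j \neq k} \bigl\lVert \Delta W_j^{\top} \Delta W_k \bigr\rVert_F^{2},
    \qquad
    \mathcal{L}_{\mathrm{spec}} = \sum_{i} \bigl( \sigma_i(W') - \sigma_i(W) \bigr)^{2}

where W is the pretrained weight with SVD U Σ V^T, the ΔW_k are the learnable artifact subspaces, and σ_i(·) denotes the i-th singular value.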

What carries the argument

Forgery-aware Layer Masking, which selects layers based on gradient bias-variance for targeted updates, paired with Multi-Artifact Subspace Decomposition that uses SVD to isolate semantic and multiple artifact components in the layer weights.
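
A minimal PyTorch sketch of how such a split could be wired up for one linear layer (class and argument names such as semantic_rank and artifact_rank are illustrative assumptions, not the paper's notation; the paper may seed the artifact factors from the residual singular components rather than from noise):

    import torch
    import torch.nn as nn

    class SubspaceDecomposedLinear(nn.Module):
        """SVD-split layer: a frozen semantic part built from the top singular
        components of the pretrained weight, plus several small learnable
        low-rank artifact subspaces."""

        def __init__(self, pretrained_weight, semantic_rank, num_artifacts=3, artifact_rank=4):
            super().__init__()
            U, S, Vh = torch.linalg.svd(pretrained_weight, full_matrices=False)
            # Semantic subspace: top-r singular components, stored as a frozen buffer.
            w_sem = U[:, :semantic_rank] @ torch.diag(S[:semantic_rank]) @ Vh[:semantic_rank, :]
            self.register_buffer("weight_semantic", w_sem)
            out_dim, in_dim = pretrained_weight.shape
            # Artifact subspaces: learnable low-rank updates, initialised near zero
            # so training starts from the (truncated) pretrained behaviour.
            self.artifact_down = nn.ParameterList(
                [nn.Parameter(0.01 * torch.randn(artifact_rank, in_dim)) for _ in range(num_artifacts)])
            self.artifact_up = nn.ParameterList(
                [nn.Parameter(torch.zeros(out_dim, artifact_rank)) for _ in range(num_artifacts)])

        def effective_weight(self):
            w = self.weight_semantic
            for up, down in zip(self.artifact_up, self.artifact_down):
                w = w + up @ down
            return w

        def forward(self, x):
            return x @ self.effective_weight().t()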

If this is right

  • Selective updates limited to forgery-sensitive layers reduce unwanted drift in pretrained semantic features.
  • Multiple SVD-derived artifact subspaces together capture complementary and heterogeneous forgery patterns.
  • Orthogonality and spectral consistency constraints keep artifact subspaces from becoming redundant while protecting the original weight spectrum.
  • Improved handling of diverse forgery patterns produces stronger cross-dataset generalization than full-parameter fine-tuning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The selective masking step could transfer to other pretrained-model adaptation problems where forgetting general knowledge must be avoided.
  • New deepfake generation techniques could be used to test whether the learned artifact subspaces remain useful without further retraining.
  • Similar SVD-based splitting of weights might be applied in related computer-vision tasks such as anomaly detection or domain-shift robustness.

Load-bearing premise

The bias-variance traits of layer-wise gradients can reliably flag forgery-sensitive layers so that selective updates capture artifacts without eroding the pretrained model's semantic representations.
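
A hedged sketch of the kind of statistic this premise leans on, computed from per-layer gradients over a handful of mini-batches; the scoring rule (a bias-to-variance ratio here) and the selection threshold are assumptions, not the paper's exact recipe:

    import torch

    def layer_gradient_stats(model, loss_fn, batches):
        """Per-layer gradient statistics: squared mean ('bias') vs. variance
        across mini-batches, summarised as one sensitivity score per parameter."""
        sums, sq_sums, n = {}, {}, 0
        for x, y in batches:
            model.zero_grad()
            loss_fn(model(x), y).backward()
            n += 1
            for name, p in model.named_parameters():
                if p.grad is None:
                    continue
                g = p.grad.detach().flatten()
                sums[name] = sums.get(name, 0.0) + g
                sq_sums[name] = sq_sums.get(name, 0.0) + g * g
        scores = {}
        for name in sums:
            mean = sums[name] / n
            var = sq_sums[name] / n - mean * mean
            # One plausible sensitivity score: bias-to-variance ratio, averaged over weights.
            scores[name] = (mean.pow(2).mean() / (var.mean().clamp_min(0) + 1e-12)).item()
        return scores

    # Layers with the highest scores would be the candidates for selective updating.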

What would settle it

Run the method on several standard cross-dataset deepfake benchmarks and check whether it exceeds full fine-tuning and auxiliary-supervision baselines on both per-dataset accuracy (e.g. frame-level AUC) and the gap between the training set and unseen test sets; failing to do so would undercut the central claim.
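
A minimal sketch of that protocol (dataset names and the detector interface are placeholders): train on one source set, report AUC on each unseen target, and run the same evaluation for a full fine-tuning baseline.

    from sklearn.metrics import roc_auc_score

    def cross_dataset_auc(detector, loaders):
        """AUC of a trained detector on several held-out datasets.
        `detector(images)` is assumed to return fake probabilities; each loader
        yields (images, labels) batches with labels in {0: real, 1: fake}."""
        results = {}
        for name, loader in loaders.items():
            scores, labels = [], []
            for images, y in loader:
                scores.extend(detector(images).tolist())
                labels.extend(y.tolist())
            results[name] = roc_auc_score(labels, scores)
        return results

    # e.g. train on FaceForensics++ only, then compare
    #   cross_dataset_auc(fmsd_model, {"Celeb-DF": cdf_loader, "DFDC": dfdc_loader})
    # against the same call for the fully fine-tuned baseline; the claim rests on
    # both the unseen-set AUCs and the seen-vs-unseen gap.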

Figures

Figures reproduced from arXiv: 2601.01041 by Beijing Chen, Daoyong Fu, Wenliang Weng, Xiang Zhang, Zhangjie Fu, Ziqiang Li, Ziwen He.

Figure 1. A simple illustrative example is provided to intuitively demonstrate […]
Figure 2. Overall overview of the proposed MASM framework. (a) […]
Figure 3. Robustness evaluation results. We compare our method with FA […]
Original abstract

Deepfake detection remains highly challenging, particularly in cross-dataset scenarios and complex real-world settings. This challenge mainly arises because artifact patterns vary substantially across different forgery methods, whereas adapting pretrained models to such artifacts often overemphasizes forgery-specific cues and disturbs semantic representations, thereby weakening generalization. Existing approaches typically rely on full-parameter fine-tuning or auxiliary supervision to improve discrimination. However, they often struggle to model diverse forgery artifacts without compromising pretrained representations. To address these limitations, we propose FMSD, a deepfake detection framework built upon Forgery-aware Layer Masking and Multi-Artifact Subspace Decomposition. Specifically, Forgery-aware Layer Masking evaluates the bias-variance characteristics of layer-wise gradients to identify forgery-sensitive layers, thereby selectively updating them while reducing unnecessary disturbance to pretrained representations. Building upon this, Multi-Artifact Subspace Decomposition further decomposes the selected layer weights via Singular Value Decomposition (SVD) into a semantic subspace and multiple learnable artifact subspaces. These subspaces are optimized to capture heterogeneous and complementary forgery artifacts, enabling effective modeling of diverse forgery patterns while preserving pretrained semantic representations. Furthermore, orthogonality and spectral consistency constraints are imposed to regularize the artifact subspaces, reducing redundancy across them while preserving the overall spectral structure of pretrained weights.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript proposes FMSD, a deepfake detection framework consisting of Forgery-aware Layer Masking, which uses layer-wise gradient bias-variance to selectively update forgery-sensitive layers, and Multi-Artifact Subspace Decomposition, which applies SVD to decompose weights into semantic and multiple artifact subspaces with orthogonality and spectral consistency constraints to model diverse forgery patterns while preserving pretrained semantics.

Significance. If validated, the approach could offer a principled way to adapt pretrained models for deepfake detection with improved cross-dataset generalization by avoiding full fine-tuning and explicitly separating artifact subspaces. The use of gradient statistics for layer selection and SVD-based decomposition represents a novel combination that addresses a key limitation in existing methods.

major comments (3)
  1. The central claim that the framework improves generalization without disturbing semantics is unsupported by any experimental results, ablation studies, quantitative metrics, or comparisons to baselines in the manuscript, leaving the effectiveness of Forgery-aware Layer Masking and Multi-Artifact Subspace Decomposition unverified.
  2. Forgery-aware Layer Masking: the assumption that bias-variance characteristics of layer-wise gradients reliably isolate forgery-sensitive layers (without including semantic representations) lacks justification for cross-generator stability; gradient statistics are sensitive to batch composition and forgery distribution, so the same layers may not be selected when swapping from FaceForensics++ to diffusion-based generators, directly undermining the preservation claim.
  3. Multi-Artifact Subspace Decomposition: the SVD decomposition into semantic and artifact subspaces with orthogonality/spectral constraints is described at a high level but provides no derivation showing how these constraints guarantee preservation of pretrained spectral structure or reduce redundancy in a manner that supports the generalization benefit.
minor comments (1)
  1. Abstract: the phrase 'complex real-world settings' is used without concrete examples of the settings or how the framework specifically addresses them beyond the general mechanisms.

Simulated Author's Rebuttal

3 responses · 0 unresolved

Thank you for your thorough review and constructive suggestions. We have carefully considered each comment and will revise the manuscript accordingly to address the concerns raised.

Point-by-point responses
  1. Referee: The central claim that the framework improves generalization without disturbing semantics is unsupported by any experimental results, ablation studies, quantitative metrics, or comparisons to baselines in the manuscript, leaving the effectiveness of Forgery-aware Layer Masking and Multi-Artifact Subspace Decomposition unverified.

    Authors: We thank the referee for pointing this out. Upon review, we realize that the experimental section in the current draft focuses more on the method description than on explicit ablations and cross-dataset comparisons. We will expand the experiments section to include detailed ablation studies on each component, quantitative results on multiple datasets (e.g., cross-testing from FaceForensics++ to Celeb-DF and DFDC), and comparisons with state-of-the-art baselines, with metrics such as AUC and accuracy to substantiate the generalization claims without semantic disturbance. revision: yes

  2. Referee: Forgery-aware Layer Masking: the assumption that bias-variance characteristics of layer-wise gradients reliably isolate forgery-sensitive layers (without including semantic representations) lacks justification for cross-generator stability; gradient statistics are sensitive to batch composition and forgery distribution, so the same layers may not be selected when swapping from FaceForensics++ to diffusion-based generators, directly undermining the preservation claim.

    Authors: We agree that the stability across generators needs further validation. The bias and variance are calculated using gradients from a diverse mini-batch sampled from the training set, which mitigates this sensitivity. In the revision, we will add an analysis section with visualizations of selected layers across different forgery types, including diffusion models, and report the overlap in selected layers to demonstrate consistency (see the overlap sketch after these responses). revision: yes

  3. Referee: Multi-Artifact Subspace Decomposition: the SVD decomposition into semantic and artifact subspaces with orthogonality/spectral constraints is described at a high level but provides no derivation showing how these constraints guarantee preservation of pretrained spectral structure or reduce redundancy in a manner that supports the generalization benefit.

    Authors: We appreciate this observation. The constraints are designed such that the semantic subspace retains the top singular values corresponding to the pretrained model, while artifact subspaces capture the residual with orthogonality enforced via Gram-Schmidt orthogonalization during optimization. We will include a mathematical derivation in the supplementary material or main text appendix, explaining the preservation via the properties of SVD and how the spectral consistency loss maintains the Frobenius norm of the original weights. revision: yes
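
Two of the promised revisions lend themselves to compact sketches. For response 2, the overlap analysis could be as simple as a pairwise Jaccard index over the layer sets selected per forgery source; for response 3, the regularizers can be read as a cross-subspace orthogonality penalty plus a spectral-consistency term that ties the adapted spectrum (and hence the Frobenius norm) to the pretrained one. Both are assumptions about form: the rebuttal names Gram-Schmidt projection as the actual enforcement mechanism, and the paper's losses and weights may differ.

    import torch

    def layer_selection_overlap(selected_by_source):
        """Pairwise Jaccard overlap between the layer sets selected per forgery source."""
        names = list(selected_by_source)
        overlaps = {}
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                sa, sb = set(selected_by_source[a]), set(selected_by_source[b])
                overlaps[(a, b)] = len(sa & sb) / max(len(sa | sb), 1)
        return overlaps

    def orthogonality_loss(artifact_updates):
        """Penalize overlap between artifact subspaces via their low-rank weight updates."""
        loss = torch.zeros(())
        for i, wi in enumerate(artifact_updates):
            for wj in artifact_updates[i + 1:]:
                loss = loss + (wi.t() @ wj).pow(2).sum()
        return loss

    def spectral_consistency_loss(pretrained_weight, adapted_weight):
        """Keep the adapted weight's singular spectrum close to the pretrained one."""
        s_old = torch.linalg.svdvals(pretrained_weight)
        s_new = torch.linalg.svdvals(adapted_weight)
        return (s_new - s_old).pow(2).sum()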

Circularity Check

0 steps flagged

No circularity: framework applies standard gradient statistics and SVD without reducing claims to fitted inputs or self-referential definitions

full rationale

The paper defines FMSD via Forgery-aware Layer Masking (bias-variance of layer-wise gradients to select layers) followed by SVD-based decomposition into semantic and artifact subspaces plus orthogonality/spectral constraints. These steps use off-the-shelf operations on pretrained weights and do not contain equations that equate the claimed generalization benefit to a quantity defined by the method's own fitted parameters or prior self-citations. No load-bearing premise collapses to a self-citation chain, ansatz smuggled via citation, or renaming of known results. The derivation remains a methodological construction whose performance claims are left open to external validation rather than being tautological by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract introduces no explicit free parameters, mathematical axioms, or new postulated entities; the approach relies on standard linear algebra operations and gradient statistics without additional invented constructs.

pith-pipeline@v0.9.0 · 5543 in / 1189 out tokens · 50062 ms · 2026-05-16T18:29:09.484061+00:00 · methodology

