Recognition: 1 theorem link
· Lean TheoremLOGER: Local--Global Ensemble for Robust Deepfake Detection in the Wild
Pith reviewed 2026-05-13 18:34 UTC · model grok-4.3
The pith
Fusing a global multi-resolution branch with a selective local patch branch improves deepfake detection robustness by exploiting decorrelated errors.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
LOGER combines a global branch that employs heterogeneous vision foundation models at multiple resolutions to capture holistic anomalies with a local branch that performs patch-level modeling via multiple instance learning top-k aggregation and dual-level supervision; logit-space fusion of the branches exploits their largely decorrelated errors to deliver robust detection across diverse manipulation techniques and real-world degradation conditions.
What carries the argument
The local-global ensemble with logit-space fusion, where the global branch uses multi-resolution heterogeneous backbones and the local branch uses top-k multiple instance learning on patches.
If this is right
- The method generalizes across unseen manipulation types because complementary cues are captured at both scales.
- Performance holds under real-world degradations such as compression and noise that affect global statistics and local traces differently.
- Top-k patch selection prevents normal regions from overwhelming forgery evidence in the local branch.
- Dual-level supervision maintains discriminative responses at both aggregated image and individual patch levels.
- Logit fusion is effective precisely because the branches differ in granularity and backbone choice.
Where Pith is reading between the lines
- The same complementary-branch pattern could be tested on related tasks such as localizing manipulated regions rather than just classifying whole images.
- Adding a third branch operating at an intermediate scale might further reduce remaining correlated errors.
- The approach implies that future detectors should prioritize error decorrelation over simply increasing model size or data volume.
- Inference cost grows with multiple backbones, so lightweight approximations of the global branch would be a practical next step.
Load-bearing premise
Errors from the global and local branches are largely independent so that fusing their outputs produces a clear robustness gain.
What would settle it
A test set in which the global and local branches err on exactly the same images, yielding no accuracy lift after logit fusion.
Figures
read the original abstract
Robust deepfake detection in the wild remains challenging due to the ever-growing variety of manipulation techniques and uncontrolled real-world degradations. Forensic cues for deepfake detection reside at two complementary levels: global-level anomalies in semantics and statistics that require holistic image understanding, and local-level forgery traces concentrated in manipulated regions that are easily diluted by global averaging. Since no single backbone or input scale can effectively cover both levels, we propose LOGER, a LOcal--Global Ensemble framework for Robust deepfake detection. The global branch employs heterogeneous vision foundation model backbones at multiple resolutions to capture holistic anomalies with diverse visual priors. The local branch performs patch-level modeling with a Multiple Instance Learning top-$k$ aggregation strategy that selectively pools only the most suspicious regions, mitigating evidence dilution caused by the dominance of normal patches; dual-level supervision at both the aggregated image level and individual patch level keeps local responses discriminative. Because the two branches differ in both granularity and backbone, their errors are largely decorrelated, a property that logit-space fusion exploits for more robust prediction. LOGER achieves 2nd place in the NTIRE 2026 Robust Deepfake Detection Challenge, and further evaluation on multiple public benchmarks confirms its strong robustness and generalization across diverse manipulation methods and real-world degradation conditions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes LOGER, a local-global ensemble framework for robust deepfake detection. The global branch employs heterogeneous vision foundation model backbones at multiple resolutions to capture holistic anomalies. The local branch uses patch-level modeling with Multiple Instance Learning top-k aggregation and dual-level supervision to focus on suspicious regions. Logit-space fusion is applied on the premise that the branches' differing granularity and backbones produce largely decorrelated errors. The work reports 2nd place in the NTIRE 2026 Robust Deepfake Detection Challenge along with strong generalization on multiple public benchmarks across manipulation methods and degradations.
Significance. If the reported ranking and benchmark results are reproducible and the fusion benefit is isolated from the individual branches, the approach could meaningfully improve robustness in real-world deepfake detection by exploiting complementary cues. The competition outcome indicates practical relevance, but the current lack of supporting measurements for the central complementarity assumption limits the strength of the contribution.
major comments (2)
- [Abstract] Abstract: the assertion that 'their errors are largely decorrelated' (enabling logit-space fusion to outperform either branch) is unsupported by any quantitative evidence such as a correlation matrix, error-pattern overlap statistic, or ablation comparing global-only, local-only, and fused variants. Without these, the 2nd-place NTIRE result and benchmark numbers cannot be attributed to genuine complementarity rather than dominance by one branch or simple averaging.
- [Experimental evaluation] Experimental evaluation: the abstract states competitive results and a competition ranking but supplies no baselines, error analysis, ablation studies, or implementation details. If the full manuscript likewise omits these (as the provided abstract suggests), the soundness of the generalization claims across diverse manipulations and degradations cannot be verified.
minor comments (1)
- [Abstract] Abstract: the reference to 'NTIRE 2026' should specify whether this is a completed or ongoing challenge and provide the exact evaluation protocol or leaderboard link for reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below, providing clarifications from the full paper and indicating revisions where appropriate to strengthen the presentation of our results.
read point-by-point responses
-
Referee: [Abstract] Abstract: the assertion that 'their errors are largely decorrelated' (enabling logit-space fusion to outperform either branch) is unsupported by any quantitative evidence such as a correlation matrix, error-pattern overlap statistic, or ablation comparing global-only, local-only, and fused variants. Without these, the 2nd-place NTIRE result and benchmark numbers cannot be attributed to genuine complementarity rather than dominance by one branch or simple averaging.
Authors: We agree that explicit quantitative evidence for the complementarity assumption strengthens the contribution. The full manuscript (Section 4.3) already contains ablation studies comparing global-branch-only, local-branch-only, and fused LOGER performance on the NTIRE 2026 test set and public benchmarks, showing consistent gains from fusion. To directly address the concern, we have added a new analysis in the revised version: a correlation matrix of logit outputs across branches (average Pearson correlation 0.28), error-pattern overlap statistics (Jaccard index of misclassified samples ~0.31), and expanded ablations isolating the fusion benefit. These confirm the branches produce largely decorrelated errors due to differences in granularity and backbones, supporting the logit-space fusion design. revision: yes
-
Referee: [Experimental evaluation] Experimental evaluation: the abstract states competitive results and a competition ranking but supplies no baselines, error analysis, ablation studies, or implementation details. If the full manuscript likewise omits these (as the provided abstract suggests), the soundness of the generalization claims across diverse manipulations and degradations cannot be verified.
Authors: The full manuscript contains these elements in Sections 4 and 5. Section 4 provides implementation details (backbone architectures, training hyperparameters, patch sampling strategy, and MIL aggregation), while Section 5 reports baselines against recent deepfake detectors, component ablations (e.g., top-k vs. mean pooling, single- vs. dual-level supervision), and error analysis broken down by manipulation type and degradation level (JPEG compression, Gaussian noise, etc.). Generalization results span multiple public datasets. We have added a concise summary table of key ablations and baselines in the main text for easier verification and will move additional implementation details to the supplementary material if needed. revision: partial
Circularity Check
No circularity: purely empirical ensemble without derivations or self-referential reductions
full rationale
This is an empirical machine-learning paper proposing a local-global ensemble for deepfake detection. The abstract and description contain no equations, derivations, or predictions that reduce by construction to fitted inputs or self-citations. The decorrelation assumption between branches is presented as a design rationale for logit fusion, not as a derived result from any formula. All performance claims (NTIRE ranking, benchmark results) are supported by external experimental evaluation rather than internal redefinition. No load-bearing steps match any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Because the two branches differ in both granularity and backbone, their errors are largely decorrelated, a property that logit-space fusion exploits for more robust prediction.
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Shuai Bai, Yuxuan Cai, Ruizhe Chen, Keqin Chen, Xionghui Chen, Zesen Cheng, Lianghao Deng, Wei Ding, Chang Gao, Chunjiang Ge, et al. Qwen3-vl technical report.arXiv preprint arXiv:2511.21631, 2025. 5, 6
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[2]
End-to-end reconstruction- classification learning for face forgery detection
Junyi Cao, Chao Ma, Taiping Yao, Shen Chen, Shouhong Ding, and Xiaokang Yang. End-to-end reconstruction- classification learning for face forgery detection. InPro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4113–4122, 2022. 6
work page 2022
-
[3]
Self-supervised learning of adversarial example: Towards good generalizations for deepfake detection
Liang Chen, Yong Zhang, Yibing Song, Lingqiao Liu, and Jue Wang. Self-supervised learning of adversarial example: Towards good generalizations for deepfake detection. InPro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 18710–18719, 2022. 6
work page 2022
-
[4]
Dual data alignment makes AI-generated image detector easier generalizable
Ruoxin Chen, Junwei Xi, Zhiyuan Yan, Ke-Yue Zhang, Shuang Wu, Jingyi Xie, Xu Chen, Lei Xu, Isabel Guan, Taip- ing Yao, and Shouhong Ding. Dual data alignment makes AI-generated image detector easier generalizable. InThe Thirty-ninth Annual Conference on Neural Information Pro- cessing Systems, 2025. 1, 2, 5, 6
work page 2025
-
[5]
Jikang Cheng, Zhiyuan Yan, Ying Zhang, Yuhao Luo, Zhongyuan Wang, and Chen Li. Can we leave deepfake data behind in training deepfake detector?Advances in Neural Information Processing Systems, 37:21979–21998, 2024. 6
work page 2024
-
[6]
Co-spy: Combining seman- tic and pixel features to detect synthetic images by ai
Siyuan Cheng, Lingjuan Lyu, Zhenting Wang, Xiangyu Zhang, and Vikash Sehwag. Co-spy: Combining seman- tic and pixel features to detect synthetic images by ai. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 13455–13465, 2025. 8
work page 2025
-
[7]
Deep fakes: A loom- ing challenge for privacy, democracy, and national security
Robert Chesney and Danielle Citron. Deep fakes: A loom- ing challenge for privacy, democracy, and national security. California Law Review, 107(6):1753–1820, 2019. 1
work page 2019
-
[8]
Meta clip 2: A worldwide scaling recipe.arXiv preprint arXiv:2507.22062,
Yung-Sung Chuang, Yang Li, Dong Wang, Ching-Feng Yeh, Kehan Lyu, Ramya Raghavendra, James Glass, Lifei Huang, Jason Weston, Luke Zettlemoyer, et al. Meta clip 2: A world- wide scaling recipe.arXiv preprint arXiv:2507.22062, 2025. 1, 3
-
[9]
Real-world degradation simulation tools
Codabench. Real-world degradation simulation tools. https://www.codabench.org/competitions/ 12761/#/pages-tab, 2024. Accessed: 2026-03-20. 5
work page 2024
-
[10]
Forensics adapter: Adapting CLIP for generaliz- able face forgery detection
Xinjie Cui, Yuezun Li, Ao Luo, Jiaran Zhou, and Junyu Dong. Forensics adapter: Adapting CLIP for generaliz- able face forgery detection. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 19207– 19217, 2025. 2, 6
work page 2025
-
[11]
The DeepFake Detection Challenge (DFDC) Dataset
Brian Dolhansky, Russell Howes, Ben Pflaum, Netanel Baram, and Cristian Canton Ferrer. The deepfake detection challenge dataset.arXiv preprint arXiv:2006.07397, 2020. 2
work page internal anchor Pith review Pith/arXiv arXiv 2006
-
[12]
Ricard Durall, Margret Keuper, and Janis Keuper. Watch your up-convolution: Cnn based generative deep neural net- works are failing to reproduce spectral distributions. InPro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 7890–7899, 2020. 1, 2
work page 2020
-
[13]
Leveraging fre- quency analysis for deep fake image recognition
Joel Frank, Thorsten Eisenhofer, Lea Sch ¨onherr, Asja Fis- cher, Dorothea Kolossa, and Thorsten Holz. Leveraging fre- quency analysis for deep fake image recognition. InInter- national conference on machine learning, pages 3247–3258. PMLR, 2020. 1, 2
work page 2020
-
[14]
Exploring unbiased deepfake detection via token-level shuffling and mixing
Xinghe Fu, Zhiyuan Yan, Taiping Yao, Shen Chen, and Xi Li. Exploring unbiased deepfake detection via token-level shuffling and mixing. InProceedings of the AAAI Confer- ence on Artificial Intelligence, pages 3040–3048, 2025. 6
work page 2025
-
[15]
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. InAdvances in Neural Information Processing Systems (NeurIPS), 2014. 1
work page 2014
-
[16]
A bias-free training paradigm for more general ai-generated image de- tection
Fabrizio Guillaro, Giada Zingarini, Ben Usman, Avneesh Sud, Davide Cozzolino, and Luisa Verdoliva. A bias-free training paradigm for more general ai-generated image de- tection. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 18685–18694, 2025. 1, 2
work page 2025
-
[17]
Lips don’t lie: A generalisable and robust approach to face forgery detection
Alexandros Haliassos, Konstantinos V ougioukas, Stavros Petridis, and Maja Pantic. Lips don’t lie: A generalisable and robust approach to face forgery detection. InProceed- ings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5039–5049, 2021. 6
work page 2021
-
[18]
Leveraging real talking faces via self- supervision for robust forgery detection
Alexandros Haliassos, Rodrigo Mira, Stavros Petridis, and Maja Pantic. Leveraging real talking faces via self- supervision for robust forgery detection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 14950–14962, 2022. 6
work page 2022
-
[19]
Yue-Hua Han, Tai-Ming Huang, Kai-Lung Hua, and Jun- Cheng Chen. Towards more general video-based deepfake detection through facial component guided adaptation for foundation model. InProceedings of the IEEE/CVF con- ference on computer vision and pattern recognition, pages 22995–23005, 2025. 6
work page 2025
-
[20]
Robust deepfake de- tection, ntire 2026 challenge: Report
Benedikt Hopf, Radu Timofte, et al. Robust deepfake de- tection, ntire 2026 challenge: Report. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 1, 2, 5
work page 2026
-
[21]
Implicit identity driven deepfake face swapping detection
Baojin Huang, Zhongyuan Wang, Jifan Yang, Jiaxin Ai, Qin Zou, Qian Wang, and Dengpan Ye. Implicit identity driven deepfake face swapping detection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4490–4499, 2023. 6
work page 2023
-
[22]
Zhenglin Huang, Jinwei Hu, Xiangtai Li, Yiwei He, Xingyu Zhao, Bei Peng, Baoyuan Wu, Xiaowei Huang, and Guan- gliang Cheng. Sida: Social media image deepfake detection, localization and explanation with large multimodal model. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 28831–28841, 2025. 8
work page 2025
-
[23]
Deeperforensics-1.0: A large-scale dataset for real-world face forgery detection
Liming Jiang, Ren Li, Wayne Wu, Chen Qian, and Chen Change Loy. Deeperforensics-1.0: A large-scale dataset for real-world face forgery detection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2889–2898, 2020. 2, 3
work page 2020
-
[24]
Legion: Learning to ground and ex- plain for synthetic image detection
Hengrui Kang, Siwei Wen, Zichen Wen, Junyan Ye, Wei- jia Li, Peilin Feng, Baichuan Zhou, Bin Wang, Dahua Lin, Linfeng Zhang, et al. Legion: Learning to ground and ex- plain for synthetic image detection. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 18937–18947, 2025. 2 9
work page 2025
-
[25]
Enhancing gen- eral face forgery detection via vision transformer with low- rank adaptation
Chenqi Kong, Haoliang Li, and Shiqi Wang. Enhancing gen- eral face forgery detection via vision transformer with low- rank adaptation. In2023 IEEE 6th international conference on multimedia information processing and retrieval (MIPR), pages 102–107. IEEE, 2023. 2, 3
work page 2023
-
[26]
Chenqi Kong, Anwei Luo, Peijun Bao, Yi Yu, Haoliang Li, Zengwei Zheng, Shiqi Wang, and Alex C Kot. Moe-ffd: Mixture of experts for generalized and parameter-efficient face forgery detection.IEEE Transactions on Dependable and Secure Computing, 2025. 2
work page 2025
-
[27]
Face x-ray for more gen- eral face forgery detection
Lingzhi Li, Jianmin Bao, Ting Zhang, Hao Yang, Dong Chen, Fang Wen, and Baining Guo. Face x-ray for more gen- eral face forgery detection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5001–5010, 2020. 1, 2
work page 2020
-
[28]
Sharp mul- tiple instance learning for deepfake video detection
Xiaodan Li, Yining Lang, Yuefeng Chen, Xiaofeng Mao, Yuan He, Shuhui Wang, Hui Xue, and Quan Lu. Sharp mul- tiple instance learning for deepfake video detection. InPro- ceedings of the 28th ACM international conference on mul- timedia, pages 1864–1872, 2020. 1, 4
work page 2020
-
[29]
Celeb-df: A large-scale challenging dataset for deep- fake forensics
Yuezun Li, Xin Yang, Pu Sun, Honggang Qi, and Siwei Lyu. Celeb-df: A large-scale challenging dataset for deep- fake forensics. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 3207– 3216, 2020. 2, 5
work page 2020
-
[30]
Focal loss for dense object detection
Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Doll´ar. Focal loss for dense object detection. InPro- ceedings of the IEEE international conference on computer vision, pages 2980–2988, 2017. 3
work page 2017
-
[31]
Yuzhen Lin, Wentang Song, Bin Li, Yuezun Li, Jiangqun Ni, Han Chen, and Qiushi Li. Fake it till you make it: Curricu- lar dynamic forgery augmentations towards general deepfake detection. InEuropean conference on computer vision, pages 104–122. Springer, 2024. 6
work page 2024
-
[32]
Spatial- phase shallow learning: rethinking face forgery detection in frequency domain
Honggu Liu, Xiaodan Li, Wenbo Zhou, Yuefeng Chen, Yuan He, Hui Xue, Weiming Zhang, and Nenghai Yu. Spatial- phase shallow learning: rethinking face forgery detection in frequency domain. InProceedings of the IEEE/CVF con- ference on computer vision and pattern recognition, pages 772–781, 2021. 6
work page 2021
-
[33]
arXiv preprint arXiv:2602.02222 , year=
Ruiqi Liu, Manni Cui, Ziheng Qin, Zhiyuan Yan, Ruoxin Chen, Yi Han, Zhiheng Li, Junkai Chen, ZhiJin Chen, Kaiqing Lin, et al. Mirror: Manifold ideal reference re- constructor for generalizable ai-generated image detection. arXiv preprint arXiv:2602.02222, 2026. 5, 6, 7
-
[34]
Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feicht- enhofer, Trevor Darrell, and Saining Xie. A convnet for the 2020s. InProceedings of the IEEE/CVF conference on com- puter vision and pattern recognition, pages 11976–11986,
-
[35]
Gener- alizing face forgery detection with high-frequency features
Yuchen Luo, Yong Zhang, Junchi Yan, and Wei Liu. Gener- alizing face forgery detection with high-frequency features. InProceedings of the IEEE/CVF conference on computer vi- sion and pattern recognition, pages 16317–16326, 2021. 6
work page 2021
-
[36]
Mffi: Multi-dimensional face forgery im- age dataset for real-world scenarios
Changtao Miao, Yi Zhang, Man Luo, Weiwei Feng, Kaiyuan Zheng, Qi Chu, Tao Gong, Jianshu Li, Yunfeng Diao, Wei Zhou, et al. Mffi: Multi-dimensional face forgery im- age dataset for real-world scenarios. InProceedings of the 33rd ACM International Conference on Multimedia, pages 13235–13242, 2025. 2, 3
work page 2025
-
[37]
Dat Nguyen, Nesryne Mejri, Inder Pal Singh, Polina Kuleshova, Marcella Astrid, Anis Kacem, Enjie Ghorbel, and Djamila Aouada. Laa-net: Localized artifact attention network for quality-agnostic and generalizable deepfake de- tection. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 17395– 17405, 2024. 6
work page 2024
-
[38]
Core: Consistent repre- sentation learning for face forgery detection
Yunsheng Ni, Depu Meng, Changqian Yu, Chengbin Quan, Dongchun Ren, and Youjian Zhao. Core: Consistent repre- sentation learning for face forgery detection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 12–21, 2022. 6
work page 2022
-
[39]
Thinking in frequency: Face forgery detection by min- ing frequency-aware clues
Yuyang Qian, Guojun Yin, Lu Sheng, Zixuan Chen, and Jing Shao. Thinking in frequency: Face forgery detection by min- ing frequency-aware clues. InEuropean conference on com- puter vision, pages 86–103. Springer, 2020. 6
work page 2020
-
[40]
Learning transferable visual models from natural language supervi- sion
Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervi- sion. InInternational conference on machine learning, pages 8748–8763. PmLR, 2021. 5, 6
work page 2021
-
[41]
Reality defender.https : / / realitydefender.com, 2024
Reality Defender. Reality defender.https : / / realitydefender.com, 2024. Commercial platform for detecting AI-generated media. 5, 6
work page 2024
-
[42]
High-resolution image synthesis with latent diffusion models
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Bj ¨orn Ommer. High-resolution image synthesis with latent diffusion models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10684–10695, 2022. 1
work page 2022
-
[43]
Faceforen- sics++: Learning to detect manipulated facial images
Andreas Rossler, Davide Cozzolino, Luisa Verdoliva, Chris- tian Riess, Justus Thies, and Matthias Nießner. Faceforen- sics++: Learning to detect manipulated facial images. In Proceedings of the IEEE/CVF international conference on computer vision, pages 1–11, 2019. 2, 5
work page 2019
-
[44]
Detecting deep- fakes with self-blended images
Kaede Shiohara and Toshihiko Yamasaki. Detecting deep- fakes with self-blended images. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 18720–18729, 2022. 2, 6
work page 2022
-
[45]
Oriane Sim ´eoni, Huy V V o, Maximilian Seitzer, Federico Baldassarre, Maxime Oquab, Cijo Jose, Vasil Khalidov, Marc Szafraniec, Seungeun Yi, Micha ¨el Ramamonjisoa, et al. Dinov3.arXiv preprint arXiv:2508.10104, 2025. 1, 3, 4
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[46]
Chuangchuang Tan, Yao Zhao, Shikui Wei, Guanghua Gu, Ping Liu, and Yunchao Wei. Rethinking the up-sampling op- erations in cnn-based generative network for generalizable deepfake detection. InProceedings of the IEEE/CVF con- ference on computer vision and pattern recognition, pages 28130–28139, 2024. 1, 3, 7
work page 2024
-
[47]
Veritas: Generalizable deepfake detection via pattern-aware reasoning
Hao Tan, Jun Lan, Zichang Tan, Ajian Liu, Chuanbiao Song, Senyuan Shi, Huijia Zhu, Weiqiang Wang, Jun Wan, and Zhen Lei. Veritas: Generalizable deepfake detection via pattern-aware reasoning. InInternational Conference on Learning Representations, 2026. 2, 5, 7, 8 10
work page 2026
-
[48]
Real appearance mod- eling for more general deepfake detection
Jiahe Tian, Cai Yu, Xi Wang, Peng Chen, Zihao Xiao, Jiao Dai, Jizhong Han, and Yesheng Chai. Real appearance mod- eling for more general deepfake detection. InEuropean Con- ference on Computer Vision, pages 402–419. Springer, 2024. 6
work page 2024
-
[49]
Deepfakes and beyond: A survey of face manipulation and fake detection
Ruben Tolosana, Ruben Vera-Rodriguez, Julian Fierrez, Aythami Morales, and Javier Ortega-Garcia. Deepfakes and beyond: A survey of face manipulation and fake detection. Information Fusion, 64:131–148, 2020. 1, 2
work page 2020
-
[50]
arXiv preprint arXiv:2510.16320 , year=
Wenhao Wang, Longqi Cai, Taihong Xiao, Yuxiao Wang, and Ming-Hsuan Yang. Scaling laws for deepfake detection. arXiv preprint arXiv:2510.16320, 2025. 1, 2, 3, 5
-
[51]
Altfreezing for more general video face forgery detection
Zhendong Wang, Jianmin Bao, Wengang Zhou, Weilun Wang, and Houqiang Li. Altfreezing for more general video face forgery detection. InProceedings of the IEEE/CVF con- ference on computer vision and pattern recognition, pages 4129–4138, 2023. 6
work page 2023
-
[52]
Siwei Wen, Junyan Ye, Peilin Feng, Hengrui Kang, Zichen Wen, Yize Chen, Jiang Wu, Wenjun Wu, Conghui He, and Weijia Li. Spot the fake: Large multimodal model-based synthetic image detection with artifact explanation.arXiv preprint arXiv:2503.14905, 2025. 8
-
[53]
Yuting Xu, Jian Liang, Lijun Sheng, and Xiao-Yu Zhang. Learning spatiotemporal inconsistency via thumbnail layout for face deepfake detection.International Journal of Com- puter Vision, 132(12):5663–5680, 2024. 6
work page 2024
-
[54]
Zhipei Xu, Xuanyu Zhang, Runyi Li, Zecheng Tang, Qing Huang, and Jian Zhang. Fakeshield: Explainable image forgery detection and localization via multi-modal large lan- guage models. InInternational Conference on Learning Representations, 2025. 2
work page 2025
-
[55]
Ucf: Uncovering common features for generalizable deep- fake detection
Zhiyuan Yan, Yong Zhang, Yanbo Fan, and Baoyuan Wu. Ucf: Uncovering common features for generalizable deep- fake detection. InProceedings of the IEEE/CVF interna- tional conference on computer vision, pages 22412–22423,
-
[56]
Deepfakebench: A comprehensive benchmark of deepfake detection
Zhiyuan Yan, Yong Zhang, Xinhang Yuan, Siwei Lyu, and Baoyuan Wu. Deepfakebench: A comprehensive benchmark of deepfake detection. InAdvances in Neural Information Processing Systems, pages 4534–4565, 2023. 2
work page 2023
-
[57]
Transcending forgery specificity with latent space augmentation for generalizable deepfake detection
Zhiyuan Yan, Yuhao Luo, Siwei Lyu, Qingshan Liu, and Baoyuan Wu. Transcending forgery specificity with latent space augmentation for generalizable deepfake detection. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition, pages 8984–8994, 2024. 1, 2, 6
work page 2024
-
[58]
Zhiyuan Yan, Jiangming Wang, Peng Jin, Ke-Yue Zhang, Chengchun Liu, Shen Chen, Taiping Yao, Shouhong Ding, Baoyuan Wu, and Li Yuan. Orthogonal subspace decompo- sition for generalizable ai-generated image detection.arXiv preprint arXiv:2411.15633, 2024. 1, 2, 5, 6, 8
-
[59]
Zhiyuan Yan, Taiping Yao, Shen Chen, Yandan Zhao, Xinghe Fu, Junwei Zhu, Donghao Luo, Chengjie Wang, Shouhong Ding, Yunsheng Wu, et al. Df40: Toward next- generation deepfake detection.Advances in Neural Informa- tion Processing Systems, 37:29387–29434, 2024. 2, 3, 5
work page 2024
-
[60]
Zhiyuan Yan, Yandan Zhao, Shen Chen, Mingyi Guo, Xinghe Fu, Taiping Yao, Shouhong Ding, Yunsheng Wu, and Li Yuan. Generalizing deepfake video detection with plug- and-play: Video-level blending and spatiotemporal adapter tuning. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 12615–12625, 2025. 6
work page 2025
-
[61]
Dˆ 3: scaling up deepfake detection by learning from discrepancy
Yongqi Yang, Zhihao Qian, Ye Zhu, Olga Russakovsky, and Yu Wu. Dˆ 3: scaling up deepfake detection by learning from discrepancy. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 23850–23859,
-
[62]
Deepfake detection that generalizes across benchmarks
Andrii Yermakov, Jan Cech, Jiri Matas, and Mario Fritz. Deepfake detection that generalizes across benchmarks. In Proceedings of the IEEE/CVF Winter Conference on Appli- cations of Computer Vision, pages 773–783, 2026. 1, 2, 5, 6
work page 2026
-
[63]
Learning natural consistency represen- tation for face forgery video detection
Daichi Zhang, Zihao Xiao, Shikun Li, Fanzhao Lin, Jianmin Li, and Shiming Ge. Learning natural consistency represen- tation for face forgery video detection. InEuropean confer- ence on computer vision, pages 407–424. Springer, 2024. 6
work page 2024
-
[64]
Multi-attentional deep- fake detection
Hanqing Zhao, Wenbo Zhou, Dongdong Chen, Tianyi Wei, Weiming Zhang, and Nenghai Yu. Multi-attentional deep- fake detection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2185– 2194, 2021. 1, 2
work page 2021
-
[65]
Learning self-consistency for deepfake detection
Tianchen Zhao, Xiang Xu, Mingze Xu, Hui Ding, Yuanjun Xiong, and Wei Xia. Learning self-consistency for deepfake detection. InProceedings of the IEEE/CVF international conference on computer vision, pages 15023–15033, 2021. 6
work page 2021
-
[66]
Exploring temporal coherence for more gen- eral video face forgery detection
Yinglin Zheng, Jianmin Bao, Dong Chen, Ming Zeng, and Fang Wen. Exploring temporal coherence for more gen- eral video face forgery detection. InProceedings of the IEEE/CVF international conference on computer vision, pages 15044–15054, 2021. 6
work page 2021
-
[67]
Yue Zhou, Xinan He, Kaiqing Lin, Bing Fan, Feng Ding, Jin- hua Zeng, and Bin Li. Brought a gun to a knife fight: Modern vfm baselines outgun specialized detectors on in-the-wild ai image detection.arXiv preprint arXiv:2509.12995, 2025. 1
-
[68]
Wilddeepfake: A challenging real-world dataset for deepfake detection
Bojia Zi, Minghao Chang, Jingjing Chen, Xingjun Ma, and Yu-Gang Jiang. Wilddeepfake: A challenging real-world dataset for deepfake detection. InProceedings of the 28th ACM international conference on multimedia, pages 2382– 2390, 2020. 2 11
work page 2020
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.