pith. machine review for the scientific record.

arxiv: 2604.24163 · v1 · submitted 2026-04-27 · 💻 cs.CV

Recognition: unknown

Robust Deepfake Detection, NTIRE 2026 Challenge: Report

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 04:50 UTC · model grok-4.3

classification 💻 cs.CV
keywords deepfake detection · robustness · image degradation · NTIRE challenge · foundation models · ensemble learning · forgery detection

The pith

The NTIRE 2026 Robust Deepfake Detection Challenge finds that top detectors maintain performance under degradations by using large foundation models, ensembles, and degradation-specific training.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This report describes the first NTIRE challenge on robust deepfake detection, in which participants built detectors that were tested on an unknown set of images containing both common and uncommon degradations of varying strengths. The challenge setup, with a limited test window and a private final set, limits overfitting and makes it more likely that the results reflect real generalization. Top entries succeeded by relying on large foundation models combined through ensembles and trained with diverse degradations. A sympathetic reader cares because standard deepfake detectors lose accuracy from even minor image-processing artifacts, or from malicious forgeries that deliberately add degradations.

Core claim

In the NTIRE 2026 Robust Deepfake Detection Challenge, the highest-scoring methods combined large foundation models, ensemble strategies, and training on images with various degradations to achieve generality on clean images together with robustness to both common and uncommon degradations.

What carries the argument

The unknown test set containing common and uncommon degradations of multiple strengths, used to score detectors after a 24-hour test window with no labels provided.

If this is right

  • Detectors trained only on undegraded images will underperform once exposed to real image processing pipelines or adversarial corruptions.
  • Ensembling multiple large models helps balance accuracy on clean images with stability under degradation.
  • Explicit training on a range of degradations is required to reach high performance on both common and uncommon corruption types.
  • Large foundation models supply the capacity needed for detectors to generalize across varied deepfake generation methods and degradation levels.
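The degradation-training recipe these points describe can be illustrated with a minimal augmentation sketch. The specific corruptions below (Gaussian noise, box blur, downscale-then-upscale) are illustrative stand-ins chosen for this example; the report excerpt does not enumerate the challenge's actual degradation families or strengths.

```python
import numpy as np

def degrade(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply one randomly chosen degradation to a grayscale image in [0, 1].

    Illustrative corruptions only; the challenge's real degradation set
    and strength schedule are assumptions, not taken from the paper.
    """
    h, w = img.shape
    choice = rng.integers(3)
    if choice == 0:
        # additive Gaussian noise
        out = img + rng.normal(0.0, 0.05, size=img.shape)
    elif choice == 1:
        # crude 3x3 box blur
        p = np.pad(img, 1, mode="edge")
        out = sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    else:
        # downscale then nearest-neighbour upscale (resampling artifacts)
        small = img[::2, ::2]
        out = np.repeat(np.repeat(small, 2, axis=0), 2, axis=1)[:h, :w]
    return np.clip(out, 0.0, 1.0)

# a training loop would mix clean and degraded views of each image
rng = np.random.default_rng(0)
clean = np.full((8, 8), 0.5)
augmented = degrade(clean, rng)
```

In a real pipeline the degradation sampler would cover far more corruption families at several strengths, applied on the fly during training.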

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Future detection systems could extend this approach by adding online adaptation mechanisms when new degradation types appear after deployment.
  • The same combination of foundation models and degradation training may improve robustness in related tasks such as video or audio forgery detection.
  • Challenge organizers could introduce progressively harder or previously unseen degradation families in subsequent editions to keep pushing the frontier.

Load-bearing premise

The degradations present in the challenge test set are representative enough of accidental and malicious degradations that detectors will face after real-world deployment.

What would settle it

A top challenge method that scores high on the provided test set but drops sharply on a new collection of degradations never seen in the challenge training or test data.

Figures

Figures reproduced from arXiv: 2604.24163 by Aashish Negi, Aishwarya A, Akshara S, Akshay Dudhane, Amit Shukla, Anas M. Ali, Ashwathi N, Bang-Kang Chen, Benedikt Hopf, Bilel Benjdira, Chenfan Qu, Chia-Ming Lee, Chih-Chung Hsu, Chih-Yu Jian, Cristian Lazo Quispe, Dagong Lu, Fei Wu, Fengjun Guo, Feng Xu, Fu-En Yang, Guoyi Xu, Haodong Ren, Hardik Sharma, Hong Vin Koay, Jayant Kumar, Jiachen Tu, Jiajia Liu, Jia Wen Seow, Jielun Peng, Jincheng Liu, Junchi Li, Krish Wadhwani, Liam Fitzpatrick, Minh-Hoang Le, Minh-Khoa Le-Phan, Minh-Triet Tran, Mufeng Yao, Praful Hambarde, Prateek Shaily, Radu Timofte, Sachin Chaudhary, Shuai Chen, Trong-Le Do, Utkarsh Tiwari, Wadii Boulila, Xiaopeng Hong, Xinlei Xu, Yabin Wang, Yaokun Shi, Yaoxin Jiang, Yaqi Li, Yi-Fan Wang, Yongwei Tang, You-Chen Chao, Yu-Chiang Frank Wang, Zhiqiang Wu, Zhiqiang Yang.

Figure 1
Figure 1. Overall pipeline of ShalloReal’s DINO-MAC model. The backbone is fine-tuned using Low-Rank Adaptation (LoRA) [26] with a rank of 32 and an alpha of 64. The final prediction is generated by a Multi-Aspect Classification (MAC) head that processes features from the DINOv3 backbone. This module aggregates information from multiple sources: the [CLS] token, four [REG] register tokens, and an [AVG] token representing the ave… view at source ↗
Figure 2
Figure 2. Overall pipeline of INTSIG’s LOGER: Local-Global Ensemble for Robust Deepfake Detection in the Wild. models: M1 and M2 share a DINOv3-Huge [67] backbone with full-parameter fine-tuning and a two-layer MLP classification head. M1 is trained and inferred at 256×256, while M2 is trained at the same resolution but inferred at 384×384, preserving fine-grained forensic details that lower resolutions would disca… view at source ↗
Figure 3
Figure 3. Overall pipeline of AntInternational’s An Ensemble of Architecturally-Diverse Large-Scale Vision Transformers. The final submission score is a weighted average of the predictions from our two independently trained models. We determined an optimal 35/65 weighting scheme: Confidence_final = α · f_CLS(I) + β · f_AttnPool(I). (1) … view at source ↗
Figure 4
Figure 4. Overall pipeline of HCMUS-Aqua’s Robust Deepfake Detection via Multi-Stream DINO-CLIP Fusion and Discretized Voting. The system processes inputs through three specialized expert streams. The Localized Facial and Global Texture streams maintain native signal integrity (252×252) utilizing a shared DINOv2-Giant backbone [52]. The Hybrid Semantic Fusion stream (224×224) concatenates geometric features from DIN… view at source ↗
Figure 5
Figure 5. Overall pipeline of ACV Lab’s Quality-Aware Multi-Expert Routing with Robust Optimization for Deepfake Detection. [Pipeline diagram: 4-view TTA input; GenD-DINOv3 global head and GAPL-CLIP + LoRA local head; MLP fusion with a degradation description; final score is a 55%/35%/15% rank average.] view at source ↗
Figure 6
Figure 6. Overall pipeline of Reagvis Labs’s Beyond Backbones: Degradation-Aware Prototype Fusion for Robust Deepfake Detection. GenD [90] deepfake-tuned initialization, (3) a degradation-aware fusion MLP, and (4) rank-based multi-model score calibration. The final AUC is 84.3 on the competition test set. Backbone A – CLIP-GAPL: CLIP ViT-L/14 [60] fine-tuned with LoRA [26] (r=16, α=32, applied to Wq, Wk, Wv). The po… view at source ↗
Figure 7
Figure 7. Overall pipeline of HIT-VIRLAB’s Hierarchical Adaptive Feature Aggregation with Degraded-Original Consistency Learning for Robust Deepfake Detection. data from multiple publicly available sources, including FF++ [62], DFDC [13], FakeAVCeleb [29], Celeb-DF++ [41], DF40 [88], and DDL [48]. The resulting million-scale dataset contains diverse forgery generation methods and visual conditions, providing rich … view at source ↗
Figure 8
Figure 8. Overall pipeline of PSU’s PRISM: Paradigm-diverse Representation Integration for Synthesis-artifact Manifold Detection. view at source ↗
Figure 9
Figure 9. Overall pipeline of AI4Good’s Self-Supervised Adversarial Training for Robust Deepfake Detection. view at source ↗
Figure 10
Figure 10. Overall pipeline of ACUBE’s Robust Deepfake Detection using ConvNeXt with Frequency-Aware Fusion and Regularized Training Strategy. view at source ↗
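The two-model fusion described in Figure 3's caption reduces to a weighted average of per-image confidences with the reported 35/65 split. A minimal sketch; the function and argument names are placeholders, not the team's code, and which model receives which weight is an assumption:

```python
def fuse_confidences(cls_scores, attnpool_scores, alpha=0.35, beta=0.65):
    """Weighted average of two models' per-image confidences.

    alpha/beta follow the 35/65 split reported for AntInternational's
    ensemble (Figure 3); the weight assignment per model is assumed.
    """
    return [alpha * a + beta * b for a, b in zip(cls_scores, attnpool_scores)]

fused = fuse_confidences([0.90, 0.20], [0.70, 0.40])
# fused[0] = 0.35 * 0.90 + 0.65 * 0.70 = 0.77
```

The same pattern generalizes to more models by extending the weight vector, as long as the weights sum to one.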
read the original abstract

Robustness is a long-overlooked problem in deepfake detection. However, detection performance is nearly worthless in the real world if it suffers under exposure to even slight image degradation. In addition to weaker degradations that can accidentally occur in the image processing pipeline, there is another risk of malicious deepfakes that specifically introduce degradations, purposefully exploiting the detector's weaknesses in that regard. Here, we present an overview of the NTIRE 2026 Robust Deepfake Detection Challenge, which specifically addresses that problem. Participants were tasked with building a detector that would later be tested on an unknown test-set, which included both common and uncommon degradations of various strengths. With a total number of 337 participants and 57 submissions to the final leaderboard, the first edition of the challenge was well received. To ensure the reliability of the results, participants were given only 24h to complete the test run with no labels provided, limiting the possibility of training on the test data. Furthermore, the top solutions were scored on a private test-set to detect any such overfitting. This report presents the competition setting, dataset preparation, as well as details and performance of methods. Top methods rely on large foundation models, ensembles, and degradation training to combine generality and robustness.
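Several top entries (Figures 5 and 6) combine model scores by rank rather than raw value, which makes the fusion insensitive to each model's score scale. A hedged sketch of weighted rank averaging; the normalization below (ranks mapped to [0, 1]) is one common choice, assumed here rather than taken from any team's description:

```python
import numpy as np

def rank_average(score_lists, weights):
    """Fuse per-model score lists over a test set by weighted normalized ranks.

    Rank-based fusion ignores each model's absolute score scale; mapping
    ranks into [0, 1] is an assumed normalization, not a team's recipe.
    """
    n = len(score_lists[0])
    fused = np.zeros(n)
    for scores, w in zip(score_lists, weights):
        ranks = np.argsort(np.argsort(scores))   # 0 = lowest score in the set
        fused += w * ranks / (n - 1)             # normalize ranks to [0, 1]
    return fused

# with a single model, the fused scores are just its normalized ranks
print(rank_average([[0.1, 0.9, 0.5]], [1.0]))  # [0.  1.  0.5]
```

Because only orderings matter, a model whose raw confidences cluster near 1.0 cannot dominate a better-calibrated model in the fused score.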

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance; this is the friction.

Referee Report

0 major / 4 minor

Summary. The manuscript reports on the NTIRE 2026 Robust Deepfake Detection Challenge. It outlines the motivation for addressing robustness to image degradations (both accidental and malicious), the competition rules (24-hour blind test window on an unknown test set containing common and uncommon degradations of varying strengths, with private test-set scoring), participation statistics (337 participants and 57 final submissions), dataset preparation, and the observed characteristics of the top-ranked methods, which predominantly employed large foundation models, ensembles, and degradation training.

Significance. If the reported trends hold, the paper supplies a useful empirical benchmark for robust deepfake detection, showing that foundation-model-based ensembles trained on degradations can improve both generality and resilience. The explicit anti-overfitting measures (time-limited blind testing and private scoring) lend credibility to the descriptive findings and provide the community with concrete guidance on promising architectural and training choices.

minor comments (4)
  1. [Abstract] The statement that the challenge 'was well received' is subjective and unsupported by any quantitative indicator (e.g., survey results or comparison with prior NTIRE editions); it should be either removed or substantiated.
  2. [Results] Performance trends are asserted without a table or figure listing the top submissions together with their key components (foundation model, ensemble size, degradation types) and exact scores; adding such a summary would make the central descriptive claim more verifiable and concrete.
  3. [Dataset Preparation] The manuscript does not state the precise ranking metric (AUC, accuracy, etc.) or whether error bars / multiple runs were used, which is needed to assess the stability of the observed ranking of foundation-model approaches.
  4. [Conclusion] The report contains no limitations section discussing possible mismatches between the challenge degradations and real-world malicious attacks, even though the abstract highlights this risk.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of our manuscript on the NTIRE 2026 Robust Deepfake Detection Challenge and for recommending minor revision. We appreciate the recognition that the reported trends on foundation-model ensembles and degradation-aware training provide a useful empirical benchmark, and that the 24-hour blind test with private scoring strengthens the credibility of the results.

Circularity Check

0 steps flagged

No significant circularity; purely descriptive competition report

full rationale

The manuscript is a standard challenge report summarizing the NTIRE 2026 Robust Deepfake Detection setup, dataset construction, evaluation protocol (including the 24-hour test window and private test-set), and observed characteristics of the 57 submissions. No equations, derivations, fitted parameters, or causal claims are advanced; the sole load-bearing statement is an empirical summary that top entries used foundation models, ensembles, and degradation training. This observation is external to the report itself and does not reduce to any self-definition, self-citation chain, or input renaming. The document therefore contains no load-bearing steps that could be circular.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical model, derivation, or new postulated entity is introduced; the document is an empirical competition report.

pith-pipeline@v0.9.0 · 5785 in / 1035 out tokens · 23007 ms · 2026-05-08T04:50:05.969719+00:00 · methodology


Reference graph

Works this paper leans on

97 extracted references · 10 canonical work pages · 5 internal anchors

  1. [1]

    NT-HAZE: A Benchmark Dataset for Realistic Night-time Image Dehazing

    Radu Ancuti, Codruta Ancuti, Radu Timofte, and Cosmin Ancuti. NT-HAZE: A Benchmark Dataset for Realistic Night-time Image Dehazing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 1

  2. [2]

    NTIRE 2026 Nighttime Image Dehazing Challenge Report

    Radu Ancuti, Alexandru Brateanu, Florin Vasluianu, Raul Balmez, Ciprian Orhei, Codruta Ancuti, Radu Timofte, Cosmin Ancuti, et al. NTIRE 2026 Nighttime Image Dehazing Challenge Report. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 1

  3. [3]

    Self-supervised learning from images with a joint-embedding predictive architecture, 2023

    Mahmoud Assran, Quentin Duval, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Yann LeCun, and Nicolas Ballas. Self-supervised learning from images with a joint-embedding predictive architecture, 2023. 4

  4. [4]

    The devil is in the details: Stylefeatureeditor for detail-rich stylegan inversion and high quality image editing,

    Denis Bobkov, Vadim Titov, Aibek Alanov, and Dmitry Vetrov. The devil is in the details: Stylefeatureeditor for detail-rich stylegan inversion and high quality image editing,

  5. [5]

    NTIRE 2026 Challenge on Single Image Reflection Removal in the Wild: Datasets, Results, and Methods

    Jie Cai, Kangning Yang, Zhiyuan Li, Florin Vasluianu, Radu Timofte, et al. NTIRE 2026 Challenge on Single Image Reflection Removal in the Wild: Datasets, Results, and Methods. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops,

  6. [6]

    The Fourth Challenge on Image Super-Resolution (×4) at NTIRE 2026: Benchmark Results and Method Overview

    Zheng Chen, Kai Liu, Jingkai Wang, Xianglong Yan, Jianze Li, Ziqing Zhang, Jue Gong, Jiatong Li, Lei Sun, Xiaoyang Liu, Radu Timofte, Yulun Zhang, et al. The Fourth Challenge on Image Super-Resolution (×4) at NTIRE 2026: Benchmark Results and Method Overview. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) W...

  7. [7]

    Meta clip 2: A worldwide scaling recipe, 2025

    Yung-Sung Chuang, Yang Li, Dong Wang, Ching-Feng Yeh, Kehan Lyu, Ramya Raghavendra, James Glass, Lifei Huang, Jason Weston, Luke Zettlemoyer, Xinlei Chen, Zhuang Liu, Saining Xie, Wen-tau Yih, Shang-Wen Li, and Hu Xu. Meta clip 2: A worldwide scaling recipe, 2025. 4

  8. [8]

    Low Light Image Enhancement Challenge at NTIRE 2026

    George Ciubotariu, Sharif S M A, Abdur Rehman, Fayaz Ali Dharejo, Rizwan Ali Naqvi, Marcos Conde, Radu Timofte, et al. Low Light Image Enhancement Challenge at NTIRE 2026. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 1

  9. [9]

    High FPS Video Frame Interpolation Challenge at NTIRE 2026

    George Ciubotariu, Zhuyun Zhou, Yeying Jin, Zongwei Wu, Radu Timofte, et al. High FPS Video Frame Interpolation Challenge at NTIRE 2026. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 1

  10. [10]

    Riccardo Corvi, Davide Cozzolino, Giada Zingarini, Giovanni Poggi, Koki Nagano, and Luisa Verdoliva. On the detection of synthetic images generated by diffusion models. ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5, 2022. 9

  11. [11]

    Contributing data to deepfake detection

    DFD. Contributing data to deepfake detection. Google AI Blog, 2020. Accessed: 2021-04-24. 6

  12. [12]

    The deepfake detection challenge (dfdc) preview dataset

    Brian Dolhansky, Russ Howes, Ben Pflaum, Nicole Baram, and Cristian Canton-Ferrer. The deepfake detection challenge (dfdc) preview dataset. ArXiv, abs/1910.08854, 2019. 6

  13. [13]

    The DeepFake Detection Challenge (DFDC) Dataset

    Brian Dolhansky, Joanna Bitton, Ben Pflaum, Jikuo Lu, Russ Howes, Menglin Wang, and Cristian Canton-Ferrer. The deepfake detection challenge dataset. ArXiv, abs/2006.07397, 2020. 6, 8

  14. [14]

    An image is worth 16x16 words: Transformers for image recognition at scale, 2021

    Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale, 2021. 6, 8, 9

  15. [15]

    Can we build a monolithic model for fake image detection? sica: Semantic-induced constrained adaptation for unified-yet-discriminative artifact feature space reconstruction, 2026

    Bo Du, Xiaochen Ma, Xuekang Zhu, Zhe Yang, Chaogun Niu, Jian Liu, and Ji-Zhe Zhou. Can we build a monolithic model for fake image detection? sica: Semantic-induced constrained adaptation for unified-yet-discriminative artifact feature space reconstruction, 2026. 8

  16. [16]

    NTIRE 2026 Rip Current Detection and Segmentation (RipDetSeg) Challenge Report

    Andrei Dumitriu, Aakash Ralhan, Florin Miron, Florin Tatui, Radu Tudor Ionescu, Radu Timofte, et al. NTIRE 2026 Rip Current Detection and Segmentation (RipDetSeg) Challenge Report. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 1

  17. [17]

    Photography Retouching Transfer, NTIRE 2026 Challenge: Report

    Omar Elezabi, Marcos V. Conde, Zongwei Wu, Yeying Jin, Radu Timofte, et al. Photography Retouching Transfer, NTIRE 2026 Challenge: Report. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  18. [18]

    EVA-Giant Patch14 224 CLIP ft IN1k

    EVA-Giant Models. EVA-Giant Patch14 224 CLIP ft IN1k. Hugging Face Model Hub, 2023. 4

  19. [19]

    Eva-02: A visual representation for neon genesis

    Yuxin Fang, Quan Sun, Xinggang Wang, Tiejun Huang, Xinlong Wang, and Yue Cao. Eva-02: A visual representation for neon genesis. Image Vis. Comput., 149:105171, 2023. 9

  20. [20]

    NTIRE 2026 Challenge on End-to-End Financial Receipt Restoration and Reasoning from Degraded Images: Datasets, Methods and Results

    Bochen Guan, Jinlong Li, Kangning Yang, Chuang Ke, Jie Cai, Florin Vasluianu, Radu Timofte, et al. NTIRE 2026 Challenge on End-to-End Financial Receipt Restoration and Reasoning from Degraded Images: Datasets, Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  21. [21]

    NTIRE 2026 The 3rd Restore Any Image Model (RAIM) Challenge: AI Flash Portrait (Track 3)

    Ya-nan Guan, Shaonan Zhang, Hang Guo, Yawen Wang, Xinying Fan, Jie Liang, Hui Zeng, Guanyi Qin, Lishen Qu, Tao Dai, Shu-Tao Xia, Lei Zhang, Radu Timofte, et al. NTIRE 2026 The 3rd Restore Any Image Model (RAIM) Challenge: AI Flash Portrait (Track 3). In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 1

  22. [22]

    NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild

    Aleksandr Gushchin, Khaled Abud, Ekaterina Shumitskaya, Artem Filippov, Georgii Bychkov, Sergey Lavrushkin, Mikhail Erofeev, Anastasia Antsiferova, Changsheng Chen, Shunquan Tan, Radu Timofte, Dmitriy Vatolin, et al. NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pa...

  23. [23]

    Benchmarking Neural Network Robustness to Common Corruptions and Perturbations

    Dan Hendrycks and Thomas G. Dietterich. Benchmarking neural network robustness to common corruptions and perturbations. ArXiv, abs/1903.12261, 2019. 9

  24. [24]

    Practical manipulation model for robust deepfake detection

    Benedikt Hopf and Radu Timofte. Practical manipulation model for robust deepfake detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, pages 5675–5684, 2025. 1, 2, 3, 5, 9

  25. [25]

    Robust Deepfake Detection, NTIRE 2026 Challenge: Report

    Benedikt Hopf, Radu Timofte, et al. Robust Deepfake Detection, NTIRE 2026 Challenge: Report. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 1

  26. [26]

    LoRA: Low-Rank Adaptation of Large Language Models

    J. Edward Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, and Weizhu Chen. Lora: Low-rank adaptation of large language models. ArXiv, abs/2106.09685, 2021. 3, 6, 7, 8

  27. [27]

    Deeperforensics-1.0: A large-scale dataset for real-world face forgery detection, 2020

    Liming Jiang, Ren Li, Wayne Wu, Chen Qian, and Chen Change Loy. Deeperforensics-1.0: A large-scale dataset for real-world face forgery detection, 2020. 6

  28. [28]

    Hidf: A human-indistinguishable deepfake dataset

    Chaewon Kang, Seoyoon Jeong, Jonghyun Lee, Daejin Choi, Simon S. Woo, and Jinyoung Han. Hidf: A human-indistinguishable deepfake dataset. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2, page 5527–5538, New York, NY, USA, 2025. Association for Computing Machinery. 6

  29. [29]

    Hasam Khalid, Shahroz Tariq, Minha Kim, and Simon S. Woo. Fakeavceleb: A novel audio-video multimodal deepfake dataset, 2022. 8

  30. [30]

    NTIRE 2026 Low-light Enhancement: Twilight Cowboy Challenge

    Aleksei Khalin, Egor Ershov, Artem Panshin, Sergey Korchagin, Georgiy Lobarev, Arseniy Terekhin, Sofiia Dorogova, Amir Shamsutdinov, Yasin Mamedov, Bakhtiyar Khalfin, Bogdan Sheludko, Emil Zilyaev, Nikola Banić, Georgy Perevozchikov, Radu Timofte, et al. NTIRE 2026 Low-light Enhancement: Twilight Cowboy Challenge. In Proceedings of the IEEE/CVF Con...

  31. [31]

    Faceswap. https://github.com/MarekKowalski/FaceSwap, 2018

    Marek Kowalski. Faceswap. https://github.com/MarekKowalski/FaceSwap, 2018. 2

  32. [32]

    Seeable: Soft discrepancies and bounded contrastive learning for exposing deepfakes

    Nicolas Larue, Ngoc-Son Vu, Vitomir Struc, Peter Peer, and Vassilis Christophides. Seeable: Soft discrepancies and bounded contrastive learning for exposing deepfakes. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), pages 20954–20964, 2022. 1

  33. [33]

    Robust Deepfake Detection: Mitigating Spatial Attention Drift via Calibrated Complementary Ensembles

    Minh-Khoa Le-Phan, Minh-Hoang Le, Trong-Le Do, and Minh-Triet Tran. Robust Deepfake Detection: Mitigating Spatial Attention Drift via Calibrated Complementary Ensembles. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 5

  34. [34]

    The First Challenge on Mobile Real-World Image Super-Resolution at NTIRE 2026: Benchmark Results and Method Overview

    Jiatong Li, Zheng Chen, Kai Liu, Jingkai Wang, Zihan Zhou, Xiaoyang Liu, Libo Zhu, Radu Timofte, Yulun Zhang, et al. The First Challenge on Mobile Real-World Image Super-Resolution at NTIRE 2026: Benchmark Results and Method Overview. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  35. [35]

    Face x-ray for more general face forgery detection

    Lingzhi Li, Jianmin Bao, Ting Zhang, Hao Yang, Dong Chen, Fang Wen, and Baining Guo. Face x-ray for more general face forgery detection. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 5000–5009, 2019. 1

  36. [36]

    Advancing high fidelity identity swapping for forgery detection

    Lingzhi Li, Jianmin Bao, Hao Yang, Dong Chen, and Fang Wen. Advancing high fidelity identity swapping for forgery detection. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5074–5083,

  37. [37]

    NTIRE 2026 Challenge on Short-form UGC Video Restoration in the Wild with Generative Models: Datasets, Methods and Results

    Xin Li, Jiachao Gong, Xijun Wang, Shiyao Xiong, Bingchen Li, Suhang Yao, Chao Zhou, Zhibo Chen, Radu Timofte, et al. NTIRE 2026 Challenge on Short-form UGC Video Restoration in the Wild with Generative Models: Datasets, Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  38. [38]

    NTIRE 2026 The Second Challenge on Day and Night Raindrop Removal for Dual-Focused Images: Methods and Results

    Xin Li, Yeying Jin, Suhang Yao, Beibei Lin, Zhaoxin Fan, Wending Yan, Xin Jin, Zongwei Wu, Bingchen Li, Peishu Shi, Yufei Yang, Yu Li, Zhibo Chen, Bihan Wen, Robby Tan, Radu Timofte, et al. NTIRE 2026 The Second Challenge on Day and Night Raindrop Removal for Dual-Focused Images: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer...

  39. [39]

    In ictu oculi: Exposing ai created fake videos by detecting eye blinking

    Yuezun Li, Ming-Ching Chang, and Siwei Lyu. In ictu oculi: Exposing ai created fake videos by detecting eye blinking. In 2018 IEEE International workshop on information forensics and security (WIFS), pages 1–7. Ieee, 2018. 6

  40. [40]

    Celeb-df: A large-scale challenging dataset for deepfake forensics

    Yuezun Li, Xin Yang, Pu Sun, Honggang Qi, and Siwei Lyu. Celeb-df: A large-scale challenging dataset for deepfake forensics. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 3204–3213,

  41. [41]

    Celeb-df++: A large-scale challenging video deepfake benchmark for generalizable forensics, 2025

    Yuezun Li, Delong Zhu, Xinjie Cui, and Siwei Lyu. Celeb-df++: A large-scale challenging video deepfake benchmark for generalizable forensics, 2025. 6, 8

  42. [42]

    Focal loss for dense object detection, 2018

    Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. Focal loss for dense object detection, 2018. 4

  43. [43]

    The First Challenge on Remote Sensing Infrared Image Super-Resolution at NTIRE 2026: Benchmark Results and Method Overview

    Kai Liu, Haoyang Yue, Zeli Lin, Zheng Chen, Jingkai Wang, Jue Gong, Radu Timofte, Yulun Zhang, et al. The First Challenge on Remote Sensing Infrared Image Super-Resolution at NTIRE 2026: Benchmark Results and Method Overview. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  44. [44]

    3D Restoration and Reconstruction in Adverse Conditions: RealX3D Challenge Results

    Shuhong Liu, Ziteng Cui, Chenyu Bao, Xuangeng Chu, Lin Gu, Bin Ren, Radu Timofte, Marcos V. Conde, et al. 3D Restoration and Reconstruction in Adverse Conditions: RealX3D Challenge Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  45. [45]

    NTIRE 2026 X- AIGC Quality Assessment Challenge: Methods and Results

    Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Qiang Hu, Jiezhang Cao, Yu Zhou, Wei Sun, Farong Wen, Zitong Xu, Yingjie Zhou, Huiyu Duan, Lu Liu, Jiarui Wang, Siqi Luo, Chunyi Li, Li Xu, Zicheng Zhang, Yue Shi, Yubo Wang, Minghong Zhang, Chunchao Guo, Zhichao Hu, Mingtao Chen, Xiele Wu, Xin Ma, Zhaohe Lv, Yuanhao Xue, Jiaqi Wang, Xinxing Sha, Radu Timofte,...

  46. [46]

    A convnet for the 2020s.2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11966–11976, 2022

    Zhuang Liu, Hanzi Mao, Chaozheng Wu, Christoph Feichtenhofer, Trevor Darrell, and Saining Xie. A convnet for the 2020s. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11966–11976, 2022. 9

  47. [47]

    A new approach to improve learning-based deepfake detection in realistic conditions, 2022

    Yuhang Lu and Touradj Ebrahimi. A new approach to improve learning-based deepfake detection in realistic conditions, 2022. 1

  48. [48]

    Ddl: A large-scale datasets for deepfake detection and localization in diversified real-world scenarios, 2025

    Changtao Miao, Yi Zhang, Weize Gao, Zhiya Tan, Weiwei Feng, Man Luo, Jianshu Li, Ajian Liu, Yunfeng Diao, Qi Chu, Tao Gong, Zhe Li, Weibin Yao, and Joey Tianyi Zhou. Ddl: A large-scale datasets for deepfake detection and localization in diversified real-world scenarios, 2025. 4, 6, 8

  49. [49]

    NTIRE 2026 Challenge on Video Saliency Prediction: Methods and Results

    Andrey Moskalenko, Alexey Bryncev, Ivan Kosmynin, Kira Shilovskaya, Mikhail Erofeev, Dmitry Vatolin, Radu Timofte, et al. NTIRE 2026 Challenge on Video Saliency Prediction: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  50. [50]

    Laa-net: Localized artifact attention network for quality-agnostic and generalizable deepfake detection

    Dat Nguyen, Nesryne Mejri, Inder Pal Singh, Polina Kuleshova, Marcella Astrid, Anis Kacem, Enjie Ghorbel, and Djamila Aouada. Laa-net: Localized artifact attention network for quality-agnostic and generalizable deepfake detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 17395–17405, 2024. 1, 3, 4

  51. [51]

    FSGAN: Subject agnostic face swapping and reenactment

    Yuval Nirkin, Yosi Keller, and Tal Hassner. FSGAN: Subject agnostic face swapping and reenactment. In Proceedings of the IEEE International Conference on Computer Vision, pages 7184–7193, 2019. 3

  52. [52]

    Dinov2: Learning robust visual features without supervision, 2024

    Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, ...

  [53] Hyunhee Park, Eunpil Park, Sangmin Lee, Radu Timofte, et al. NTIRE 2026 Challenge on Efficient Burst HDR and Restoration: Datasets, Methods, and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [54] Georgy Perevozchikov, Daniil Vladimirov, Radu Timofte, et al. NTIRE 2026 Challenge on Learned Smartphone ISP with Unpaired Data: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [55] Guanyi Qin, Jie Liang, Bingbing Zhang, Lishen Qu, Ya-nan Guan, Hui Zeng, Lei Zhang, Radu Timofte, et al. NTIRE 2026 The 3rd Restore Any Image Model (RAIM) Challenge: Professional Image Quality Assessment (Track 1). In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [56] Ziheng Qin, Yuheng Ji, Renshuai Tao, Yuxuan Tian, Yuyang Liu, Yipu Wang, and Xiaolong Zheng. Scaling up AI-generated image detection with generator-aware prototypes. arXiv preprint arXiv:2512.12982, 2025.

  [57] Xingyu Qiu, Yuqian Fu, Jiawei Geng, Bin Ren, Jiancheng Pan, Zongwei Wu, Hao Tang, Yanwei Fu, Radu Timofte, Nicu Sebe, Mohamed Elhoseiny, et al. The Second Challenge on Cross-Domain Few-Shot Object Detection at NTIRE 2026: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [58] Chenfan Qu, Lianwen Jin, Junchi Li, et al. Dino-mac: First-place winner solution of the CVPR 2026 robust deepfake detection challenge. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [59] Lishen Qu, Yao Liu, Jie Liang, Hui Zeng, Wen Dai, Ya-nan Guan, Guanyi Qin, Shihao Zhou, Jufeng Yang, Lei Zhang, Radu Timofte, et al. NTIRE 2026 The 3rd Restore Any Image Model (RAIM) Challenge: Multi-Exposure Image Fusion in Dynamic Scenes (Track 2). In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [60] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision, 2021.

  [61] Bin Ren, Hang Guo, Yan Shu, Jiaqi Ma, Ziteng Cui, Shuhong Liu, Guofeng Mei, Lei Sun, Zongwei Wu, Fahad Shahbaz Khan, Salman Khan, Radu Timofte, Yawei Li, et al. The Eleventh NTIRE 2026 Efficient Super-Resolution Challenge Report. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [62] Andreas Rössler, Davide Cozzolino, Luisa Verdoliva, Christian Riess, Justus Thies, and Matthias Nießner. FaceForensics++: Learning to detect manipulated facial images. In International Conference on Computer Vision (ICCV), 2019.

  [63] Shiori Sagawa, Pang Wei Koh, Tatsunori B. Hashimoto, and Percy Liang. Distributionally robust neural networks for group shifts: On the importance of regularization for worst-case generalization. arXiv preprint arXiv:1911.08731, 2019.

  [64] Tim Seizinger, Florin-Alexandru Vasluianu, Marcos V. Conde, Jeffrey Chen, Zhuyun Zhou, Zongwei Wu, Radu Timofte, et al. The First Controllable Bokeh Rendering Challenge at NTIRE 2026. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [65] Junyu Shi, Minghui Li, Junguo Zuo, Zhifei Yu, Yipeng Lin, Shengshan Hu, Ziqi Zhou, Yechao Zhang, Wei Wan, Yinzhe Xu, and Leo Yu Zhang. Towards real-world deepfake detection: A diverse in-the-wild dataset of forgery faces, 2025.

  [66] Kaede Shiohara and T. Yamasaki. Detecting deepfakes with self-blended images. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 18699–18708, 2022.

  [67] Oriane Siméoni, Huy V. Vo, Maximilian Seitzer, Federico Baldassarre, Maxime Oquab, Cijo Jose, Vasil Khalidov, Marc Szafraniec, Seungeun Yi, Michaël Ramamonjisoa, Francisco Massa, Daniel Haziza, Luca Wehrstedt, Jianyuan Wang, Timothée Darcet, Théo Moutakanni, Leonel Sentana, Claire Roberts, Andrea Vedaldi, Jamie Tolan, John Brandt, Camille Couprie, ...

  [68] Lei Sun, Hang Guo, Bin Ren, Shaolin Su, Xian Wang, Danda Pani Paudel, Luc Van Gool, Radu Timofte, Yawei Li, et al. The Third Challenge on Image Denoising at NTIRE 2026: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [69] Lei Sun, Weilun Li, Xian Wang, Zhendong Li, Letian Shi, Dannong Xu, Deheng Zhang, Mengshun Hu, Shuang Guo, Shaolin Su, Radu Timofte, Danda Pani Paudel, Luc Van Gool, et al. The Second Challenge on Event-Based Image Deblurring at NTIRE 2026: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [70] Lei Sun, Xiaolong Qian, Qi Jiang, Xian Wang, Yao Gao, Kailun Yang, Kaiwei Wang, Radu Timofte, Danda Pani Paudel, Luc Van Gool, et al. NTIRE 2026 The First Challenge on Blind Computational Aberration Correction: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [71] Mingxing Tan and Quoc V. Le. EfficientNet: Rethinking model scaling for convolutional neural networks. ArXiv, abs/1905.11946, 2019.

  [72] Radu Timofte, Rasmus Rothe, and Luc Van Gool. Seven ways to improve example-based single image super resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1865–1873, 2016.

  [73] Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, and Hervé Jégou. Training data-efficient image transformers & distillation through attention, 2021.

  [74] Florin-Alexandru Vasluianu, Tim Seizinger, Jeffrey Chen, Zhuyun Zhou, Zongwei Wu, Radu Timofte, et al. Learning-Based Ambient Lighting Normalization: NTIRE 2026 Challenge Results and Findings. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [75] Florin-Alexandru Vasluianu, Tim Seizinger, Zhuyun Zhou, Zongwei Wu, Radu Timofte, et al. Advances in Single-Image Shadow Removal: Results from the NTIRE 2026 Challenge. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [76] Gaojian Wang, Feng Lin, Tong Wu, Zhenguang Liu, Zhongjie Ba, and Kui Ren. FSFM: A generalizable face security foundation model via self-supervised facial representation learning. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 24364–24376, 2025.

  [77] Jingkai Wang, Jue Gong, Zheng Chen, Kai Liu, Jiatong Li, Yulun Zhang, Radu Timofte, et al. The Second Challenge on Real-World Face Restoration at NTIRE 2026: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [78] Longguang Wang, Yulan Guo, Yingqian Wang, Juncheng Li, Sida Peng, Ye Zhang, Radu Timofte, Minglin Chen, Yi Wang, Qibin Hu, Wenjie Lei, et al. NTIRE 2026 Challenge on 3D Content Super-Resolution: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  [79] Sheng-Yu Wang, Oliver Wang, Richard Zhang, Andrew Owens, and Alexei A. Efros. CNN-generated images are surprisingly easy to spot... for now. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 8692–8701, 2020.

  [80] Tongzhou Wang and Phillip Isola. Understanding contrastive representation learning through alignment and uniformity on the hypersphere. ArXiv, abs/2005.10242, 2020.

Showing first 80 references.