Exposing and Mitigating Temporal Attack in Deepfake Video Detection
Pith reviewed 2026-05-11 01:52 UTC · model grok-4.3
The pith
Deepfake video detectors overfit to fragile temporal spectrum cues and can be evaded by spectral attacks, while SpInShield forces reliance on stable semantic motion instead.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Spatiotemporal deepfake detectors achieve high AUC scores yet remain susceptible to evasion because they overfit to fragile temporal spectrum cues instead of learning robust semantic causality. SpInShield addresses this by decoupling semantic motion from manipulatable spectral artifacts. A learnable spectral adversary dynamically synthesizes severe spectral deformations to simulate extreme attacks, and a shortcut suppression optimization compels the encoder to extract reliable forensic cues while purging unstable spectral statistics from the latent space.
What carries the argument
The learnable spectral adversary, which dynamically generates severe spectral deformations to mimic extreme attacks, paired with shortcut suppression optimization that removes unstable spectral statistics from the latent representation.
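To make the attack surface concrete, the sketch below shows a fixed (non-learnable) amplitude-spectrum perturbation along the time axis, written in dependency-free Python. It is an illustrative assumption, not the paper's learnable adversary: the names `dft`, `idft`, and `amplitude_attack`, the per-frequency `gains` vector, and the single-pixel time series are all hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive O(T^2) discrete Fourier transform of a real or complex sequence."""
    T = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / T) for t in range(T))
            for k in range(T)]


def idft(X):
    """Inverse of dft()."""
    T = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / T) for k in range(T)) / T
            for t in range(T)]


def amplitude_attack(signal, gains):
    """Rescale the temporal amplitude spectrum while leaving phase untouched.

    signal: real-valued intensity of one pixel across T frames.
    gains:  hypothetical multipliers for frequency bins 0..T//2; they are
            mirrored onto the negative frequencies so the output stays real.
    """
    T = len(signal)
    X = dft(signal)
    perturbed = []
    for k in range(T):
        g = gains[min(k, T - k)]  # same gain for bin k and its mirror T-k
        amp, phase = abs(X[k]), cmath.phase(X[k])
        perturbed.append(cmath.rect(amp * g, phase))
    return [v.real for v in idft(perturbed)]


# Toy 8-frame signal: a slow sinusoid plus a weaker fast one (illustrative only).
frames = [math.sin(2 * math.pi * t / 8) + 0.2 * math.sin(2 * math.pi * 3 * t / 8)
          for t in range(8)]
attacked = amplitude_attack(frames, [1, 1, 1, 0, 1])  # zero out bin 3 only
```

In this toy case the gain vector removes the fast component and preserves the phase of everything that remains, so the visible motion pattern is largely intact while the amplitude statistics change. A detector keyed to those statistics may be sensitive to exactly this kind of edit.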
If this is right
- Models trained under SpInShield retain competitive AUC on standard deepfake datasets while showing substantially higher resistance to amplitude spectral attacks.
- The encoder is forced to prioritize semantic motion causality over any spectral shortcuts that can be altered by an adversary.
- The same training procedure can be applied to other video-based forensic tasks that currently rely on fragile frequency-domain cues.
- Detectors become harder to evade because attackers must now alter the underlying semantic content rather than just the spectral profile.
Where Pith is reading between the lines
- Similar spectral vulnerabilities are likely present in other video understanding models that process motion, such as action recognition systems.
- The defense suggests that any detection method relying on frequency statistics should be re-examined for shortcut learning before deployment.
- Real-world validation would require applying the method to deepfakes generated by unknown future manipulation techniques rather than only simulated attacks.
- The approach could extend to audio or multimodal deepfakes if analogous spectral instabilities exist in those domains.
Load-bearing premise
That the simulated spectral deformations accurately represent real attacker capabilities and that removing unstable spectral statistics leaves behind all the forensic information the detector actually needs.
What would settle it
A test set of deepfake videos subjected to real amplitude spectral modifications where SpInShield's AUC falls to the level of the strongest undefended baseline.
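The comparison described above reduces to measuring an AUC gap between two detectors on the same attacked test set. The following is a minimal sketch of that measurement, assuming label 1 means fake and higher scores mean "more likely fake"; the helper names `roc_auc` and `auc_gap_pp` and the toy scores are hypothetical and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random fake outscores a random real.

    Ties between a fake and a real score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_gap_pp(labels, scores_a, scores_b):
    """AUC of detector A minus AUC of detector B, in percentage points."""
    return 100.0 * (roc_auc(labels, scores_a) - roc_auc(labels, scores_b))


# Toy values for illustration only; they are not results from any experiment.
labels = [1, 1, 0, 0]
defended = [0.9, 0.8, 0.2, 0.1]
baseline = [0.9, 0.1, 0.8, 0.2]
gap = auc_gap_pp(labels, defended, baseline)
```

Under this framing, a gap close to zero on videos carrying independently produced amplitude spectral modifications would be the outcome that undermines the claim.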
While spatiotemporal deepfake detectors achieve high AUC, our experiments reveal their susceptibility to evasion attacks. These models tend to overfit on fragile temporal spectrum cues, rather than learning robust semantic causality. To mitigate this vulnerability, we propose SpInShield, a temporal spectral-invariant defense framework explicitly designed to decouple semantic motion from manipulatable spectral artifacts. We propose a learnable spectral adversary that dynamically synthesizes severe spectral deformations, simulating extreme attack scenarios. By employing a shortcut suppression optimization strategy, SpInShield compels the encoder to extract reliable forensic cues while purging unstable spectral statistics from the latent space. Experiments show that SpInShield obtains competitive performance on widely used datasets and outperforms the strongest baseline by 21.30 percentage points in AUC under simulated amplitude spectral attacks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that spatiotemporal deepfake detectors overfit to fragile temporal spectrum cues rather than robust semantic features, making them vulnerable to evasion attacks. It proposes SpInShield, a temporal spectral-invariant defense that introduces a learnable spectral adversary to dynamically synthesize severe amplitude spectral deformations during training, combined with a shortcut suppression optimization to purge unstable spectral statistics from the latent space. The method is reported to achieve competitive performance on standard deepfake datasets while delivering a 21.30 percentage point AUC gain over the strongest baseline specifically under the simulated attacks generated by this adversary.
Significance. If the simulated attacks faithfully represent the distribution of real evasion attacks that deepfake generators can produce, SpInShield could offer a practical framework for building more robust detectors by enforcing invariance to manipulatable spectral artifacts. The learnable adversary approach for simulating extreme scenarios is a potentially useful training-time augmentation technique, though its value depends on independent validation beyond the training distribution.
major comments (3)
- [Abstract / Experimental Evaluation] Abstract and experimental results: the headline 21.30 pp AUC improvement is reported exclusively under 'simulated amplitude spectral attacks' generated by the same learnable spectral adversary used during training. This creates a potential circularity risk; the evaluation does not demonstrate robustness against independent real-world temporal manipulations or fixed non-learnable attacks, so the gain may reflect overfitting to the adversary's output distribution rather than genuine invariance.
- [Method / Shortcut Suppression Optimization] Shortcut suppression strategy: the description of 'purging unstable spectral statistics' lacks an explicit, reproducible definition or criterion (e.g., variance threshold, gradient norm, or statistical test). Without this and an ablation confirming that the purged features do not contain stable forensic cues under other perturbations, it remains unclear whether useful detection signal is being discarded.
- [Experiments] Experimental setup: the abstract and results provide no details on the specific baselines compared, the datasets and attack parameters used to train/validate the learnable adversary, or how the 'widely used datasets' were split for the robustness experiments. This limits verification of the central claim and reproducibility.
minor comments (2)
- [Method] Notation for spectral components (e.g., amplitude vs. phase) could be clarified with explicit equations or diagrams to avoid ambiguity in the temporal spectrum discussion.
- [Abstract] The abstract mentions 'competitive performance on widely used datasets' but does not name the datasets or report the corresponding AUC numbers; adding a summary table would improve clarity.
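One way the amplitude and phase notation requested in the minor comments could be written down is shown here. This is standard discrete Fourier notation along the time axis, offered as an assumption about what the paper intends; the symbols $x_p$, $A_p$, $\phi_p$, and $g$ are not taken from the manuscript.

```latex
% Temporal DFT of pixel (or feature) p over T frames, split into amplitude and phase
X_p[k] = \sum_{t=0}^{T-1} x_p[t]\, e^{-2\pi i k t / T}
       = A_p[k]\, e^{i \phi_p[k]},
\qquad A_p[k] = \lvert X_p[k] \rvert,
\quad \phi_p[k] = \arg X_p[k].

% An amplitude-only perturbation rescales A_p and leaves \phi_p unchanged
\tilde{X}_p[k] = g[k]\, A_p[k]\, e^{i \phi_p[k]},
\qquad g[k] \ge 0,
\quad g[k] = g[T-k].
```

The symmetry condition on $g$ keeps the perturbed sequence real-valued after the inverse transform.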
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback. We address each major comment below and commit to revisions that enhance the clarity, reproducibility, and strength of our claims without altering the core contributions.
- Referee: [Abstract / Experimental Evaluation] Abstract and experimental results: the headline 21.30 pp AUC improvement is reported exclusively under 'simulated amplitude spectral attacks' generated by the same learnable spectral adversary used during training. This creates a potential circularity risk; the evaluation does not demonstrate robustness against independent real-world temporal manipulations or fixed non-learnable attacks, so the gain may reflect overfitting to the adversary's output distribution rather than genuine invariance.
  Authors: We acknowledge the validity of the circularity concern. The learnable adversary is deliberately trained to generate severe deformations as a worst-case training augmentation, and the reported gain under its distribution validates the shortcut-suppression objective. However, this does not fully substitute for evaluation on independent attacks. In the revision we will add results on fixed (non-learnable) amplitude spectral perturbations and at least one additional temporal manipulation method drawn from the literature, using the same evaluation protocol. These new experiments will be reported alongside the existing adversary-based results.
  Revision: yes
- Referee: [Method / Shortcut Suppression Optimization] Shortcut suppression strategy: the description of 'purging unstable spectral statistics' lacks an explicit, reproducible definition or criterion (e.g., variance threshold, gradient norm, or statistical test). Without this and an ablation confirming that the purged features do not contain stable forensic cues under other perturbations, it remains unclear whether useful detection signal is being discarded.
  Authors: We agree that the current description of the shortcut suppression optimization is insufficiently precise. The revised manuscript will include the exact loss formulation, the criterion used to identify unstable spectral statistics (a variance-based threshold computed over the batch in the frequency domain), and the optimization schedule. We will also add an ablation that measures detection performance when the suppression term is removed or replaced by random feature dropout, under both the original and additional perturbation sets, to confirm that stable forensic cues are retained.
  Revision: yes
- Referee: [Experiments] Experimental setup: the abstract and results provide no details on the specific baselines compared, the datasets and attack parameters used to train/validate the learnable adversary, or how the 'widely used datasets' were split for the robustness experiments. This limits verification of the central claim and reproducibility.
  Authors: We accept this criticism. The revised experimental section will explicitly list all baselines with their original references and hyper-parameters, name the datasets (FaceForensics++, Celeb-DF, DFDC) together with the exact train/validation/test splits and preprocessing, and provide the full training protocol and hyper-parameters for the learnable spectral adversary (including deformation severity ranges and optimization settings). All robustness experiments will be described with the same level of detail.
  Revision: yes
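The authors' reply on shortcut suppression mentions "a variance-based threshold computed over the batch in the frequency domain" without giving the formula. The sketch below is one plausible reading of that phrase, not the authors' criterion: it computes each clip's temporal amplitude spectrum and flags the frequency bins whose amplitude varies most across the batch. The function names, the `threshold` value, and the toy batch are hypothetical.

```python
import cmath
import math
from statistics import pvariance


def amplitude_spectrum(x):
    """Magnitudes of the non-redundant DFT bins 0..T//2 of a real sequence."""
    T = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / T)
                    for t in range(T)))
            for k in range(T // 2 + 1)]


def unstable_bins(batch, threshold):
    """Flag frequency bins whose amplitude varies strongly across the batch.

    batch:     list of equal-length real sequences, one per clip.
    threshold: hypothetical cutoff on the per-bin population variance.
    Returns one boolean per bin, True where the bin is treated as unstable.
    """
    spectra = [amplitude_spectrum(x) for x in batch]
    per_bin = zip(*spectra)  # regroup: one tuple of amplitudes per frequency
    return [pvariance(vals) > threshold for vals in per_bin]


# Toy batch: identical slow motion, but the fast component differs per clip.
batch = [[math.sin(2 * math.pi * t / 8) + a * math.sin(2 * math.pi * 3 * t / 8)
          for t in range(8)]
         for a in (0.0, 0.5, 1.0)]
mask = unstable_bins(batch, threshold=0.1)
```

Here only bin 3 is flagged, because its amplitude changes from clip to clip while the slow component stays constant. Whether a mask of this kind also discards stable forensic cues is the question the promised ablation would need to answer.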
Circularity Check
No significant circularity in derivation or evaluation chain
The paper introduces SpInShield with a learnable spectral adversary for training and reports performance gains under the resulting simulated attacks. This follows standard adversarial training and evaluation protocols without reducing claims to definitional equivalence or fitted inputs by construction. No equations, self-citations, or uniqueness theorems are invoked in the provided text that would force the central results (competitive AUC on standard datasets and +21.30 pp under simulated attacks) to collapse into the method's own inputs. The experimental comparisons remain independent of any self-referential loop.
Axiom & Free-Parameter Ledger
invented entities (1)
- learnable spectral adversary: no independent evidence