arxiv: 2504.18015 · v4 · submitted 2025-04-25 · 💻 cs.CR · cs.CV · cs.LG

DiffMI: Breaking Face Recognition Privacy via Diffusion-Driven Training-Free Model Inversion

Pith reviewed 2026-05-22 18:43 UTC · model grok-4.3

classification 💻 cs.CR · cs.CV · cs.LG
keywords model inversion · face recognition · diffusion models · training-free attack · biometric privacy · adversarial refinement · latent code initialization · privacy attack

The pith

DiffMI shows that diffusion models can invert face recognition embeddings to reconstruct identities without any target-specific training.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces DiffMI as a diffusion-driven training-free model inversion attack on face recognition systems. It uses a pipeline that starts with robust latent code initialization, applies ranked adversarial refinement, and optimizes with a confidence-aware objective to recover faces from embeddings. This works directly on unseen identities and models, unlike prior methods that often need per-target training. A sympathetic reader would care because it indicates that mapping faces to embeddings may not fully protect biometric privacy against reconstruction attacks.

Core claim

DiffMI is the first diffusion-driven training-free model inversion attack. It combines robust latent code initialization, a ranked adversarial refinement strategy, and a statistically grounded confidence-aware optimization objective applied to a diffusion model. The method applies directly to unseen target identities and face recognition models, achieves 84.42%--92.87% attack success rates against inversion-resilient systems, outperforms the best prior training-free GAN-based approach by 4.01%--9.82%, and reduces computational overhead compared to training-dependent approaches.

What carries the argument

The DiffMI pipeline of robust latent code initialization, ranked adversarial refinement, and confidence-aware optimization objective, applied within a diffusion model to perform model inversion from embeddings.
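
As a rough illustration of the initialization idea, the sketch below ranks a pool of candidate embeddings by cosine similarity to a target embedding and keeps the best few, which is how the Figure 5 caption describes top-N latent code selection. It is not the authors' implementation: the function names `cosine` and `top_n_candidates`, the pool size, the embedding dimension, and N are assumptions chosen for the example, and random vectors stand in for the embeddings a face recognition model would produce from generated faces.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_n_candidates(candidate_embeddings, target_embedding, n):
    """Return indices of the n candidates most similar to the target, best first."""
    scored = [(cosine(e, target_embedding), i)
              for i, e in enumerate(candidate_embeddings)]
    scored.sort(reverse=True)
    return [i for _, i in scored[:n]]


# Toy data (assumption): random vectors replace real face embeddings.
rng = random.Random(0)
target = [rng.gauss(0, 1) for _ in range(8)]
candidates = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(1000)]
best = top_n_candidates(candidates, target, n=5)
```

In a real pipeline the selected indices would point back to the latent codes that generated those candidates, and only those codes would go on to the refinement stage.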

If this is right

  • Reconstructions of nuanced or unseen identities become feasible without target-specific training or fine-tuning.
  • The approach offers greater adaptability to new face recognition models than training-dependent inversion methods.
  • Computational overhead is lower than for training-dependent approaches, and success rates are higher than for the best prior training-free GAN-based attack.
  • Privacy vulnerabilities in systems that rely on facial embeddings for protection are more accessible than previously demonstrated with GAN methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same initialization and refinement steps could be tested on other embedding-based systems such as speaker recognition to see if diffusion models generalize as inversion tools.
  • Defenders might add diffusion-resistant noise during embedding computation as a direct counter to this style of attack.
  • Success on unseen targets suggests that future inversion methods may shift emphasis from collecting training data to designing better latent-space controls.

Load-bearing premise

The diffusion model and proposed pipeline can faithfully reconstruct nuanced or unseen identities without any target-specific training or fine-tuning.

What would settle it

Applying DiffMI to a previously untested face recognition model and a set of unseen identities, then checking whether the generated images are accepted as matches by the target system at rates near the reported success range, would test the generalization claim.
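
The check described above reduces to a simple count. The sketch below computes an attack success rate as the fraction of reconstructions whose embedding matches the target embedding at or above a verification threshold. The function names, the toy two-dimensional embeddings, and the threshold of 0.9 are assumptions for illustration; a real evaluation would use the target system's own embeddings and its operating threshold.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def attack_success_rate(reconstructed, targets, threshold):
    """Fraction of reconstructions accepted as a match for their target."""
    if len(reconstructed) != len(targets):
        raise ValueError("need one reconstruction per target")
    accepted = sum(1 for r, t in zip(reconstructed, targets)
                   if cosine(r, t) >= threshold)
    return accepted / len(targets)


# Toy embeddings (assumption): three close reconstructions and one failure.
targets = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
reconstructed = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.9], [-1.0, 1.0]]
asr = attack_success_rate(reconstructed, targets, threshold=0.9)
print(asr)  # 0.75
```

Running the same count on a face recognition model and identity set that the paper did not evaluate, and comparing the result with the reported 84.42%–92.87% range, is the generalization test in concrete form.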

Figures

Figures reproduced from arXiv: 2504.18015 by Chun-Shien Lu, Hanrui Wang, Isao Echizen, Shuo Wang.

Figure 1: The threat of model inversion against embedding-based …
Figure 2: Framework of DiffMI, which reconstructs a facial image sharing the same identity as a private face solely from its …
Figure 3: Optimization convergence on the target model and …
Figure 4: Two-stage latent code generation. First, …
Figure 5: Selection of the top N latent codes based on embedding similarity to the target identity …
Figure 6: Ranked Adversary algorithm for latent code manipulation. The top …
Figure 7: Example of difficult identity selection in the user …
Figure 8: Visual comparison of our DiffMI and MAP2V [14] in the white-box setting. DiffMI achieves higher identity recovery accuracy and produces full headshot-style reconstructions. “Initial” refers to outputs before manipulation. The target model is the inversion-resilient PartialFace [13]. (Extracted panel labels: DSCasConv, Shahreza et al., MAP2V, Ours, DeconvNet, StyleGAN, Random, Top 1/1000, DDPM, Target, APGD + DDPM.)
Figure 9: Visual fidelity comparison across all baselines. Both …
Figure 10: Performance of DiffMI in the black-box setting. Compared to its white-box counterpart (columns 3 …
Figure 12: Impact of adversarial manipulation on Gaussian …
Figure 13: Examples of failed face generation from randomly …
Figure 14: Example where the latent code with the best initial …
Figure 15: Larger perturbations introduce stronger artifacts, in …
Original abstract

Face recognition poses serious privacy risks due to its reliance on sensitive and immutable biometric data. While modern systems mitigate privacy risks by mapping facial images to embeddings (commonly regarded as privacy-preserving), model inversion attacks reveal that identity information can still be recovered, exposing critical vulnerabilities. However, existing attacks are often computationally expensive and lack generalization, especially those requiring target-specific training. Even training-free approaches suffer from limited identity controllability, hindering faithful reconstruction of nuanced or unseen identities. In this work, we propose DiffMI, the first diffusion-driven, training-free model inversion attack. DiffMI introduces a novel pipeline combining robust latent code initialization, a ranked adversarial refinement strategy, and a statistically grounded, confidence-aware optimization objective. DiffMI applies directly to unseen target identities and face recognition models, offering greater adaptability than training-dependent approaches while significantly reducing computational overhead. Our method achieves 84.42%--92.87% attack success rates against inversion-resilient systems and outperforms the best prior training-free GAN-based approach by 4.01%--9.82%. The implementation is available at https://github.com/azrealwang/DiffMI.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces DiffMI, the first diffusion-driven training-free model inversion attack against face recognition (FR) systems. It proposes a pipeline consisting of robust latent code initialization, ranked adversarial refinement, and a confidence-aware optimization objective. The central empirical claim is that DiffMI achieves 84.42%–92.87% attack success rates (ASR) on inversion-resilient FR models and outperforms the strongest prior training-free GAN-based baseline by 4.01%–9.82%, while generalizing directly to unseen target identities without any target-specific training or fine-tuning.

Significance. If the reported ASRs are reproducible and the reconstructions are shown to arise from true inversion rather than memorization of identities present in the diffusion model’s pre-training corpus, the work would constitute a meaningful advance in biometric privacy research. It would demonstrate that modern diffusion models can serve as powerful, training-free oracles for embedding inversion, thereby strengthening the case for more robust embedding protections and motivating new defense strategies. The public code release is a clear positive for reproducibility.

major comments (3)
  1. [Experiments / Evaluation] Experiments / Evaluation section: The headline ASR range (84.42%–92.87%) and the 4.01%–9.82% improvement margin are presented without any reported decontamination procedure, identity-overlap audit, or out-of-distribution (OOD) identity test between the diffusion pre-training corpus and the FR benchmark test splits. This directly bears on the central claim that DiffMI reconstructs “nuanced or unseen identities” solely from the target embedding; without such controls, the results remain compatible with generative memorization.
  2. [Abstract and §4] Abstract and §4 (results): The manuscript states concrete success rates and outperformance margins but supplies no information on dataset splits, number of identities per split, exact baseline implementations, or statistical significance testing. These omissions make it impossible to assess whether the reported margins are robust or sensitive to evaluation choices.
  3. [Method] Method description (pipeline): The claim that the “statistically grounded, confidence-aware optimization objective” enables faithful reconstruction of unseen identities is not accompanied by an ablation that isolates the contribution of each component (latent initialization, ranked refinement, confidence term) under a controlled OOD identity regime. This weakens the causal link between the proposed pipeline and the generalization result.
minor comments (2)
  1. [Method] Notation for the ranked adversarial refinement step is introduced without an explicit algorithmic listing or pseudocode, making the ranking criterion difficult to reproduce from the text alone.
  2. [Figures] Figure captions for the qualitative reconstruction examples do not indicate whether the shown identities were part of the diffusion pre-training data or held out.
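
Major comment 2 asks for statistical significance testing of the reported margins. One way such a test could look, when each identity yields a paired success/failure outcome for two attacks, is an exact McNemar test on the discordant pairs. This is a swapped-in illustration and not a test named in the report or the paper; the function name `mcnemar_exact` and the toy outcome lists are assumptions.

```python
from math import comb


def mcnemar_exact(success_a, success_b):
    """Two-sided exact McNemar test on paired binary outcomes.

    Returns (a_only, b_only, p_value): a_only counts pairs where only
    method A succeeded, b_only counts pairs where only method B did.
    """
    if len(success_a) != len(success_b):
        raise ValueError("outcomes must be paired")
    a_only = sum(1 for a, b in zip(success_a, success_b) if a and not b)
    b_only = sum(1 for a, b in zip(success_a, success_b) if b and not a)
    n = a_only + b_only
    if n == 0:
        return a_only, b_only, 1.0  # no discordant pairs, no evidence
    k = min(a_only, b_only)
    # Under the null hypothesis each discordant pair favors A or B
    # with probability 1/2, so the smaller count is Binomial(n, 1/2).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return a_only, b_only, min(1.0, 2 * tail)


# Toy outcomes (assumption) for 12 identities:
# both succeed x3, A only x6, B only x1, neither x2.
a = [True] * 3 + [True] * 6 + [False] * 1 + [False] * 2
b = [True] * 3 + [False] * 6 + [True] * 1 + [False] * 2
a_only, b_only, p = mcnemar_exact(a, b)
print(a_only, b_only, p)  # 6 1 0.125
```

Only the identities on which the two attacks disagree carry information here, which is why a gap of a few percentage points can fail to reach significance on a small test split.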

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below, providing clarifications and committing to revisions that will strengthen the empirical claims regarding true inversion and evaluation rigor.

Point-by-point responses
  1. Referee: [Experiments / Evaluation] Experiments / Evaluation section: The headline ASR range (84.42%–92.87%) and the 4.01%–9.82% improvement margin are presented without any reported decontamination procedure, identity-overlap audit, or out-of-distribution (OOD) identity test between the diffusion pre-training corpus and the FR benchmark test splits. This directly bears on the central claim that DiffMI reconstructs “nuanced or unseen identities” solely from the target embedding; without such controls, the results remain compatible with generative memorization.

    Authors: We agree that ruling out memorization is essential to substantiate the inversion claim. Although DiffMI operates in a training-free manner on standard FR benchmarks (e.g., LFW, CelebA) whose identities are not explicitly part of the diffusion model's pre-training objective, the original manuscript did not include an explicit identity-overlap audit or OOD verification. In the revised version, we will add a dedicated decontamination analysis: we will use face recognition tools to audit overlaps with LAION-5B (the primary corpus for models like Stable Diffusion), report the overlap statistics, and evaluate on a curated OOD subset of identities confirmed absent from the pre-training data. This will directly support that reconstructions arise from embedding inversion rather than memorization. revision: yes

  2. Referee: [Abstract and §4] Abstract and §4 (results): The manuscript states concrete success rates and outperformance margins but supplies no information on dataset splits, number of identities per split, exact baseline implementations, or statistical significance testing. These omissions make it impossible to assess whether the reported margins are robust or sensitive to evaluation choices.

    Authors: We acknowledge that additional experimental details are necessary for reproducibility and robustness assessment. We will revise the abstract and Section 4 to explicitly report: the dataset splits (including exact numbers of identities and images per split for each FR model and attack evaluation), precise descriptions of baseline implementations (including code references, any re-implementation choices, and hyperparameter settings), and statistical significance testing (e.g., paired t-tests or Wilcoxon signed-rank tests with p-values) for the reported ASR improvements. These additions will allow readers to evaluate the sensitivity of the 4.01%–9.82% margins. revision: yes

  3. Referee: [Method] Method description (pipeline): The claim that the “statistically grounded, confidence-aware optimization objective” enables faithful reconstruction of unseen identities is not accompanied by an ablation that isolates the contribution of each component (latent initialization, ranked refinement, confidence term) under a controlled OOD identity regime. This weakens the causal link between the proposed pipeline and the generalization result.

    Authors: We will incorporate a new ablation study in the revised manuscript to isolate the contribution of each component under a controlled OOD regime. Specifically, we will evaluate variants with ablated latent code initialization, ranked adversarial refinement, and the confidence-aware term, using only identities verified as out-of-distribution relative to the diffusion pre-training corpus. Results will include ASR, reconstruction quality metrics, and qualitative examples, thereby establishing the causal role of the full pipeline in enabling generalization to unseen identities. revision: yes
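
The identity-overlap audit promised in the first response could be approximated by a nearest-neighbor check in embedding space. The sketch below flags a benchmark identity as a suspected overlap when its embedding is at least as similar as a threshold to any embedding from the pre-training corpus, and treats the rest as held out. It is an assumption-laden toy, not the authors' procedure: the function names, the three-dimensional embeddings, and the threshold of 0.9 are hypothetical, and a real audit over a web-scale corpus would need an approximate nearest-neighbor index instead of this brute-force loop.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def audit_identity_overlap(benchmark, corpus, threshold):
    """Split benchmark identities into suspected overlaps and held-out ones.

    benchmark maps an identity name to its embedding; corpus is a list of
    embeddings computed from faces in the pre-training data.
    """
    overlapping, held_out = [], []
    for name, embedding in benchmark.items():
        best = max(cosine(embedding, c) for c in corpus)
        if best >= threshold:
            overlapping.append(name)
        else:
            held_out.append(name)
    return overlapping, held_out


# Toy embeddings (assumption): id_a nearly duplicates a corpus face.
corpus = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
benchmark = {
    "id_a": [0.99, 0.05, 0.0],
    "id_b": [0.0, 0.0, 1.0],
    "id_c": [0.6, 0.6, 0.5],
}
overlapping, held_out = audit_identity_overlap(benchmark, corpus, threshold=0.9)
print(overlapping, held_out)  # ['id_a'] ['id_b', 'id_c']
```

The held-out list is what an out-of-distribution evaluation or ablation would then be restricted to.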

Circularity Check

0 steps flagged

Empirical attack evaluation with external benchmarks; no definitional or self-citation circularity

full rationale

The paper describes a training-free diffusion-based model inversion pipeline evaluated via attack success rates on external face recognition models and prior published baselines. No equations, predictions, or first-principles claims are presented that reduce reported metrics to quantities defined by the method's own parameters or self-citations. Performance is measured independently against inversion-resilient systems, making the central results falsifiable outside the paper's fitted values or internal definitions. Minor self-citations, if present, are not load-bearing for the headline ASR figures.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that off-the-shelf diffusion models can be steered to produce identity-faithful reconstructions via the described initialization and optimization steps; no new entities are postulated and no free parameters are explicitly fitted in the abstract.

axioms (1)
  • domain assumption: Diffusion models pretrained on general face data can be used directly for inversion of unseen target identities without fine-tuning.
    Invoked by the training-free claim and the statement that the method applies directly to unseen identities.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. From Measurement to Mitigation: Quantifying and Reducing Identity Leakage in Image Representation Encoders with Linear Subspace Removal

    cs.CV 2026-04 conditional novelty 7.0

    Visual encoders leak identity information; a one-shot linear subspace removal method (ISP) reduces leakage to near-chance levels while retaining high non-biometric utility across datasets.

Reference graph

Works this paper leans on

53 extracted references · 53 canonical work pages · cited by 1 Pith paper

  1. Office of Inspector General, Department of Homeland Security, “Review of cbp’s major cybersecurity incident during a 2019 biometric pilot,” Tech. Rep. OIG-20-71, U.S. Department of Homeland Security, Office of Inspector General, Sept. 2020. Accessed: 2025-08-12.
  2. J. Taylor, “Major breach found in biometrics system used by banks, uk police and defence firms,” Aug. 2019. The Guardian. Accessed: 2025-08-12.
  3. Fortune Staff, “The shanghai data leak shows china’s state surveillance was inevitable,” July 2022. Fortune. Accessed: 2025-08-12.
  4. C. Burt, “Reported australian biometric data breach prompts arrest and hysteria,” May 2024. Biometric Update. Accessed: 2025-08-12.
  5. P. A. Thomas, “Data leak exposes personal data of indian military and police,” May 2024. CSO Online. Accessed: 2025-08-12.
  6. Microsoft, Face API Reference, 2024. Azure AI Services Documentation.
  7. Amazon Web Services, Amazon Rekognition Documentation, 2024. AWS Documentation.
  8. Google Cloud, Cloud Vision Documentation, 2024. Google Cloud Documentation.
  9. Face++, Face++ Documentation, 2024. Face++ Official Documentation.

  10. F. Schroff, D. Kalenichenko, and J. Philbin, “Facenet: A unified embedding for face recognition and clustering,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (Boston, Massachusetts, USA), pp. 815–823, IEEE, 2015.
  11. J. Deng, J. Guo, N. Xue, and S. Zafeiriou, “Arcface: Additive angular margin loss for deep face recognition,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (Long Beach, CA, USA), pp. 4690–4699, IEEE, 2019.
  12. J. Ji, H. Wang, Y. Huang, J. Wu, X. Xu, S. Ding, S. Zhang, L. Cao, and R. Ji, “Privacy-preserving face recognition with learnable privacy budgets in frequency domain,” in European Conference on Computer Vision (ECCV), (Tel Aviv, Israel), pp. 475–491, Springer, 2022.
  13. Y. Mi, Y. Huang, J. Ji, M. Zhao, J. Wu, X. Xu, S. Ding, and S. Zhou, “Privacy-preserving face recognition using random frequency components,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 19673–19684, IEEE, 2023.
  14. H. Zhang, X. Dong, Y. Lai, Y. Zhou, X. Zhang, X. Lv, Z. Jin, and X. Li, “Validating privacy-preserving face recognition under a minimum assumption,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (Seattle, Washington, United States), pp. 12205–12214, IEEE, 2024.
  15. H. O. Shahreza, V. K. Hahn, and S. Marcel, “Vulnerability of state-of-the-art face recognition models to template inversion attack,” IEEE Transactions on Information Forensics and Security, 2024.
  16. H. Otroshi Shahreza and S. Marcel, “Face reconstruction from facial templates by learning latent space of a generator network,” Advances in Neural Information Processing Systems (NIPS), vol. 36, 2023.
  17. H. O. Shahreza and S. Marcel, “Template inversion attack using synthetic face images against real face recognition systems,” IEEE Transactions on Biometrics, Behavior, and Identity Science, 2024.
  18. X. Dong, Z. Miao, L. Ma, J. Shen, Z. Jin, Z. Guo, and A. B. J. Teoh, “Reconstruct face from features based on genetic algorithm using gan generator as a distribution constraint,” Computers & Security, vol. 125, p. 103026, 2023.

  19. M. Fredrikson, S. Jha, and T. Ristenpart, “Model inversion attacks that exploit confidence information and basic countermeasures,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (ACM CCS), pp. 1322–1333, 2015.
  20. G. Mai, K. Cao, P. C. Yuen, and A. K. Jain, “On the reconstruction of face images from deep face templates,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 5, pp. 1188–1202, 2018.
  21. Z. Zhang, X. Wang, J. Huang, and S. Zhang, “Analysis and utilization of hidden information in model inversion attacks,” IEEE Transactions on Information Forensics and Security, 2023.
  22. C. N. Duong, T.-D. Truong, K. Luu, K. G. Quach, H. Bui, and K. Roy, “Vec2face: Unveil human faces from their blackbox features in face recognition,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6132–6141, 2020.
  23. Y. Zhang, R. Jia, H. Pei, W. Wang, B. Li, and D. Song, “The secret revealer: Generative model-inversion attacks against deep neural networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 253–261, 2020.
  24. M. Khosravy, K. Nakamura, Y. Hirose, N. Nitta, and N. Babaguchi, “Model inversion attack by integration of deep generative models: Privacy-sensitive face generation from a face recognition system,” IEEE Transactions on Information Forensics and Security, vol. 17, pp. 357–372, 2022.
  25. X. Yuan, K. Chen, J. Zhang, W. Zhang, N. Yu, and Y. Zhang, “Pseudo label-guided model inversion attack via conditional generative adversarial network,” in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), vol. 37, pp. 3349–3357, 2023.
  26. B.-N. Nguyen, K. Chandrasegaran, M. Abdollahzadeh, and N.-M. M. Cheung, “Label-only model inversion attacks via knowledge transfer,” Advances in Neural Information Processing Systems (NIPS), vol. 36, 2023.
  27. J. Wu, C. Wan, H. Chen, Z. Zheng, and Y. Sun, “Label-only model inversion attacks: Adaptive boundary exclusion for limited queries,” Neurocomputing, p. 129902, 2025.

  28. M. Kansy, A. Raël, G. Mignone, J. Naruniec, C. Schroers, M. Gross, and R. M. Weber, “Controllable inversion of black-box face recognition models via diffusion,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, pp. 3167–3177, 2023.
  29. R. Liu, D. Wang, Y. Ren, Z. Wang, K. Guo, Q. Qin, and X. Liu, “Unstoppable attack: Label-only model inversion via conditional diffusion model,” IEEE Transactions on Information Forensics and Security, 2024.
  30. L. Struppek, D. Hintersdorf, A. D. A. Correira, A. Adler, and K. Kersting, “Plug & play attacks: Towards robust and flexible model inversion attacks,” in International Conference on Machine Learning (ICML), pp. 20522–20545, PMLR, 2022.
  31. Y. Qiu, H. Fang, H. Yu, B. Chen, M. Qiu, and S.-T. Xia, “A closer look at gan priors: Exploiting intermediate features for enhanced model inversion attacks,” in European Conference on Computer Vision (ECCV), pp. 109–126, Springer, 2024.
  32. S. Pang, Y. Rao, Z. Lu, H. Wang, Y. Zhou, and M. Xue, “Pridm: Effective and universal private data recovery via diffusion models,” IEEE Transactions on Dependable and Secure Computing, 2025.
  33. Z. Boulkenafet, J. Komulainen, L. Li, X. Feng, and A. Hadid, “Oulu-npu: A mobile face presentation attack database with real-world variations,” in 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pp. 612–618, IEEE, 2017.
  34. M. Tan, Z. Zhou, and Z. Li, “The many-faced god: Attacking face verification system with embedding and image recovery,” in Proceedings of the 37th Annual Computer Security Applications Conference (ACSAC), pp. 17–30, 2021.
  35. R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, “High-resolution image synthesis with latent diffusion models,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10684–10695, 2022.
  36. T. Karras, S. Laine, M. Aittala, J. Hellsten, J. Lehtinen, and T. Aila, “Analyzing and improving the image quality of stylegan,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8110–8119, 2020.

  37. C. Meng, Y. He, Y. Song, J. Song, J. Wu, J.-Y. Zhu, and S. Ermon, “SDEdit: Guided image synthesis and editing with stochastic differential equations,” in International Conference on Learning Representations (ICLR), 2022.
  38. C.-H. Lee, Z. Liu, L. Wu, and P. Luo, “Maskgan: Towards diverse and interactive facial image manipulation,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
  39. G. B. Huang, M. Mattar, T. Berg, and E. Learned-Miller, “Labeled faces in the wild: A database for studying face recognition in unconstrained environments,” in Workshop on Faces in ’Real-Life’ Images: Detection, Alignment, and Recognition, 2008.
  40. R. B. D’Agostino, “Transformation to normality of the null distribution of g1,” Biometrika, vol. 57, no. 3, pp. 679–681, 1970.
  41. R. D’Agostino and E. S. Pearson, “Tests for departure from normality. empirical results for the distributions of b2 and √b1,” Biometrika, vol. 60, no. 3, pp. 613–622, 1973.
  42. R. B. D’Agostino, A. Belanger, and R. B. D’Agostino Jr, “A suggestion for using powerful and informative tests of normality,” The American Statistician, vol. 44, no. 4, pp. 316–321, 1990.
  43. K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, “Joint face detection and alignment using multitask cascaded convolutional networks,” IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499–1503, 2016.
  44. H. Wang, C.-C. Chang, C.-S. Lu, C. Leckie, and I. Echizen, “Greedypixel: Fine-grained black-box adversarial attack via greedy algorithm,” arXiv preprint arXiv:2501.14230, 2025.
  45. F. Croce and M. Hein, “Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks,” in International Conference on Machine Learning (ICML), pp. 2206–2216, 2020.

  46. Q. Cao, L. Shen, W. Xie, O. M. Parkhi, and A. Zisserman, “Vggface2: A dataset for recognising faces across pose and age,” in 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pp. 67–74, IEEE, 2018.
  47. Y. Guo, L. Zhang, Y. Hu, X. He, and J. Gao, “Ms-celeb-1m: A dataset and benchmark for large-scale face recognition,” in 14th European Conference on Computer Vision (ECCV), pp. 87–102, Springer, 2016.
  48. T. Karras, S. Laine, and T. Aila, “A style-based generator architecture for generative adversarial networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4401–4410, 2019.
  49. M. Andriushchenko, F. Croce, N. Flammarion, and M. Hein, “Square attack: a query-efficient black-box adversarial attack via random search,” in European Conference on Computer Vision (ECCV), pp. 484–501, 2020.
  50. Q. V. Vo, E. Abbasnejad, and D. Ranasinghe, “Brusleattack: Query-efficient score-based sparse adversarial attack,” in The Twelfth International Conference on Learning Representations (ICLR), 2024.
  51. Malwarebytes Labs, “Billions of logins for apple, google, facebook, telegram, and more found exposed online,” June 2025. Accessed: 2025-08-06.
  52. H. Wang, S. Wang, Z. Jin, Y. Wang, C. Chen, and M. Tistarelli, “Similarity-based gray-box adversarial attack against deep face recognition,” in 2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021), pp. 1–8, IEEE, 2021.
  53. H. Wang, S. Wang, C. Chen, M. Tistarelli, and Z. Jin, “A multi-task adversarial attack against face authentication,” ACM Transactions on Multimedia Computing, Communications and Applications, vol. 20, no. 11, pp. 1–24, 2024.