arxiv: 2605.19020 · v1 · pith:GDJZ6LC4new · submitted 2026-05-18 · 💻 cs.CV

A Systematic Failure Analysis of Vision Foundation Models for Open Set Iris Presentation Attack Detection

Pith reviewed 2026-05-20 11:04 UTC · model grok-4.3

classification 💻 cs.CV
keywords iris presentation attack detection · open-set evaluation · vision foundation models · cross-spectral transfer · LoRA adaptation · periocular biometrics · distribution shift

The pith

Vision foundation models transfer between similar iris datasets but fail on unseen attack instruments and cross-spectral shifts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests five general-purpose vision foundation models on iris presentation attack detection under realistic open-set conditions using periocular images. It separates three distinct sources of distribution shift: new attack instruments never seen in training, entirely new datasets from different sensors, and transfer from near-infrared to visible-light imagery. Results show reliable transfer only when sensing characteristics stay the same, while performance collapses on novel attacks and spectral changes. Even parameter-efficient adaptation with LoRA improves some cross-dataset cases yet often worsens the failures on attack-level and spectral shifts. The work concludes that strong closed-set or cross-dataset scores cannot be taken as evidence of robust open-set security for biometric systems.

Core claim

Foundation models can transfer across datasets with similar sensing characteristics, but fail to generalise reliably to unseen attack instruments and degrade sharply under cross-spectral evaluation from NIR to VIS imagery. Both frozen representations and LoRA adaptation were tested in a unified framework, with additional checks using segmented iris inputs, full fine-tuning, joint shifts, and reverse spectral transfer confirming the pattern of failures.

What carries the argument

Three open-set evaluation protocols that isolate unseen presentation attack instruments, unseen datasets, and NIR-to-VIS spectral transfer, applied to both frozen foundation model features and LoRA-adapted versions.
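
As a rough illustration of how three such protocols could be constructed, the sketch below builds train/test splits from a toy sample list. The `Sample` fields, the dataset names (`setA`, `setB`, `setV`), the PAI names, and the alternating bona fide split are all assumptions made for illustration; they are not the paper's actual partitioning.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Sample:
    sample_id: str
    dataset: str         # hypothetical dataset label
    spectrum: str        # "NIR" or "VIS"
    pai: Optional[str]   # None means bona fide, otherwise the attack instrument


Split = Tuple[List[Sample], List[Sample]]


def unseen_pai_split(samples: List[Sample], held_out_pai: str) -> Split:
    """Train on bona fide plus all other PAIs; test on bona fide plus the held-out PAI."""
    bona_fide = [s for s in samples if s.pai is None]
    # Alternate bona fide samples so the train and test sets stay disjoint.
    train = bona_fide[0::2] + [s for s in samples if s.pai not in (None, held_out_pai)]
    test = bona_fide[1::2] + [s for s in samples if s.pai == held_out_pai]
    return train, test


def unseen_dataset_split(samples: List[Sample], train_dataset: str) -> Split:
    """Train on one dataset; test on every other dataset."""
    train = [s for s in samples if s.dataset == train_dataset]
    test = [s for s in samples if s.dataset != train_dataset]
    return train, test


def cross_spectral_split(samples: List[Sample], train_spectrum: str = "NIR") -> Split:
    """Train on one spectrum; test on the other."""
    train = [s for s in samples if s.spectrum == train_spectrum]
    test = [s for s in samples if s.spectrum != train_spectrum]
    return train, test


# Toy, made-up inventory used only to exercise the functions.
SAMPLES = [
    Sample("a1", "setA", "NIR", None),
    Sample("a2", "setA", "NIR", None),
    Sample("a3", "setA", "NIR", "print"),
    Sample("a4", "setA", "NIR", "contact_lens"),
    Sample("b1", "setB", "NIR", None),
    Sample("b2", "setB", "NIR", "print"),
    Sample("v1", "setV", "VIS", None),
    Sample("v2", "setV", "VIS", "display"),
]

pai_train, pai_test = unseen_pai_split(SAMPLES, "print")
ds_train, ds_test = unseen_dataset_split(SAMPLES, "setA")
spec_train, spec_test = cross_spectral_split(SAMPLES, "NIR")
```

A stricter version would also hold the other factors fixed inside each protocol (for example, restricting the unseen-dataset split to a single spectrum) so that each split isolates one source of shift.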

Load-bearing premise

The three chosen open-set protocols capture the main distribution shifts that would appear in real iris PAD deployments.

What would settle it

A foundation model that maintains high detection accuracy on all three protocols simultaneously, including on a held-out attack instrument, a new sensor dataset, and NIR-to-VIS transfer.

Figures

Figures reproduced from arXiv: 2605.19020 by Dileep A D, Mahadeva Prasanna, Raghavendra Ramachandra, Rahul Anand, Siddharth Singh.

Figure 1: Conceptual overview of open-set iris PAD using periocular images and vision foundation models. Models are trained under known conditions…
Figure 2: Example periocular images from the NIR (top row) and VIS (bottom row) corpora used in this study. The NIR data span four PAI categories…
Abstract

Vision foundation models have demonstrated strong transferability across diverse visual recognition tasks and are increasingly considered for biometric applications. Their suitability for iris Presentation Attack Detection (PAD), particularly under realistic open-set operating conditions, remains insufficiently examined. This work presents a systematic failure analysis of general-purpose vision foundation models for open-set iris PAD using periocular imagery. Five representative foundation models are evaluated under three open-set protocols that explicitly separate different sources of distribution shift: unseen Presentation Attack Instruments (PAIs), unseen datasets captured with different sensors and cross-spectral transfer from near-infrared (NIR) to visible spectrum (VIS) imagery. Both frozen feature representations and parameter-efficient task adaptation using Low-Rank Adaptation (LoRA) are assessed within a unified experimental framework. The results indicate that foundation models can transfer across datasets with similar sensing characteristics, but fail to generalise reliably to unseen attack instruments and degrade sharply under cross-spectral evaluation. While LoRA improves performance in certain cross-dataset settings, it frequently amplifies failure under attack-level and spectral shifts. Additional validation experiments using segmented iris inputs, full backbone fine-tuning, joint cross-dataset and cross-PAI shifts, and reverse VIS to NIR transfer further confirm that these failures are not simply artefacts of periocular input, weak adaptation, or one-directional spectral evaluation. These findings show that strong closed-set or cross-dataset performance should not be treated as evidence of robust open-set security, and highlight the need for PAD representations that maintain sensitivity to presentation artefacts while remaining stable under realistic deployment variation.
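
For readers unfamiliar with Low-Rank Adaptation, the following is a minimal pure-Python sketch of the standard LoRA formulation applied to a single linear layer: the pretrained weight stays frozen and only a low-rank update, scaled by alpha / rank, is trainable. The class name `LoRALinear`, the toy dimensions, and the initialisation constants are assumptions for illustration; the abstract does not specify the paper's LoRA rank, scaling, or which layers are adapted.

```python
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        self.weight = weight  # frozen pretrained weight, shape (d_out, d_in)
        d_out, d_in = len(weight), len(weight[0])
        self.scale = alpha / rank
        # A gets small random values and B starts at zero, so the adapted
        # layer initially reproduces the frozen layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_parameters(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Toy 4x6 weight and input, chosen arbitrarily.
W = [[0.1 * (i + j) for j in range(6)] for i in range(4)]
x = [1.0, 0.0, -1.0, 2.0, 0.5, 1.5]
layer = LoRALinear(W, rank=2)
frozen_out = matvec(W, x)
adapted_out = layer.forward(x)
```

Before any training step the adapted output matches the frozen output, and the rank-2 update has 20 trainable values against 24 in the full toy weight. The saving grows quickly with realistic layer sizes, which is the sense in which the adaptation is parameter-efficient.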

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a systematic empirical evaluation of five vision foundation models for open-set iris presentation attack detection (PAD) on periocular imagery. It defines three distinct open-set protocols to isolate distribution shifts from unseen presentation attack instruments (PAIs), unseen datasets with different sensors, and cross-spectral transfer (NIR to VIS). Both frozen feature extractors and parameter-efficient LoRA adaptation are tested in a unified framework, supplemented by validation experiments on segmented iris inputs, full fine-tuning, joint shifts, and reverse spectral transfer. The central claim is that foundation models transfer reliably only under matched sensing conditions, fail to generalize to unseen PAIs, and degrade sharply under cross-spectral evaluation, with LoRA sometimes amplifying these failures.

Significance. If the reported patterns hold under the described controls, the work is significant for the biometric security community. It supplies concrete empirical evidence that strong closed-set or cross-dataset performance cannot be taken as a proxy for open-set robustness in iris PAD, a point with direct implications for deployment. The explicit separation of shift sources via multiple protocols and the additional validation experiments (segmented inputs, full fine-tuning, reverse transfer) constitute a strength; the paper thereby ships a reproducible experimental framework that future work can build upon or challenge.

major comments (2)
  1. [Abstract and experimental protocol descriptions] The abstract and experimental sections state that LoRA 'frequently amplifies failure' under attack-level and spectral shifts, yet the magnitude and consistency of this effect across the five models and three protocols are not quantified with per-model deltas or statistical comparisons in the provided summary; this weakens the load-bearing claim that adaptation can be counterproductive.
  2. [Section describing the three open-set protocols] The three open-set protocols are presented as representative of practical distribution shifts, but the manuscript does not include a quantitative analysis (e.g., feature-space distances or PAI diversity metrics) showing how well they cover the space of real-world sensor and attack variations; this is a potential gap for the generalization argument.
minor comments (2)
  1. [Abstract] The abstract lists 'five representative foundation models' without naming them; the introduction or methods section should explicitly identify the models (e.g., by architecture and pre-training dataset) for immediate clarity.
  2. [Results tables] Tables reporting performance metrics would benefit from inclusion of standard deviations or confidence intervals across multiple runs to support statements of 'sharp degradation'.
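
The reporting suggested in the second minor comment could be sketched as follows, using only the Python standard library. The error-rate values are made-up placeholders, not results from the paper, and the helper name `summarise_runs` is hypothetical.

```python
from statistics import mean, stdev


def summarise_runs(values):
    """Return the mean and sample standard deviation of a metric over repeated runs."""
    if len(values) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return mean(values), stdev(values)


# Placeholder error rates from five hypothetical runs with different seeds.
run_errors = [0.12, 0.15, 0.11, 0.14, 0.13]
avg, spread = summarise_runs(run_errors)
report = f"{avg:.3f} ± {spread:.3f}"
```

With only a handful of runs, a t-based interval or a bootstrap over trial-level scores would be a more careful choice than reading the standard deviation as a confidence interval.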

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation of minor revision. The comments highlight opportunities to strengthen the presentation of our empirical findings. We address each major comment below and indicate the revisions we will make.

Point-by-point responses
  1. Referee: [Abstract and experimental protocol descriptions] The abstract and experimental sections state that LoRA 'frequently amplifies failure' under attack-level and spectral shifts, yet the magnitude and consistency of this effect across the five models and three protocols are not quantified with per-model deltas or statistical comparisons in the provided summary; this weakens the load-bearing claim that adaptation can be counterproductive.

    Authors: We agree that explicit quantification would make the claim more robust. In the revised manuscript we will add a supplementary table that reports per-model performance deltas (LoRA minus frozen) for each of the three protocols, together with paired statistical tests (e.g., McNemar or Wilcoxon) on the underlying trial-level scores. This will allow readers to evaluate both the size and the consistency of the observed amplification effect across models and shift types.

    revision: yes

  2. Referee: [Section describing the three open-set protocols] The three open-set protocols are presented as representative of practical distribution shifts, but the manuscript does not include a quantitative analysis (e.g., feature-space distances or PAI diversity metrics) showing how well they cover the space of real-world sensor and attack variations; this is a potential gap for the generalization argument.

    Authors: We acknowledge that a quantitative coverage analysis would strengthen the generalization argument. Computing exhaustive feature-space distances or PAI-diversity metrics across the full space of real-world sensors and attacks is not feasible within the present study, as it would require a substantially larger collection of datasets and instruments than currently available in the community. Our protocols follow established practices for isolating specific, practically relevant shifts (unseen PAIs, sensor change, spectral change). In the revision we will expand the discussion section to explicitly state this limitation and to suggest how future work could quantify broader coverage using additional benchmarks.

    revision: partial
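
The paired comparison proposed in the first response (LoRA minus frozen deltas plus a McNemar test on trial-level outcomes) might look roughly like the sketch below. It uses the exact binomial form of McNemar's test on the discordant pairs. The per-trial correctness vectors are invented for illustration and the function names are hypothetical.

```python
from math import comb


def accuracy(correct):
    """Fraction of trials classified correctly (1 = correct, 0 = wrong)."""
    return sum(correct) / len(correct)


def discordant_counts(frozen_correct, lora_correct):
    """Count trials where exactly one of the two models is correct."""
    b = sum(1 for f, l in zip(frozen_correct, lora_correct) if f and not l)
    c = sum(1 for f, l in zip(frozen_correct, lora_correct) if l and not f)
    return b, c


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value from the discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Invented outcomes for the same eight test trials under both settings.
frozen = [1, 1, 1, 1, 0, 1, 1, 0]
lora = [1, 0, 0, 1, 0, 0, 1, 1]

delta = accuracy(lora) - accuracy(frozen)   # negative means LoRA is worse
b, c = discordant_counts(frozen, lora)
p_value = mcnemar_exact_p(b, c)
```

On this toy input the delta is negative but the exact p-value is large, which illustrates why the referee asks for both the size of the effect and a test of its consistency.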

Circularity Check

0 steps flagged

No significant circularity

This is a direct empirical benchmarking study that evaluates five vision foundation models under three explicitly defined open-set protocols for iris PAD using periocular imagery. Performance is measured via standard metrics on held-out test sets for unseen PAIs, cross-dataset shifts, and cross-spectral transfer, with additional controls for segmented inputs, LoRA adaptation, full fine-tuning, and reverse spectral evaluation. No mathematical derivations, equations, fitted parameters renamed as predictions, or self-citation chains are present to support the central claims; results follow directly from the reported experiments without reduction to inputs by construction.
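
The summary above does not name the "standard metrics". Assuming error rates in the style of ISO/IEC 30107-3, which are common in PAD evaluation, a minimal sketch of per-PAI APCER (attacks accepted as bona fide) and BPCER (bona fide presentations rejected as attacks) is shown below. The scores, labels, threshold, and the convention that a higher score means "more attack-like" are all illustrative assumptions.

```python
from collections import defaultdict


def pad_error_rates(scores, labels, threshold):
    """Compute per-PAI APCER and overall BPCER.

    labels: None for bona fide, otherwise the PAI name.
    A presentation is flagged as an attack when score >= threshold.
    """
    attack_total = defaultdict(int)
    attack_missed = defaultdict(int)
    bona_total = 0
    bona_rejected = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if label is None:
            bona_total += 1
            bona_rejected += int(flagged)
        else:
            attack_total[label] += 1
            attack_missed[label] += int(not flagged)
    apcer = {pai: attack_missed[pai] / attack_total[pai] for pai in attack_total}
    bpcer = bona_rejected / bona_total if bona_total else float("nan")
    return apcer, bpcer


# Invented scores: "print" stands in for a seen PAI, "lens" for an unseen one.
scores = [0.1, 0.2, 0.7, 0.9, 0.8, 0.3, 0.4]
labels = [None, None, None, "print", "print", "lens", "lens"]
apcer, bpcer = pad_error_rates(scores, labels, threshold=0.5)
worst_case_apcer = max(apcer.values())
```

Reporting APCER per instrument, along with the worst case across instruments, keeps a failure on a single unseen PAI from being averaged away by strong results on the seen ones.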

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper is an empirical evaluation study that applies existing models and standard open-set protocols without introducing new mathematical constructs, fitted parameters, or postulated entities.

axioms (1)
  • [standard math] Standard definitions and evaluation protocols for open-set recognition and cross-dataset transfer in computer vision
    The three protocols rely on conventional notions of unseen classes and domain shifts without additional justification in the abstract.

pith-pipeline@v0.9.0 · 5819 in / 1197 out tokens · 45562 ms · 2026-05-20T11:04:59.548209+00:00 · methodology

Preprint: accepted in IEEE Transactions on Biometrics, Behavior, and Identity Science (T-BIOM).