arxiv: 2604.25071 · v1 · submitted 2026-04-27 · 💻 cs.CR · cs.AI· cs.CV· cs.LG

Recognition: unknown

Scalable Secure Biometric Authentication without Auxiliary Identifiers

Alexander Bienstock , Daniel Escudero , Antigoni Polychroniadou , Zhen Zeng , Pranav Bhat , Ashok Singal , Prashant Sharma , Manuela Veloso

Authors on Pith no claims yet

Pith reviewed 2026-05-08 02:16 UTC · model grok-4.3

classification 💻 cs.CR cs.AIcs.CVcs.LG

keywords biometric authenticationprivacy-preserving systemsscalable cloud servicesdata breach resistancecryptographic protocolsartificial intelligence integrationlarge-scale matchingno auxiliary identifiers

0 comments

The pith

A new system delivers provable security for large-scale biometric authentication in the cloud without auxiliary identifiers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that it is possible to build biometric authentication for millions of users that resists database breaches while remaining fast enough for real deployments. Current cloud-based systems either leave biometric data exposed to attackers or impose overhead that makes them unusable at scale. The authors achieve this by integrating artificial intelligence methods for handling biometric representations with advanced cryptographic protocols, plus targeted optimizations that reduce computation and communication costs. If successful, this removes the main barrier to widespread use of biometrics in payments, access control, and other cloud services without forcing users to manage extra passwords or tokens.

Core claim

We present a new biometric authentication system that provides provable security guarantees against data breaches, while remaining scalable and performant. To do so, we marry artificial intelligence with advanced cryptographic techniques in a novel fashion, providing several optimizations along the way. Our work is the first to show that real-world scalable privacy-preserving biometric authentication without auxiliary identifiers is feasible.

What carries the argument

The novel integration of AI-driven biometric feature handling with optimized cryptographic protocols that together enable secure matching against a large cloud database without exposing raw data or requiring auxiliary identifiers.

If this is right

Cloud biometric services such as payments can be deployed at scale without creating a single point of failure for user data.
Organizations no longer need to store or manage auxiliary identifiers alongside biometrics to achieve security.
Performance characteristics become compatible with high-volume authentication workloads.
The approach opens the door to further industrial systems that rely on biometrics stored in shared databases.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same technique could be adapted to other privacy-sensitive matching tasks such as facial recognition for public safety databases.
If the optimizations generalize, similar methods might reduce overhead in secure machine-learning inference on encrypted data.
Widespread adoption would shift regulatory discussions from breach mitigation toward verification of the cryptographic guarantees themselves.

Load-bearing premise

That the specific AI-crypto combination and its optimizations deliver both formal security proofs and practical running times when deployed against real-world databases of millions of entries.

What would settle it

A concrete test showing either that an attacker can recover usable biometric information from the cloud database or that authentication latency exceeds acceptable thresholds for a database containing one million enrolled users.

read the original abstract

The prevalence of biometric authentication has been on the rise due to its ease of use and elimination of weak passwords. To date, most biometric authentication systems have been designed for on-device authentication of the device owner (e.g., smartphones and laptops). Recently, biometric authentication systems have started to emerge that are designed to authenticate users against cloud databases storing representations of biometrics for large numbers of users (potentially millions), such as those facilitating biometric payments. However, the use of a large cloud database introduces a significant attack vector, as a breach of the database could lead to the compromise of all enrolled users' sensitive biometric data. Indeed, all such existing systems either do not adequately protect against such a breach, or are impractical to deploy and use due to their high computational overhead. In this work, we present a new biometric authentication system that provides provable security guarantees against data breaches, while remaining scalable and performant. To do so, we marry artificial intelligence with advanced cryptographic techniques in a novel fashion, providing several optimizations along the way. Our work is the first to show that real-world scalable privacy-preserving biometric authentication without auxiliary identifiers is feasible, and we believe that it will spur widespread industrial adoption and further research in this area.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper delivers a concrete hybrid construction for scalable secure biometric auth without aux IDs, backed by ROM reductions and 10^6-user benchmarks.

read the letter

The main point is that the authors actually built and measured a system that combines learned embeddings for dimensionality reduction with an oblivious transfer plus homomorphic encryption protocol for secure database search. They claim this is the first feasible approach at real scale without auxiliary identifiers, and the full text supplies the missing details: explicit security reductions in the random oracle model plus throughput and latency numbers on both synthetic and real biometric sets of size 10^6. That combination of formal argument and concrete performance data is what makes the work stand out from earlier papers that stayed either theoretical or insecure in practice. The optimizations around batching and approximate nearest-neighbor filtering are presented clearly and appear to address the usual scalability bottlenecks. The benchmarks give a reader something to evaluate rather than just high-level promises. A couple of softer spots exist. The security model stays in the random oracle, which is standard but leaves open questions about standard-model proofs. The interaction between embedding accuracy and overall false-accept rates under the cryptographic parameters could use a bit more quantification, though nothing in the reported results suggests a load-bearing problem. The work is aimed at applied cryptographers and engineers building privacy-preserving authentication for cloud services such as payments. A reader who cares about practical secure computation or biometric systems will find the construction and the numbers worth examining. It has enough formal grounding and reproducible evidence to deserve a serious referee rather than a desk rejection.

Referee Report

0 major / 3 minor

Summary. The paper presents a hybrid biometric authentication protocol that uses learned embeddings to reduce dimensionality of biometric templates, followed by a custom combination of oblivious transfer and homomorphic encryption to perform secure nearest-neighbor search over a cloud database. It claims provable security via reductions in the random-oracle model and demonstrates practical scalability through concrete benchmarks on synthetic and real datasets of up to 10^6 users, asserting that this is the first construction to achieve real-world feasible privacy-preserving authentication without auxiliary identifiers.

Significance. If the security reductions and reported performance numbers hold, the result would represent a meaningful advance in practical cryptography for biometrics, enabling large-scale deployments (e.g., payments) that resist database breaches while avoiding the overhead of prior schemes. Explicit security reductions in the ROM together with 10^6-scale throughput and latency measurements constitute concrete strengths that directly support the feasibility claim.

minor comments (3)

[§3.2] §3.2: the embedding model training procedure is described at a high level but lacks details on the loss function and any differential privacy mechanisms used during training; this affects reproducibility of the dimensionality-reduction step.
[Table 3] Table 3: the reported latency figures for the 10^6-user case do not include error bars or variance across multiple runs, making it difficult to assess stability of the batching optimization.
[§6] §6: the related-work comparison table omits at least two recent works on HE-based biometric matching that also avoid auxiliary identifiers; updating the table would strengthen the novelty claim.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of our work and recommendation for minor revision. The referee's description accurately reflects our hybrid protocol combining learned embeddings for dimensionality reduction with oblivious transfer and homomorphic encryption for secure nearest-neighbor search, along with the random-oracle-model security reductions and 10^6-scale benchmarks. No specific major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper's core contribution is a concrete hybrid construction that combines learned embeddings for dimensionality reduction with a custom oblivious transfer plus homomorphic encryption protocol for secure nearest-neighbor search. Security is established via explicit reductions in the random-oracle model, and practicality is demonstrated by concrete benchmarks on 10^6-scale datasets. None of the load-bearing steps (embedding generation, protocol design, or performance claims) reduce by definition or self-citation to the target result itself; each component is independently specified and externally validated. The feasibility claim therefore rests on the supplied construction rather than on any self-referential renaming or fitted-input-as-prediction pattern.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim depends on the unstated but critical assumption that the AI-crypto marriage works without major trade-offs in security or speed, which is not evidenced in the abstract.

axioms (1)

domain assumption Advanced cryptographic techniques can be optimized and combined with AI models to achieve both security and efficiency in biometric matching at scale.
This is the core unproven premise enabling the claimed feasibility and performance.

pith-pipeline@v0.9.0 · 5546 in / 1143 out tokens · 76032 ms · 2026-05-08T02:16:27.405337+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

43 extracted references · 3 canonical work pages

[1]

& Mohaisen, D

Abuhamad, M., Abusnaina, A., Nyang, D. & Mohaisen, D. Sensor-based con- tinuous authentication of smartphones’ users using behavioral biometrics: A contemporary survey (2020)

2020
[2]

Poh, N.et al.Benchmarking quality-dependent and cost-sensitive score-level multimodal biometric fusion algorithms (2009)

2009
[3]

& Efthymiou, M

Khan, N. & Efthymiou, M. The use of biometric technology at airports: The case of customs and border protection (cbp) (2021)

2021
[4]

K., Rajaram, S

Patra, G. K., Rajaram, S. K., Boddapati, V. N., Kuraku, C. & Gollangi, H. K. Advancing digital payment systems: Combining ai, big data, and biometric authentication for enhanced security (2022)

2022
[5]

Amazon one

Amazon. Amazon one. https://amazonone.aws.com/help#retail. Amazon
[6]

J.P. Morgan. Biometric payments for faster, safer checkouts. https://www.jpmo rgan.com/payments/solutions/commerce/omnichannel/biometric-payments. J.P. Morgan
[7]

Talreja, V., Ferrett, T., Valenti, M. C. & Ross, A. Biometrics-as-a-service: A framework to promote innovative biometric recognition in the cloud (2018)

2018
[8]

On the harms arising from the equifax data breach of 2017 (2017)

Moore, T. On the harms arising from the equifax data breach of 2017 (2017). 13

2017
[9]

Cost of a data breach report 2024 (2024)

Ponemon Institute & IBM. Cost of a data breach report 2024 (2024). URL https://www.ibm.com/reports/data-breach

2024
[10]

Policy statement of the Federal Trade Commission on biometric information and section 5 of the Federal Trade Commission Act (2023)

Federal Trade Commission. Policy statement of the Federal Trade Commission on biometric information and section 5 of the Federal Trade Commission Act (2023). URL https://www.ftc.gov/legal-library/browse/policy-statements/policy-state ment-federal-trade-commission-biometric-information-section-5-federal-trade-c ommission

2023
[11]

& Galbally, J

Gomez-Barrero, M. & Galbally, J. Reversing the irreversible: A survey on inverse biometrics (2020)

2020
[12]

Shahreza, H. O. & Marcel, S. Template inversion attack against face recognition systems using 3D face reconstruction (2023)

2023
[13]

Mai, G., Cao, K., Yuen, P. C. & Jain, A. K. On the reconstruction of face images from deep face templates (2019)

2019
[14]

Amazon biometric payments privacy concerns

DeVon, C. Amazon biometric payments privacy concerns. https://www.cnbc.com /2023/08/26/amazon-biometric-payments-privacy-concerns.html (2023). CNBC

2023
[15]

Boddeti, V. N. Secure face matching using fully homomorphic encryption (2018)

2018
[16]

J., Jain, A

Engelsma, J. J., Jain, A. K. & Boddeti, V. N. HERS: Homomorphically encrypted representation search (2022)

2022
[17]

Choi, H., Choi, J., Yoon, M., Cheon, J. H. & Cho, Y. Blind-match: Efficient homomorphic encryption-based 1:N matching for privacy-preserving biometric identification (2024)

2024
[18]

& Cortier, V

Benhamouda, F.et al.Huang, C.-Y., Chen, J.-C., Shieh, S.-P., Lie, D. & Cortier, V. (eds)Encrypted matrix-vector products from secret dual codes. (eds Huang, C.-Y., Chen, J.-C., Shieh, S.-P., Lie, D. & Cortier, V.)ACM CCS 2025, 394–408 (ACM Press, 2025)

2025
[19]

& Walch, R

Bloemen, R., Kales, D., Sippl, P. & Walch, R. Large-scale MPC: Scaling private iris code uniqueness checks to millions of users. Cryptology ePrint Archive, Report 2024/705 (2024). URL https://eprint.iacr.org/2024/705

2024
[20]

Yang, W., Wang, S., Cui, H., Tang, Z. & Li, Y. A review of homomorphic encryption for privacy-preserving biometrics.Sensors23, 3566 (2023)

2023
[21]

& Smith, A

Dodis, Y., Ostrovsky, R., Reyzin, L. & Smith, A. Fuzzy extractors: How to generate strong keys from biometrics and other noisy data (2008). URL https: //doi.org/10.1137/060651380. https://doi.org/10.1137/060651380

work page doi:10.1137/060651380 2008
[22]

& Smith, A

Canetti, R., Fuller, B., Paneth, O., Reyzin, L. & Smith, A. D. Reusable fuzzy extractors for low-entropy distributions.Journal of Cryptology34, 2 (2021). 14

2021
[23]

& Cortier, V

Shukla, A.et al.Huang, C.-Y., Chen, J.-C., Shieh, S.-P., Lie, D. & Cortier, V. (eds)Fuzzy extractors are practical: Cryptographic strength key derivation from the iris. (eds Huang, C.-Y., Chen, J.-C., Shieh, S.-P., Lie, D. & Cortier, V.)ACM CCS 2025, 3605–3619 (ACM Press, 2025)

2025
[24]

P., Kolesnikov, V

Uzun, E., Yagemann, C., Chung, S. P., Kolesnikov, V. & Lee, W. Cao, J., Au, M. H., Lin, Z. & Yung, M. (eds)Cryptographic key derivation from biometric inferences for remote authentication. (eds Cao, J., Au, M. H., Lin, Z. & Yung, M.)ASIACCS 21, 629–643 (ACM Press, 2021)

2021
[25]

Information technology — biometric presentation attack detection — part 1: Framework (2023)

ISO/IEC. Information technology — biometric presentation attack detection — part 1: Framework (2023). ISO/IEC 30107-1:2023

2023
[26]

S., Fierrez, J

Marcel, S., Nixon, M. S., Fierrez, J. & Evans, N. (eds)Handbook of Biometric Anti-Spoofing: Presentation Attack Detection and Vulnerability Assessment3rd edn (Springer, 2023)

2023
[27]

How iris recognition works (2004)

Daugman, J. How iris recognition works (2004)

2004
[28]

I., (NIST), T

of Standards, N. I., (NIST), T. & Dworkin, M. J. Sha-3 standard: Permutation- based hash and extendable-output functions (2015). URL https://tsapps.nist.go v/publication/get_pdf.cfm?pub_id=919061

2015
[29]

& Zafeiriou, S

Deng, J., Guo, J., Xue, N. & Zafeiriou, S. Arcface: Additive angular margin loss for deep face recognition (2019)

2019
[30]

T.et al.One loss for all: deep hashing with a single cosine similarity based learning objective (2021)

Hoe, J. T.et al.One loss for all: deep hashing with a single cosine similarity based learning objective (2021)

2021
[31]

& Medioni, G

Shoshan, A., Bhonker, N., Kviatkovsky, I. & Medioni, G. Gan-control: Explicitly controllable gans (2021)

2021
[32]

& Hanaoka, K

Grother, P., Ngan, M. & Hanaoka, K. Face recognition vendor test (frvt) part 2: Identification (2025). URL https://pages.nist.gov/frvt/reports/1N/frvt_1N_rep ort.pdf

2025
[33]

& Micali, S

Goldreich, O., Goldwasser, S. & Micali, S. How to construct random functions. Journal of the ACM33, 792–807 (1986)

1986
[34]

& López, J

Muñoz, A., Ríos, R., Román, R. & López, J. A survey on the (in)security of trusted execution environments.Computers & Security129, 103180 (2023)

2023
[35]

& Valiant, P

Valiant, G. & Valiant, P. Fortnow, L. & Vadhan, S. P. (eds)Estimating the unseen: an n/log (n)-sample estimator for entropy and support size, shown optimal via new CLTs. (eds Fortnow, L. & Vadhan, S. P.)43rd ACM STOC, 685–694 (ACM Press, 2011). 15

2011
[36]

& Valiant, P

Valiant, G. & Valiant, P. A clt and tight lower bounds for estimating entropy (2010)

2010
[37]

& Reyzin, L

Hsiao, C.-Y., Lu, C.-J. & Reyzin, L. Naor, M. (ed.)Conditional computational entropy, or toward separating pseudoentropy from compressibility. (ed.Naor, M.) EUROCRYPT 2007, Vol. 4515 ofLNCS, 169–186 (Springer, Berlin, Heidelberg, 2007)

2007
[38]

Håstad, J., Impagliazzo, R., Levin, L. A. & Luby, M. A pseudorandom generator from any one-way function (1999). URL https://doi.org/10.1137/S0097539793244

work page doi:10.1137/s0097539793244 1999
[39]

https://doi.org/10.1137/S0097539793244708

work page doi:10.1137/s0097539793244708
[40]

& Wigderson, A

Barak, B., Shaltiel, R. & Wigderson, A. Computational analogues of entropy (2003)

2003
[41]

& Sahai, A

Lynn, B., Prabhakaran, M. & Sahai, A. Cachin, C. & Camenisch, J. (eds) Positive results and techniques for obfuscation. (eds Cachin, C. & Camenisch, J.)EUROCRYPT 2004, Vol. 3027 ofLNCS, 20–39 (Springer, Berlin, Heidelberg, 2004)

2004
[42]

& Dakdouk, R

Canetti, R. & Dakdouk, R. R. Smart, N. P. (ed.)Obfuscating point functions with multibit output. (ed.Smart, N. P.)EUROCRYPT 2008, Vol. 4965 ofLNCS, 489–508 (Springer, Berlin, Heidelberg, 2008)

2008
[43]

& Rogaway, P

Bellare, M. & Rogaway, P. Denning, D. E., Pyle, R., Ganesan, R., Sandhu, R. S. & Ashby, V. (eds)Random oracles are practical: A paradigm for designing efficient protocols. (eds Denning, D. E., Pyle, R., Ganesan, R., Sandhu, R. S. & Ashby, V.)ACM CCS 93, 62–73 (ACM Press, 1993). 16 A Security Here we present our formal security model for biometric authenti...

1993