ChaRVoC: A Challenge-Response Voice Cancelable Authentication System

Dinh-Thuc Nguyen; Hoang C. Ta; Hong-Hanh Nguyen-Le; Nhien-An Le-Khac; Phuc-Khang Vo-Hoang

arxiv: 2605.02990 · v2 · submitted 2026-05-04 · 💻 cs.CR

Phuc-Khang Vo-Hoang, Hoang C. Ta, Nhien-An Le-Khac, Dinh-Thuc Nguyen, Hong-Hanh Nguyen-Le

Pith reviewed 2026-05-08 18:21 UTC · model grok-4.3

classification 💻 cs.CR

keywords cancelable biometrics · voice authentication · challenge-response · template security · revocability · unlinkability · HashGray-XOR · liveness detection

The pith

ChaRVoC uses a hash and graycode scheme to create revocable, non-invertible voice templates protected by secret keys and dynamic challenges.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces ChaRVoC, a voice authentication system that combines inherent voice features with user-memorized secret keys and system-generated challenges. This three-factor setup aims to block replay attacks through liveness checks, allow template revocation by key changes, and prevent recovery of original data. The central mechanism is the HashGray-XOR scheme, which combines a cryptographic hash with an unrecoverable graycode-based transformation to produce templates that the authors state are mathematically proven to be non-invertible. If correct, the approach would let voice biometrics function like changeable passwords while preserving recognition accuracy. The authors compare against other cancelable methods (WTA, IoM, RoE) on the VoxCeleb1, TIMIT, and VOiCES datasets and report that recognition performance holds alongside the security properties of cancelability and unlinkability.

Core claim

ChaRVoC integrates voice biometrics, user-memorized secret keys enabling template revocability, and dynamic system-generated challenges providing liveness detection. The novel HashGray-XOR scheme combines a cryptographic hash function with an unrecoverable graycode-based transformation to create secured templates that are mathematically proven to be non-invertible. The system achieves both cancelability and unlinkability properties while maintaining recognition performance comparable to existing methods on VoxCeleb1, TIMIT, and VOiCES datasets.

What carries the argument

The HashGray-XOR scheme, which XORs a cryptographic hash of the secret key with an unrecoverable graycode-based transformation of the voice features, t = H(k) ⊕ T(v) in the paper's notation, to generate templates the authors claim are non-invertible.

If this is right

Templates can be revoked by changing the user's secret key without needing new voice enrollment.
Dynamic challenges ensure each authentication attempt is unique, blocking recorded replay attacks.
Unlinkability prevents matching of templates from the same user across separate authentication systems.
Recognition accuracy remains competitive with prior cancelable methods such as WTA, IoM, and RoE on standard voice datasets.
The three-factor design reduces the impact of any single compromise in voice, key, or template storage.
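
The revocation consequence can be illustrated with a minimal sketch that follows the formula t = H(k) ⊕ T(v) quoted from the paper further down this page. Everything else here is an assumption made for illustration: SHA-256 stands in for H, `placeholder_transform` is a hypothetical stand-in for the paper's T (whose exact definition is not given on this page), and the four-value feature vector is a toy.

```python
import hashlib


def gray(n: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def placeholder_transform(features: list[float]) -> bytes:
    """Hypothetical stand-in for T(v): quantize each feature in [-1, 1]
    to 8 bits, Gray-code it, then pad or truncate to 32 bytes."""
    codes = [gray(min(255, max(0, int((x + 1.0) * 127.5)))) for x in features]
    return bytes(codes[:32]).ljust(32, b"\x00")


def make_template(key: bytes, features: list[float]) -> bytes:
    """t = H(k) XOR T(v), with SHA-256 assumed for H."""
    hk = hashlib.sha256(key).digest()
    tv = placeholder_transform(features)
    return bytes(a ^ b for a, b in zip(hk, tv))


features = [0.12, -0.80, 0.33, 0.95]  # toy embedding, not real voice data

# Same voice features, two different secret keys: the stored templates differ,
# so replacing the key revokes the old template without re-enrollment.
old = make_template(b"first-secret", features)
new = make_template(b"second-secret", features)
print(old != new)
```

In this toy setting, re-deriving the template with the original key and features reproduces `old` exactly; a real system would compare noisy features under a distance threshold, which this sketch does not model.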

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The method could extend to other biometric types where revocability is needed without retraining.
If the non-invertibility proof holds under standard cryptographic assumptions, it strengthens arguments for deploying cancelable biometrics at scale.
Integration with mobile devices could use device-stored challenges to add friction against remote attacks.

Load-bearing premise

The graycode-based transformation cannot be reversed to recover the original voice data or secret key even when the hash output and transformation rules are known.
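
To see why this premise carries the weight, here is a short worked derivation from the template formula quoted from the paper further down this page, t = H(k) ⊕ T(v). It assumes, purely for illustration, that T is a bijection such as the standard binary-reflected Gray code; the paper's T may be a different, lossy construction, in which case the last step does not apply.

```latex
\begin{align*}
t &= H(k) \oplus T(v) \\
t \oplus H(k) &= T(v) && \text{(XOR is its own inverse)} \\
v &= T^{-1}\bigl(t \oplus H(k)\bigr) && \text{(only if $T$ is a bijection)}
\end{align*}
```

Under that assumption, anyone holding both the template t and the hash output H(k) could recover v, so the premise can only hold if T discards information or the key stays secret.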

What would settle it

An algorithm or procedure that successfully recovers the original voice features or secret key from a stored ChaRVoC template would show the HashGray-XOR scheme is invertible.

Figures

Figures reproduced from arXiv: 2605.02990 by Dinh-Thuc Nguyen, Hoang C. Ta, Hong-Hanh Nguyen-Le, Nhien-An Le-Khac, Phuc-Khang Vo-Hoang.

**Figure 1.** An overview of our Challenge-Response Voice Cancelable Voice Authen

**Figure 2.** Distributions of mated samples and non-mated samples.

Original abstract

In this work, we present a Challenge-Response Voice Cancelable authentication system, called ChaRVoC, which provides protection against replay attacks, revocability issues, and template compromise. Our approach integrates three security factors: (1) inherent voice biometric characteristics, (2) user-memorized secret keys enabling template revocability, and (3) dynamic system-generated challenges providing liveness detection. Specifically, we introduce a novel HashGray-XOR scheme which combines a cryptographic hash function with an unrecoverable graycode-based transformation to create secured templates that are mathematically proven to be non-invertible. We compare our methods with existing cancelable biometric methods (WTA, IoM, RoE) on VoxCeleb1, TIMIT, and VOiCES datasets to show the recognition performance of our proposed system. We also show that our system achieves both cancelability and unlinkability properties.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

ChaRVoC adds a three-factor challenge-response layer to voice biometrics with HashGray-XOR templates, but the non-invertibility claim does not survive basic scrutiny of Gray codes.

The main thing to know is that this paper builds a cancelable voice authentication system called ChaRVoC that combines voice features, a user secret key, and dynamic challenges to address replay, revocability, and template theft. It introduces HashGray-XOR as the template transform and reports comparisons plus cancelability/unlinkability checks on VoxCeleb1, TIMIT, and VOiCES.

That architecture is a reasonable engineering step beyond single-factor voice biometrics. The comparisons to WTA, IoM, and RoE give a usable baseline, and the explicit checks for cancelability and unlinkability are the parts that would actually matter in deployment. The work is coherent on its own terms and engages the literature without obvious circular fitting or invented parameters.

The stress-test concern holds up: standard binary-reflected Gray codes are invertible via a short successive-XOR procedure, so they add no one-wayness on their own. Any non-invertibility must come from the cryptographic hash and the secret key alone. The abstract's phrasing of an 'unrecoverable graycode-based transformation' and a 'mathematical proof' therefore overstates what the Gray code layer contributes. Without a reduction to a standard hard problem or an explicit security game that survives removing the Gray step, the central security claim stays informal. Performance numbers and proof details are not visible in the abstract, which limits how far the results can be trusted yet.

This paper is for people working on practical biometric authentication who need revocable templates and replay resistance. A reader focused on system-level design would find the three-factor integration and dataset comparisons useful. It deserves peer review because the proposal is concrete, the experiments use public data, and the security model can be tightened with referee input rather than being fundamentally broken.

Referee Report

2 major / 2 minor

Summary. The paper proposes ChaRVoC, a challenge-response voice cancelable authentication system integrating voice biometrics, user-memorized secret keys for revocability, and dynamic challenges for liveness detection to address replay attacks and template compromise. It introduces a HashGray-XOR scheme combining a cryptographic hash with a graycode-based transformation, claiming this yields templates that are mathematically proven non-invertible. Performance is compared to WTA, IoM, and RoE on VoxCeleb1, TIMIT, and VOiCES datasets, with additional claims of cancelability and unlinkability.

Significance. If the non-invertibility claim holds under a formal security model and the empirical results demonstrate competitive accuracy with proper statistical controls, the work could meaningfully advance cancelable biometrics for voice by adding challenge-response liveness without sacrificing revocability or unlinkability.

major comments (2)

[Abstract] The central claim that the HashGray-XOR scheme produces 'secured templates that are mathematically proven to be non-invertible' lacks any derivation, reduction, or proof sketch. Binary-reflected Gray codes are bijective permutations invertible in linear time via successive XOR with right-shifted copies; therefore the graycode layer adds no one-wayness beyond the hash preimage resistance and secret-key XOR. An explicit security model (e.g., in the random-oracle or standard-model setting) showing that template inversion remains hard even after removal of the graycode step is required.
[Evaluation section] Performance comparison: The abstract states comparisons on VoxCeleb1, TIMIT, and VOiCES but supplies no concrete metrics (EER, accuracy, AUC), error bars, dataset splits, or exclusion criteria. Without these, it is impossible to verify whether the claimed recognition performance is statistically distinguishable from the baselines or robust to the three-factor integration.
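
The invertibility point in the first major comment can be checked directly. The following is a minimal sketch, assuming the transformation in question is the standard binary-reflected Gray code; the paper's graycode-based function may differ, and the function names are illustrative.

```python
def gray_encode(n: int) -> int:
    """Binary-reflected Gray code: g = n XOR (n >> 1)."""
    return n ^ (n >> 1)


def gray_decode(g: int) -> int:
    """Invert the Gray code by XOR-ing g with each of its right shifts."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


# Round-trip check over every 16-bit value: decoding always recovers the
# input, so the mapping is a bijection and hides nothing by itself.
assert all(gray_decode(gray_encode(n)) == n for n in range(1 << 16))
```

The decode loop runs once per bit of the input, which is the linear-time successive-XOR procedure the referee describes.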

minor comments (2)

[Abstract] Clarify the exact ordering of hash, graycode, and XOR operations and whether the graycode is applied to the feature vector or the hash output; this affects both invertibility analysis and implementation reproducibility.
Provide formal definitions of cancelability and unlinkability (e.g., via indistinguishability or distance-preserving properties) and explicit experimental protocols demonstrating these properties rather than informal assertions.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which helps strengthen the security analysis and clarity of our results. We address each major comment below, indicating the specific revisions planned for the next version of the manuscript.

Referee: [Abstract] Abstract: The central claim that the HashGray-XOR scheme produces 'secured templates that are mathematically proven to be non-invertible' lacks any derivation, reduction, or proof sketch. Binary-reflected Gray codes are bijective permutations invertible in linear time via successive XOR with right-shifted copies; therefore the graycode layer adds no one-wayness beyond the hash preimage resistance and secret-key XOR. An explicit security model (e.g., in the random-oracle or standard-model setting) showing that template inversion remains hard even after removal of the graycode step is required.

Authors: We agree that the Gray-code transformation is a bijective permutation and does not itself contribute one-wayness; the non-invertibility claim rests on the combination of the cryptographic hash preimage resistance and the secret-key XOR. The Gray-code step is used for diffusion to support unlinkability and cancelability. We will add a dedicated security analysis section that provides an explicit security model in the random-oracle setting. This will include a proof sketch showing that template inversion remains hard (under standard hash assumptions) even if the Gray-code layer is removed, together with a formal definition of the adversary and the advantage bound.

Revision: yes

Referee: [Evaluation section] Evaluation (performance comparison): The abstract states comparisons on VoxCeleb1, TIMIT, and VOiCES but supplies no concrete metrics (EER, accuracy, AUC), error bars, dataset splits, or exclusion criteria. Without these, it is impossible to verify whether the claimed recognition performance is statistically distinguishable from the baselines or robust to the three-factor integration.

Authors: The full evaluation section already reports EER, accuracy, and AUC values for ChaRVoC versus WTA, IoM, and RoE on the three datasets, along with dataset splits. However, we acknowledge that the abstract omits these numbers and that the evaluation section would benefit from additional statistical controls. We will revise the abstract to include the key EER figures, and we will expand the evaluation section to report error bars from multiple random seeds, explicit train/test splits, exclusion criteria for utterances, and results of statistical significance tests (e.g., paired t-tests) against the baselines.

Revision: yes
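
For readers unfamiliar with the EER metric named in this exchange, here is a minimal sketch of how it can be approximated from mated and non-mated comparison scores. It assumes similarity scores (higher means a closer match); the paper may use distances instead, and the scores below are invented for illustration, not results from the paper.

```python
def equal_error_rate(mated: list[float], non_mated: list[float]) -> float:
    """Approximate EER for similarity scores (higher = more similar).

    Sweeps every observed score as a threshold and returns the mean of
    FAR and FRR at the threshold where the two rates are closest.
    """
    best_gap, eer = float("inf"), 1.0
    for thr in sorted(set(mated) | set(non_mated)):
        frr = sum(s < thr for s in mated) / len(mated)           # genuine rejected
        far = sum(s >= thr for s in non_mated) / len(non_mated)  # impostor accepted
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Toy scores, invented for illustration only.
mated_scores = [0.9, 0.8, 0.7, 0.4]
non_mated_scores = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(mated_scores, non_mated_scores))  # prints 0.25
```

Repeating this computation over several random seeds or splits would give the spread needed for the error bars the authors promise.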

Circularity Check

0 steps flagged

No significant circularity; claims rest on novel construction and standard assumptions.

The paper introduces the HashGray-XOR scheme as a new combination of cryptographic hash and graycode transformation, asserting non-invertibility via mathematical proof. No load-bearing steps reduce by construction to fitted parameters, self-citations, or prior inputs; the abstract and description present the scheme as self-contained, relying on the claimed properties of the construction itself plus standard crypto primitives without evident renaming, smuggling, or definitional loops. Security properties are asserted as newly achieved rather than derived from the inputs by equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claims rest on standard cryptographic assumptions and the unrecoverability of the introduced transformation, with no free parameters or new physical entities specified.

axioms (2)

standard math: Cryptographic hash functions are one-way and secure. Invoked as the basis for the HashGray-XOR scheme to secure templates.
domain assumption: Graycode-based transformation is unrecoverable. Stated as enabling non-invertible templates in the abstract.

invented entities (1)

HashGray-XOR scheme (no independent evidence)
purpose: To generate secured non-invertible voice templates combining hash and graycode.
Newly proposed method whose properties are claimed but not independently evidenced outside the paper.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith.Cost (Jcost) · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: t = f(k,v) = H(k) ⊕ T(v) ... H is the cryptographic hash function, T is the unrecoverable graycode-based function

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

27 extracted references · 4 canonical work pages

[1] Bhargav-Spantzel, A., Squicciarini, A., Bertino, E.: Privacy preserving multi-factor authentication with biometrics. In: Proceedings of the second ACM workshop on Digital identity management. pp. 63–72 (2006)

[2] Billeb, S., Rathgeb, C., Reininger, H., Kasper, K., Busch, C.: Biometric template protection for speaker recognition based on universal background models. IET Biometrics 4(2), 116–126 (2015)

[3] Ceaparu, M., Toma, S.A., Segarceanu, S., Suciu, G., Gavat, I.: Multifactor voice-based authentication system. J. Eng. Sci. Technol. Rev. pp. 131–136 (2020)

[4] Chee, K.Y., Jin, Z., Cai, D., Li, M., Yap, W.S., Lai, Y.L., Goi, B.M.: Cancellable speech template via random binary orthogonal matrices projection hashing. Pattern Recognition 76, 273–287 (2018)

[5] rahman Chowdhury, F.R., Wang, Q., Moreno, I.L., Wan, L.: Attention-based models for text-dependent speaker verification. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). pp. 5359–5363. IEEE (2018)

[6] Desplanques, B., Thienpondt, J., Demuynck, K.: ECAPA-TDNN: emphasized channel attention, propagation and aggregation in TDNN based speaker verification. In: Meng, H., Xu, B., Zheng, T.F. (eds.) Interspeech 2020. pp. 3830–3834. ISCA (2020)

[7] El-Moneim, S.A., Nassar, M.A., Dessouky, M.I., Ismail, N.A., El-Fishawy, A.S., Abd El-Samie, F.E.: Cancellable template generation for speaker recognition based on spectrogram patch selection and deep convolutional neural networks. International Journal of Speech Technology 25(3), 759–770 (2022). https://doi.org/10.1007/s10772-020-09791-y

[8] Garofolo, J.S., et al.: TIMIT acoustic-phonetic continuous speech corpus LDC93S1. Web Download (1993), https://catalog.ldc.upenn.edu/LDC93S1

[9] Gomez-Barrero, M., Galbally, J., Rathgeb, C., Busch, C.: General framework to evaluate unlinkability in biometric template protection systems. IEEE Transactions on Information Forensics and Security 13(6), 1406–1420 (2017)

[10] Guamán, S., Calvopiña, A., Orta, P., Tapia, F., Yoo, S.G.: Device control system for a smart home using voice commands: A practical case. In: Proceedings of the 2018 10th International Conference on Information Management and Engineering. pp. 86–89 (2018)

[11] Heigold, G., Moreno, I., Bengio, S., Shazeer, N.: End-to-end text-dependent speaker verification. In: 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP). pp. 5115–5119. IEEE (2016)

[12] Jin, Z., Hwang, J.Y., Lai, Y.L., Kim, S., Teoh, A.B.J.: Ranking-based locality sensitive hashing-enabled cancelable biometrics: Index-of-max hashing. IEEE Transactions on Information Forensics and Security 13(2), 393–407 (2017)

[13] Johnson, R., Boult, T.E.: With vaulted voice verification my voice is my key. In: 2013 IEEE International Conference on Technologies for Homeland Security (HST). pp. 453–459. IEEE (2013)

[14] Jung, J.w., Kim, S.b., Shim, H.j., Kim, J.h., Yu, H.J.: Improved RawNet with feature map scaling for text-independent speaker verification using raw waveforms. Proc. Interspeech pp. 3583–3587 (2020)

[15] Jung, J.w., Kim, Y.J., Heo, H.S., Lee, B.J., Kwon, Y., Chung, J.S.: Pushing the limits of raw waveform speaker recognition. Proc. Interspeech (2022)

[16] Mittal, G., Jakobsson, A., Marshall, K.O., Hegde, C., Memon, N.: Pitch: AI-assisted tagging of deepfake audio calls using challenge-response. arXiv preprint arXiv:2402.18085 (2024)

[17] Nagrani, A., Chung, J.S., Zisserman, A.: VoxCeleb: A large-scale speaker identification dataset. arXiv:1706.08612 (2017), available at http://www.robots.ox.ac.uk/~vgg/data/voxceleb/

[18] Nautsch, A., Isadskiy, S., Kolberg, J., Gomez-Barrero, M., Busch, C.: Homomorphic encryption for speaker recognition: Protection of biometric templates and vendor model parameters. arXiv preprint arXiv:1803.03559 (2018)

[19] Nguyen-Le, H.H., Tran, L., Nguyen, D.S.A., Le-Khac, N.A., Nguyen, T.: Privacy-preserving speaker verification system using ranking-of-element hashing. Pattern Recognition 159, 111107 (2025)

[20] Nguyen-Le, H.H., Tran, L., Nguyen, D.S.A., Le-Khac, N.A., Nguyen, T.: Privacy-preserving speaker verification system using ranking-of-element hashing. Pattern Recognition 159, 111107 (2025). https://doi.org/10.1016/j.patcog.2024.111107

[21] Paulini, M., Rathgeb, C., Nautsch, A., Reichau, H., Reininger, H., Busch, C.: Multi-bit allocation: Preparing voice biometrics for template protection. In: Odyssey. pp. 291–296 (2016)

[22] Pradhan, S., Sun, W., Baig, G., Qiu, L.: Combating replay attacks against voice assistants. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 3(3), 1–26 (2019)

[23] Richey, C., Barrios, M.A., Armstrong, Z., Bartels, C., Franco, H., Graciarena, M., Lawson, A., Nandwana, M.K., Stauffer, A., van Hout, J., Gamble, P., Hetherly, J., Stephenson, C., Ni, K.: Voices obscured in complex environmental settings (VOiCES) corpus (2018)

[24] Yadav, S.P., Gupta, A., Nascimento, C.D.S., de Albuquerque, V.H.C., Naruka, M.S., Chauhan, S.S.: Voice-based virtual-controlled intelligent personal assistants. In: 2023 International Conference on Computational Intelligence, Communication Technology and Networking (CICTN). pp. 563–568. IEEE (2023)

[25] Yagnik, J., Strelow, D., Ross, D.A., Lin, R.s.: The power of comparative reasoning. In: 2011 International Conference on Computer Vision. pp. 2431–2438. IEEE (2011)

[26] Yasur, L., Frankovits, G., Grabovski, F.M., Mirsky, Y.: Deepfake captcha: A method for preventing fake calls. In: Proceedings of the 2023 ACM Asia Conference on Computer and Communications Security. pp. 608–622 (2023)

[27] Zhang, L., Tan, S., Yang, J.: Hearing your voice is not enough: An articulatory gesture based liveness detection for voice authentication. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. pp. 57–71 (2017)
