Privacy-Preserving Speaker Recognition with Cohort Score Normalisation

Amos Treiber; Andreas Nautsch; Jose Patino; Massimiliano Todisco; Nicholas Evans; Petr Mizera; Themos Stafylakis; Thomas Schneider

arxiv: 1907.03454 · v1 · pith:X24IJWUZnew · submitted 2019-07-08 · 📡 eess.AS · cs.CR

Andreas Nautsch, Jose Patino, Amos Treiber, Themos Stafylakis, Petr Mizera, Massimiliano Todisco, Thomas Schneider, Nicholas Evans

Pith reviewed 2026-05-25 01:09 UTC · model grok-4.3

classification 📡 eess.AS cs.CR

keywords privacy-preserving speaker recognition · cohort score normalisation · secure multi-party computation · binary voice representations · PLDA · voice biometrics · GDPR compliance

The pith

A cohort pruning scheme with secure multi-party computation enables the first computationally feasible privacy-preserving score normalisation for speaker recognition.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper targets the barrier that keeps privacy-preserving speaker recognition from using cohort score normalisation. Encrypted-domain comparisons against thousands of cohort members are too slow, so a system must either skip normalisation or risk privacy violations. The proposed solution applies secure multi-party computation to binary voice representations and prunes the cohort to a small set on which PLDA comparisons are then run. The biometric data stays protected throughout while the system still produces normalised scores. A reader would care because the approach is aimed at data-protection requirements such as GDPR without forcing a choice between privacy and usable accuracy.

Core claim

The central claim is that a cohort pruning scheme based on secure multi-party computation is the first computationally feasible method for privacy-preserving cohort score normalisation in speaker recognition. It operates on binarised voice representations so that PLDA comparisons can be performed under encryption; the pruning step reduces the number of comparisons from thousands to a practical size while the original data remains hidden.
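To make the pruning idea concrete, the following is a minimal plaintext sketch, not the authors' protocol. It assumes that pruning ranks cohort members by Hamming similarity between binary keys and that a symmetric, adaptive normalisation is then computed only from the expensive comparison scores of the surviving members. The function names, the dictionary layout, and `score_fn` (a stand-in for a PLDA comparison) are hypothetical, and nothing here is encrypted.

```python
from statistics import mean, pstdev


def hamming_similarity(a, b):
    """Fraction of positions where two equal-length bit lists agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def prune_cohort(binary_key, cohort_keys, top_n):
    """Indices of the top_n cohort keys most similar to binary_key."""
    ranked = sorted(
        range(len(cohort_keys)),
        key=lambda i: hamming_similarity(binary_key, cohort_keys[i]),
        reverse=True,
    )
    return ranked[:top_n]


def as_norm(raw_score, enrol_cohort_scores, probe_cohort_scores):
    """Average of the z-scores of raw_score against each pruned cohort."""
    def z(scores):
        sigma = pstdev(scores) or 1.0  # guard against a zero spread
        return (raw_score - mean(scores)) / sigma
    return 0.5 * (z(enrol_cohort_scores) + z(probe_cohort_scores))


def normalised_score(enrol, probe, cohort, score_fn, top_n):
    """Cheap pruning on binary keys, expensive scoring on survivors only.

    enrol, probe and each cohort entry are dicts with a "key" (bit list)
    and an "emb" (whatever score_fn consumes); this layout is an assumption.
    """
    raw = score_fn(enrol["emb"], probe["emb"])
    keys = [member["key"] for member in cohort]
    enrol_idx = prune_cohort(enrol["key"], keys, top_n)
    probe_idx = prune_cohort(probe["key"], keys, top_n)
    enrol_scores = [score_fn(enrol["emb"], cohort[i]["emb"]) for i in enrol_idx]
    probe_scores = [score_fn(probe["emb"], cohort[i]["emb"]) for i in probe_idx]
    return as_norm(raw, enrol_scores, probe_scores)
```

In a privacy-preserving deployment both the ranking and the scoring would run inside the secure computation; the sketch only shows why pruning cuts the number of expensive comparisons from the full cohort size to `top_n` per side.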

What carries the argument

A cohort pruning scheme based on secure multi-party computation, applied to binary voice representations, which selects a small relevant cohort for PLDA scoring without exposing private data.

If this is right

Cohort score normalisation can now be performed entirely in the encrypted domain for speaker recognition.
Rank-n biometric comparisons become practical under privacy constraints even though rank-1 accuracy drops due to binarisation.
The computational overhead of thousands of PLDA comparisons is reduced to a feasible level while data stays private.
Systems can meet GDPR-style requirements without accepting the performance penalty of skipping normalisation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same pruning-plus-secure-computation pattern could be tested on other biometric modalities that rely on cohort normalisation.
Further tuning of the binarisation threshold might reduce the acknowledged rank-1 loss while keeping the secure computation tractable.
Real deployments could combine this method with existing homomorphic-encryption pipelines to cover both single comparisons and normalisation.

Load-bearing premise

Binarisation of the voice representations together with the pruning decisions made inside secure multi-party computation still retain enough information for the final normalised scores to be useful.

What would settle it

Running the full speaker recognition pipeline with and without the pruning scheme on the same evaluation set and measuring whether equal-error-rate or rank-n metrics remain within acceptable bounds of standard cohort normalisation.
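The comparison described above needs an equal-error-rate measurement applied identically to both systems. A minimal sketch follows, assuming each system yields two lists of scores (genuine and impostor trials) where higher means more likely the same speaker. The function name and the threshold sweep over observed scores are illustrative choices, not taken from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Approximate EER by sweeping thresholds over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is
    taken at the threshold where the false non-match rate and the false
    match rate are closest, as the mean of the two rates.
    """
    best_gap, eer = float("inf"), 1.0
    for threshold in sorted(set(genuine) | set(impostor)):
        fnmr = sum(s < threshold for s in genuine) / len(genuine)
        fmr = sum(s >= threshold for s in impostor) / len(impostor)
        gap = abs(fnmr - fmr)
        if gap < best_gap:
            best_gap, eer = gap, (fnmr + fmr) / 2
    return eer
```

Running this once on scores from full-cohort normalisation and once on scores from the pruned variant, over the same trial list, gives the EER difference that the test above asks for.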

Figures

Figures reproduced from arXiv: 1907.03454 by Amos Treiber, Andreas Nautsch, Jose Patino, Massimiliano Todisco, Nicholas Evans, Petr Mizera, Themos Stafylakis, Thomas Schneider.

**Figure 1.** BK extraction process from T frames with F-dimensional acoustic features to BKs from a KBM with A anchors for each of the C UBM components. Before setting K KBM elements as True at the sample level, M elements are preselected at the frame level.

**Figure 2.** Our proposed privacy-preserving as-norm protocol with cohort pruning (green dashed area). The red dotted areas indicate that operations are carried out in the encrypted domain and do not leak any information except the decryptable outputs.
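The two-stage selection in the Figure 1 caption (top-M per frame, then top-K per sample) can be sketched as follows. This is a hedged illustration: it assumes the input is already a T x N table of per-frame activations against the N = A x C KBM elements, and that frame-level selections are accumulated as simple counts before the sample-level top-K step. The function name and the count-based accumulation are assumptions, not details confirmed by the caption.

```python
def extract_binary_key(frame_activations, top_m, top_k):
    """Turn per-frame KBM activations into one binary key for the sample.

    frame_activations: list of T rows, each with N activation values.
    top_m: number of KBM elements preselected in every frame.
    top_k: number of KBM elements set to True for the whole sample.
    """
    n = len(frame_activations[0])
    counts = [0] * n
    for frame in frame_activations:
        # Frame level: keep the top_m most activated KBM elements.
        best = sorted(range(n), key=lambda j: frame[j], reverse=True)[:top_m]
        for j in best:
            counts[j] += 1
    # Sample level: the top_k most frequently selected elements become True.
    winners = sorted(range(n), key=lambda j: counts[j], reverse=True)[:top_k]
    key = [False] * n
    for j in winners:
        key[j] = True
    return key
```

Because the result is a fixed-length bit vector, two keys can be compared with bitwise operations alone, which is the property that makes them convenient inputs for secure computation.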

Original abstract

In many voice biometrics applications there is a requirement to preserve privacy, not least because of the recently enforced General Data Protection Regulation (GDPR). Though progress in bringing privacy preservation to voice biometrics is lagging behind developments in other biometrics communities, recent years have seen rapid progress, with secure computation mechanisms such as homomorphic encryption being applied successfully to speaker recognition. Even so, the computational overhead incurred by processing speech data in the encrypted domain is substantial. While still tolerable for single biometric comparisons, most state-of-the-art systems perform some form of cohort-based score normalisation, requiring many thousands of biometric comparisons. The computational overhead is then prohibitive, meaning that one must accept either degraded performance (no score normalisation) or potential for privacy violations. This paper proposes the first computationally feasible approach to privacy-preserving cohort score normalisation. Our solution is a cohort pruning scheme based on secure multi-party computation which enables privacy-preserving score normalisation using probabilistic linear discriminant analysis (PLDA) comparisons. The solution operates upon binary voice representations. While the binarisation is lossy in biometric rank-1 performance, it supports computationally-feasible biometric rank-n comparisons in the encrypted domain.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper sketches a pruning scheme to make cohort normalisation feasible under SMC on binarised PLDA features, but supplies no numbers on whether the normalised scores still improve verification.

Colleague, the main thing to know is that this work proposes a cohort pruning method so privacy-preserving speaker recognition can still use score normalisation without prohibitive encrypted-domain cost. It runs the comparisons via secure multi-party computation on binary voice representations and claims this is the first approach that scales to the thousands of comparisons needed for cohorts. Prior secure-computation work handled single matches but not this volume.

The paper does a clean job stating the GDPR-driven motivation and the exact bottleneck, and it is upfront that binarisation loses rank-1 accuracy while hoping to retain enough for rank-n work. That honesty is useful.

The soft spot is the complete absence of results. The abstract asserts the pruned cohorts still yield useful normalised scores but gives no error rates, no comparison to unnormalised binary baselines, and no check on whether the impostor distribution survives the binarisation and pruning steps. The stress-test note about lost between-speaker variance therefore lands; without data it is impossible to tell whether the normalisation step adds anything once everything is binarised. The protocol construction itself looks free of circular fitting or invented parameters.

This is for people working on practical privacy-preserving biometrics who need a concrete way to keep normalisation without leaking data or accepting huge latency. A reader who already knows the SMC and PLDA literature would get the most from the pruning details. It deserves peer review because the problem is real and the proposed fix is targeted, even though experiments are required before anyone can judge whether the accuracy trade-off is acceptable.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes the first computationally feasible approach to privacy-preserving cohort score normalisation for speaker recognition. It introduces a cohort pruning scheme based on secure multi-party computation (SMC) that operates on binary voice representations, enabling PLDA-based score normalisation in the encrypted domain despite the prohibitive cost of full-cohort comparisons.

Significance. If the SMC pruning decisions preserve sufficient impostor distribution coverage and the binarised PLDA scores retain enough separability to yield useful normalisation gains, the work would address a key practical barrier in encrypted-domain voice biometrics, allowing GDPR-compliant systems to use state-of-the-art normalisation without sacrificing privacy or incurring prohibitive overhead.

major comments (2)

[Abstract] The central feasibility claim, that the pruning scheme makes rank-n comparisons 'computationally-feasible' in the encrypted domain, rests on an unverified assertion; the manuscript supplies no complexity analysis, runtime measurements, or security proofs to substantiate efficiency or correctness of the SMC protocol.
[Abstract] The assumption that binarisation and SMC-based pruning preserve enough between-speaker variance for normalised scores to improve verification utility over the unnormalised binary baseline is unsupported by any quantitative results, separability bounds, or error analysis, which is load-bearing for the motivation of the entire pipeline.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive review and the recommendation for major revision. We address each major comment below, acknowledging where the manuscript requires strengthening, and commit to revisions that will incorporate the requested substantiation without altering the core contributions.

Referee: [Abstract] The central feasibility claim, that the pruning scheme makes rank-n comparisons 'computationally-feasible' in the encrypted domain, rests on an unverified assertion; the manuscript supplies no complexity analysis, runtime measurements, or security proofs to substantiate efficiency or correctness of the SMC protocol.

Authors: We agree that the feasibility claim in the abstract would be strengthened by explicit supporting material. The manuscript describes the SMC-based pruning protocol and its reduction of comparisons to a small pruned cohort, but does not include a dedicated complexity analysis, runtime figures, or formal security argument. In revision we will add a new subsection detailing the communication and computation complexity (linear in pruned cohort size), benchmark runtimes on representative hardware, and a security proof sketch under the semi-honest adversarial model. revision: yes

Referee: [Abstract] The assumption that binarisation and SMC-based pruning preserve enough between-speaker variance for normalised scores to improve verification utility over the unnormalised binary baseline is unsupported by any quantitative results, separability bounds, or error analysis, which is load-bearing for the motivation of the entire pipeline.

Authors: This observation is correct; while the manuscript reports overall system performance, it does not provide direct quantitative evidence (e.g., EER deltas, score-distribution statistics, or separability metrics) demonstrating that normalisation still yields gains after binarisation and pruning relative to the unnormalised binary baseline. We will revise the experimental section to include these comparisons, together with an analysis of between-speaker variance retention and any resulting error bounds. revision: yes

Circularity Check

0 steps flagged

No circularity: new protocol construction with no self-referential derivations

The paper introduces a novel cohort pruning scheme based on secure multi-party computation applied to binary voice representations for privacy-preserving PLDA score normalisation. No equations, predictions, or uniqueness claims reduce by construction to fitted parameters, self-defined quantities, or load-bearing self-citations. The contribution is an engineering protocol whose feasibility and utility rest on external SMC primitives and empirical verification rather than any re-expression of prior fitted results. Binarisation loss is acknowledged explicitly as a trade-off, not hidden via redefinition.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that secure multi-party computation can be applied to PLDA scoring without prohibitive overhead once pruning is introduced, and that binarisation does not invalidate the normalisation benefit. No free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption: Secure multi-party computation protocols exist that correctly compute PLDA scores on binarised features while revealing nothing beyond the final normalised score.
Invoked when the abstract states that the pruning scheme enables privacy-preserving comparisons.
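As an illustration of what "revealing nothing beyond the final result" can look like on binary inputs, here is a toy sketch of a generic textbook construction: additive secret sharing with Beaver-triple multiplication, used to obtain the Hamming distance between two bit vectors. It is not the protocol from the paper. Both parties and a trusted triple dealer are simulated in one process, only semi-honest behaviour is considered, and every name and the modulus `MOD` are assumptions made for the example.

```python
import secrets

MOD = 2 ** 32  # assumed share modulus for this toy example


def share(value):
    """Split value into two additive shares modulo MOD."""
    r = secrets.randbelow(MOD)
    return r, (value - r) % MOD


def reveal(share0, share1):
    """Recombine two additive shares."""
    return (share0 + share1) % MOD


def beaver_triple():
    """Simulated trusted dealer: shares of random a, b and of c = a * b."""
    a, b = secrets.randbelow(MOD), secrets.randbelow(MOD)
    return share(a), share(b), share(a * b % MOD)


def shared_multiply(x, y):
    """Multiply two shared values; only masked differences are opened."""
    (a0, a1), (b0, b1), (c0, c1) = beaver_triple()
    d = reveal((x[0] - a0) % MOD, (x[1] - a1) % MOD)  # x - a, masked by a
    e = reveal((y[0] - b0) % MOD, (y[1] - b1) % MOD)  # y - b, masked by b
    z0 = (c0 + d * b0 + e * a0 + d * e) % MOD  # party 0 adds the public d*e
    z1 = (c1 + d * b1 + e * a1) % MOD
    return z0, z1


def shared_hamming_distance(bits_x, bits_y):
    """Hamming distance of two bit vectors; only the total is revealed."""
    acc0 = acc1 = 0
    for bx, by in zip(bits_x, bits_y):
        x, y = share(bx), share(by)
        p0, p1 = shared_multiply(x, y)
        # For bits, XOR equals x + y - 2xy, evaluated share-wise.
        acc0 = (acc0 + x[0] + y[0] - 2 * p0) % MOD
        acc1 = (acc1 + x[1] + y[1] - 2 * p1) % MOD
    return reveal(acc0, acc1)
```

The per-bit multiplications are the costly part of such a computation, which is one reason a short binary representation and a small pruned cohort matter for feasibility.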

pith-pipeline@v0.9.0 · 5753 in / 1287 out tokens · 17615 ms · 2026-05-25T01:09:04.511305+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean absolute_floor_iff_bare_distinguishability unclear

The solution operates upon binary voice representations. While the binarisation is lossy in biometric rank-1 performance, it supports computationally-feasible biometric rank-n comparisons in the encrypted domain.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages · 1 internal anchor

[1] Introduction. Today there is a growing drive to bring privacy preservation to the realm of speech processing. Following new privacy regulation such as the European GDPR [1], technology to protect sensitive data, including voice data, is attracting the attention of researchers and industrial stakeholders alike. Perhaps the most compelling argument to pres...
[2] Privacy-Preserving Speaker Recognition with Cohort Score Normalisation (internal anchor). Preliminaries and Related Work. There is an extensive body of literature concerning the preservation of privacy in biometrics. Unfortunately, most relates not to speaker recognition, but to other biometric characteristics, e.g. fingerprint, iris, and face recognition [6, 7]. Whatever the characteristic, the requirements for effective privacy preservatio...
[3] mean statistics pooling
[4] top-K activation, top-M activation. Figure 1: BK extraction process from T frames with F-dimensional acoustic features to BKs from a KBM with A anchors for each of the C UBM components. Before setting K KBM elements as True at the sample level, M elements are preselected at the frame level
[5] Binary Key Voice Representations. Binary voice representations have been reported previously in the context of privacy preservation. Cryptobiometric (extraction/binding of cryptographic keys from biometric data) systems based upon the binarisation of GMM-based supervectors are reported in [20, 3]. The work in this paper uses an alternative, more ...
[6] Privacy-Preserving Cohort Pruning. The contribution in this paper is an efficient, privacy-preserving approach to score normalisation. It is based upon cohort pruning using BK speaker representations that allow for efficient computation in the encrypted domain. The use of HE-protected i-vectors here is too slow; unprotected i-vectors are not unlinkable. ...
[7] Experimental Validation. Given the research objective to demonstrate improvements in computational efficiency, rather than improved performance, only brief details of the text-independent speaker recognition system are provided here. It is based on 400-dimensional i-vectors, extracted from conventional acoustic features using time delay deep neural network...
[8] Conclusions. This paper reports the first approach to computationally manageable (yet demanding) privacy-preserving speaker recognition with cohort score normalisation. Prior to this work, the latter was a computational bottleneck for PLDA with Paillier homomorphic encryption, with normalisation strategies that require many thousands of biometric comp...
[9] European Council, “Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation),” April 2016.
[10] J. Portêlo, B. Raj, A. Abad, and I. Trancoso, “Privacy-preserving speaker verification using garbled GMMs,” in Proc. European Signal Processing Conf. (EUSIPCO). IEEE, 2014, pp. 2070–2074.
[11] M. Paulini, C. Rathgeb, A. Nautsch, H. Reichau, H. Reininger, and C. Busch, “Multi-bit allocation: Preparing voice biometrics for template protection,” in Proc. The Speaker and Language Recognition Workshop (Odyssey), 2016, pp. 291–296.
[12] A. Nautsch, S. Isadskiy, J. Kolberg, M. Gomez-Barrero, and C. Busch, “Homomorphic encryption for speaker recognition: Protection of biometric templates and vendor model parameters,” in Proc. The Speaker and Language Recognition Workshop (Odyssey). ISCA, 2018, pp. 16–23.
[13] X. Anguera and J.-F. Bonastre, “A novel speaker binary key derived from anchor models,” in Proc. Annual Conf. of the Intl. Speech Communication Association (INTERSPEECH). ISCA, 2010, pp. 2118–2121.
[14] M. Blanton and P. Gasti, “Secure and efficient protocols for iris and fingerprint identification,” in Proc. European Symposium on Research in Computer Security (ESORICS). Springer, 2011, pp. 190–209.
[15] J. Bringer, H. Chabanne, M. Favre, A. Patey, T. Schneider, and M. Zohner, “GSHADE: faster privacy-preserving distance computation and biometric identification,” in Proc. ACM Workshop on Information Hiding and Multimedia Security (IH&MMSec). ACM, 2014, pp. 187–198.
[16] ISO/IEC JTC1 SC27 Security Techniques, ISO/IEC 24745:2011. Information Technology - Security Techniques - Biometric Information Protection, International Organization for Standardization, 2011.
[17] A. C. Yao, “Protocols for secure computations,” in Proc. Annual Symposium on Foundations of Computer Science (SFCS). IEEE, 1982, pp. 160–164.
[18] O. Goldreich, S. Micali, and A. Wigderson, “How to play any mental game,” in Proc. ACM Symposium on Theory of Computing (STOC). ACM, 1987, pp. 218–229.
[19] M. Hastings, B. Hemenway, D. Noble, and S. Zdancewic, “SoK: General-purpose compilers for secure multi-party computation,” in Proc. IEEE Symposium on Security and Privacy (S&P). IEEE, 2019, full version: https://marsella.github.io/static/mpcsok.pdf
[20] D. Demmler, T. Schneider, and M. Zohner, “ABY - A framework for efficient mixed-protocol secure two-party computation,” in Proc. Network and Distributed System Security Symposium (NDSS). The Internet Society, 2015.
[21] P. Smaragdis and M. Shashanka, “A framework for secure speech recognition,” IEEE Transactions on Audio, Speech, and Language Processing (TASLP), vol. 15, no. 4, pp. 1404–1413, 2007.
[22] M. Pathak and B. Raj, “Privacy-preserving speaker verification and identification using Gaussian mixture models,” IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), vol. 21, no. 2, pp. 397–406, 2013.
[23] M. Aliasgari and M. Blanton, “Secure computation of hidden Markov models,” in Proc. Intl. Conf. on Security and Cryptography (SECRYPT). IEEE, 2013, pp. 1–12.
[24] M. Aliasgari, M. Blanton, and F. Bayatbabolghani, “Secure computation of hidden Markov models and secure floating-point arithmetic in the malicious model,” Intl. Journal of Information Security, vol. 16, no. 6, pp. 577–601, 2017.
[25] S. Kamara and M. Raykova, “Secure outsourced computation in a multi-tenant cloud,” in Proc. IBM Workshop on Cryptography and Security in Clouds, 2011, pp. 15–16.
[26] F. Brasser, T. Frassetto, K. Riedhammer, A.-R. Sadeghi, T. Schneider, and C. Weinert, “VoiceGuard: Secure and private speech processing,” in Proc. Annual Conf. of the Intl. Speech Communication Association (INTERSPEECH). ISCA, 2018, pp. 1303–1307.
[27] F. McKeen, I. Alexandrovich, A. Berenzon, C. V. Rozas, H. Shafi, V. Shanbhogue, and U. R. Savagaonkar, “Innovative instructions and software model for isolated execution,” in Proc. Workshop on Hardware and Architectural Support for Security and Privacy (HASP). ACM, 2013.
[28] S. Billeb, C. Rathgeb, H. Reininger, K. Kasper, and C. Busch, “Biometric template protection for speaker recognition based on universal background models,” IET Biometrics, vol. 4, no. 2, pp. 116–126, 2015.
[29] J.-F. Bonastre, P.-M. Bousquet, D. Matrouf, and X. Anguera, “Discriminant binary data representation for speaker recognition,” in Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2011, pp. 5284–5287.
[30] T. Merlin, J.-F. Bonastre, and C. Fredouille, “Non directly acoustic process for costless speaker recognition and indexation,” in Proc. Intl. Workshop on Intelligent Communication Technologies and Applications, vol. 29, 1999.
[31] Y. Mami and D. Charlet, “Speaker identification by location in an optimal space of anchor models,” in Proc. Intl. Conf. on Spoken Language Processing (ICSLP), 2002.
[32] J. Luque and X. Anguera, “On the modeling of natural vocal emotion expressions through binary key,” in Proc. European Signal Processing Conference (EUSIPCO). IEEE, 2014, pp. 1562–1566.
[33] J. Patino, H. Delgado, and N. Evans, “Speaker change detection using binary key modelling with contextual information,” in Proc. Intl. Conf. on Statistical Language and Speech Processing (ICSLP). Springer, 2017, pp. 250–261.
[34] H. Delgado, X. Anguera, C. Fredouille, and J. Serrano, “Fast single- and cross-show speaker diarization using binary key speaker modeling,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 12, pp. 2286–2297, 2015.
[35] J. Patino, H. Delgado, and N. Evans, “The EURECOM submission to the first DIHARD challenge,” in Proc. Annual Conf. of the Intl. Speech Communication Association (INTERSPEECH). ISCA, 2018, pp. 2813–2817.
[36] A. Mtibaa, D. Petrovska-Delacretaz, and A. B. Hamida, “Cancelable speaker verification system based on binary Gaussian mixtures,” in Proc. Advanced Technologies for Signal and Image Processing (ATSIP), 2018, pp. 1–6.
[37] N. Dehak, P. J. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, “Front-end factor analysis for speaker verification,” IEEE Transactions on Audio, Speech, and Language Processing (TASLP), vol. 19, no. 4, pp. 788–798, 2011.
[38] E. Barker, “NIST special publication 800–57 part 1, revision 4,” 2016.
[39] J. Boyar and R. Peralta, “The exact multiplicative complexity of the Hamming weight function,” in Proc. Electronic Colloquium on Computational Complexity (ECCC), 2005.
[40] K. Järvinen, H. Leppäkoski, E. S. Lohan, P. Richter, T. Schneider, O. Tkachenko, and Z. Yang, “PILOT: Practical privacy-preserving Indoor Localization using OuTsourcing,” in Proc. IEEE European Symposium on Security and Privacy (EuroS&P). IEEE, 2019, to appear. Preliminary version: https://encrypto.de/papers/JLLRSTY19.pdf
[41] D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz, J. Silovsky, G. Stemmer, and K. Vesely, “The Kaldi speech recognition toolkit,” in IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. IEEE Signal Processing Society, Dec. 2011, IEEE Catalog No.: CFP11SRW-USB.

[1] [1]

Introduction Today there is a growing drive to bring privacy preservation to the realm of speech processing. Following new privacy regu- lation such as the European GDPR [1], technology to protect sensitive data, including voice data, is attracting the attention of researchers and industrial stakeholders alike. Perhaps the most compelling argument to pres...

work page

[2] [2]

Privacy-Preserving Speaker Recognition with Cohort Score Normalisation

Preliminaries and Related Work There is an extensive body of literature concerning the preser- vation of privacy in biometrics. Unfortunately, most relates not to speaker recognition, but to other biometric characteristics, e.g. ﬁngerprint, iris, and face recognition [6, 7]. Whatever the characteristic, the requirements for effective privacy preserva- tio...

work page internal anchor Pith review Pith/arXiv arXiv 1907

[3] [3]

mean statistics pooling

work page

[4] [4]

Before setting K KBM elements as True at the sample level,M elements are pre- selected at the frame level

top-K activation top-M activation Figure 1: BK extraction process from T frames with F - dimensional acoustic features to BKs from a KBM with A an- chors for each of the C UBM components. Before setting K KBM elements as True at the sample level,M elements are pre- selected at the frame level

work page

[5] [5]

Cryptobiometric (extrac- tion/binding of cryptographic keys from biometric data) 3 sys- tems based upon the binarisation4 of GMM-based supervectors are reported in [20, 3]

Binary Key V oice Representations Binary voice representations have been reported previously in the context of privacy preservation. Cryptobiometric (extrac- tion/binding of cryptographic keys from biometric data) 3 sys- tems based upon the binarisation4 of GMM-based supervectors are reported in [20, 3]. The work in this paper uses an alter- native, more ...

work page

[6] [6]

It is based upon cohort prun- ing using BK speaker representations that allow for efﬁcient computation in the encrypted domain

Privacy-Preserving Cohort Pruning The contribution in this paper is an efﬁcient, privacy-preserving approach to score normalisation. It is based upon cohort prun- ing using BK speaker representations that allow for efﬁcient computation in the encrypted domain. The use of HE-protected i-vectors here is too slow; unprotected i-vectors are not unlink- able. ...

work page 2030

[7] [7]

It is based on 400-dimensional i- vectors, extracted from conventional acoustic features using time delay deep neural network (TDNN) for estimating UBM posteriors

Experimental Validation Given the research objective to demonstrate improvements in computational efﬁciency, rather than improved performance, only brief details of the text-independent speaker recognition system are provided here. It is based on 400-dimensional i- vectors, extracted from conventional acoustic features using time delay deep neural network...

work page arXiv 2048

[8] Conclusions. This paper reports the first approach to computationally manageable (yet demanding) privacy-preserving speaker recognition with cohort score normalisation. Prior to this work, the latter was a computational bottleneck for PLDA with Paillier homomorphic encryption, with normalisation strategies that require many thousands of biometric comp...

[9] European Council, “Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation),” April 2016.

[10] J. Portêlo, B. Raj, A. Abad, and I. Trancoso, “Privacy-preserving speaker verification using garbled GMMs,” in Proc. European Signal Processing Conf. (EUSIPCO). IEEE, 2014, pp. 2070–2074.

[11] M. Paulini, C. Rathgeb, A. Nautsch, H. Reichau, H. Reininger, and C. Busch, “Multi-bit allocation: Preparing voice biometrics for template protection,” in Proc. The Speaker and Language Recognition Workshop (Odyssey), 2016, pp. 291–296.

[12] A. Nautsch, S. Isadskiy, J. Kolberg, M. Gomez-Barrero, and C. Busch, “Homomorphic encryption for speaker recognition: Protection of biometric templates and vendor model parameters,” in Proc. The Speaker and Language Recognition Workshop (Odyssey). ISCA, 2018, pp. 16–23.

[13] X. Anguera and J.-F. Bonastre, “A novel speaker binary key derived from anchor models,” in Proc. Annual Conf. of the Intl. Speech Communication Association (INTERSPEECH). ISCA, 2010, pp. 2118–2121.

[14] M. Blanton and P. Gasti, “Secure and efficient protocols for iris and fingerprint identification,” in Proc. European Symposium on Research in Computer Security (ESORICS). Springer, 2011, pp. 190–209.

[15] J. Bringer, H. Chabanne, M. Favre, A. Patey, T. Schneider, and M. Zohner, “GSHADE: faster privacy-preserving distance computation and biometric identification,” in Proc. ACM Workshop on Information Hiding and Multimedia Security (IH&MMSec). ACM, 2014, pp. 187–198.

[16] ISO/IEC JTC1 SC27 Security Techniques, ISO/IEC 24745:2011. Information Technology - Security Techniques - Biometric Information Protection, International Organization for Standardization, 2011.

[17] A. C. Yao, “Protocols for secure computations,” in Proc. Annual Symposium on Foundations of Computer Science (SFCS). IEEE, 1982, pp. 160–164.

[18] O. Goldreich, S. Micali, and A. Wigderson, “How to play any mental game,” in Proc. ACM Symposium on Theory of Computing (STOC). ACM, 1987, pp. 218–229.

[19] M. Hastings, B. Hemenway, D. Noble, and S. Zdancewic, “SoK: General-purpose compilers for secure multi-party computation,” in Proc. IEEE Symposium on Security and Privacy (S&P). IEEE, 2019, full version: https://marsella.github.io/static/mpcsok.pdf

[20] D. Demmler, T. Schneider, and M. Zohner, “ABY - A framework for efficient mixed-protocol secure two-party computation,” in Proc. Network and Distributed System Security Symposium (NDSS). The Internet Society, 2015.

[21] P. Smaragdis and M. Shashanka, “A framework for secure speech recognition,” IEEE Transactions on Audio, Speech, and Language Processing (TASLP), vol. 15, no. 4, pp. 1404–1413, 2007.

[22] M. Pathak and B. Raj, “Privacy-preserving speaker verification and identification using Gaussian mixture models,” IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), vol. 21, no. 2, pp. 397–406, 2013.

[23] M. Aliasgari and M. Blanton, “Secure computation of hidden Markov models,” in Proc. Intl. Conf. on Security and Cryptography (SECRYPT). IEEE, 2013, pp. 1–12.

[24] M. Aliasgari, M. Blanton, and F. Bayatbabolghani, “Secure computation of hidden Markov models and secure floating-point arithmetic in the malicious model,” Intl. Journal of Information Security, vol. 16, no. 6, pp. 577–601, 2017.

[25] S. Kamara and M. Raykova, “Secure outsourced computation in a multi-tenant cloud,” in Proc. IBM Workshop on Cryptography and Security in Clouds, 2011, pp. 15–16.

[26] F. Brasser, T. Frassetto, K. Riedhammer, A.-R. Sadeghi, T. Schneider, and C. Weinert, “VoiceGuard: Secure and private speech processing,” in Proc. Annual Conf. of the Intl. Speech Communication Association (INTERSPEECH). ISCA, 2018, pp. 1303–1307.

[27] F. McKeen, I. Alexandrovich, A. Berenzon, C. V. Rozas, H. Shafi, V. Shanbhogue, and U. R. Savagaonkar, “Innovative instructions and software model for isolated execution,” in Proc. Workshop on Hardware and Architectural Support for Security and Privacy (HASP). ACM, 2013.

[28] S. Billeb, C. Rathgeb, H. Reininger, K. Kasper, and C. Busch, “Biometric template protection for speaker recognition based on universal background models,” IET Biometrics, vol. 4, no. 2, pp. 116–126, 2015.

[29] J.-F. Bonastre, P.-M. Bousquet, D. Matrouf, and X. Anguera, “Discriminant binary data representation for speaker recognition,” in Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2011, pp. 5284–5287.

[30] T. Merlin, J.-F. Bonastre, and C. Fredouille, “Non directly acoustic process for costless speaker recognition and indexation,” in Proc. Intl. Workshop on Intelligent Communication Technologies and Applications, vol. 29, 1999.

[31] Y. Mami and D. Charlet, “Speaker identification by location in an optimal space of anchor models,” in Proc. Intl. Conf. on Spoken Language Processing (ICSLP), 2002.

[32] J. Luque and X. Anguera, “On the modeling of natural vocal emotion expressions through binary key,” in Proc. European Signal Processing Conference (EUSIPCO). IEEE, 2014, pp. 1562–1566.

[33] J. Patino, H. Delgado, and N. Evans, “Speaker change detection using binary key modelling with contextual information,” in Proc. Intl. Conf. on Statistical Language and Speech Processing (ICSLP). Springer, 2017, pp. 250–261.

[34] H. Delgado, X. Anguera, C. Fredouille, and J. Serrano, “Fast single- and cross-show speaker diarization using binary key speaker modeling,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 12, pp. 2286–2297, 2015.

[35] J. Patino, H. Delgado, and N. Evans, “The EURECOM submission to the first DIHARD challenge,” in Proc. Annual Conf. of the Intl. Speech Communication Association (INTERSPEECH). ISCA, 2018, pp. 2813–2817.

[36] A. Mtibaa, D. Petrovska-Delacretaz, and A. B. Hamida, “Cancelable speaker verification system based on binary Gaussian mixtures,” in Proc. Advanced Technologies for Signal and Image Processing (ATSIP), 2018, pp. 1–6.

[37] N. Dehak, P. J. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, “Front-end factor analysis for speaker verification,” IEEE Transactions on Audio, Speech, and Language Processing (TASLP), vol. 19, no. 4, pp. 788–798, 2011.

[38] E. Barker, “NIST special publication 800–57 part 1, revision 4,” 2016.

[39] J. Boyar and R. Peralta, “The exact multiplicative complexity of the Hamming weight function,” in Proc. Electronic Colloquium on Computational Complexity (ECCC), 2005.

[40] K. Järvinen, H. Leppäkoski, E. S. Lohan, P. Richter, T. Schneider, O. Tkachenko, and Z. Yang, “PILOT: Practical privacy-preserving Indoor Localization using OuTsourcing,” in Proc. IEEE European Symposium on Security and Privacy (EuroS&P). IEEE, 2019, to appear. Preliminary version: https://encrypto.de/papers/JLLRSTY19.pdf

[41] D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz, J. Silovsky, G. Stemmer, and K. Vesely, “The Kaldi speech recognition toolkit,” in IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. IEEE Signal Processing Society, Dec. 2011, IEEE Catalog No.: CFP11SRW-USB.