Privacy-Preserving Product-Quantized Approximate Nearest Neighbor Search Framework for Large-scale Datasets via A Hybrid of Fully Homomorphic Encryption and Trusted Execution Environment
Pith reviewed 2026-05-10 04:52 UTC · model grok-4.3
The pith
A hybrid of fully homomorphic encryption and trusted execution environments combined with product quantization enables privacy-preserving approximate nearest neighbor search on million-scale datasets with practical performance.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
PPPQ-ANN provides a multi-layered security structure for vectors based on a hybrid of FHE and TEE. It minimizes FHE ciphertext computations by combining Product-Quantization with optimized data packing. On million-scale datasets it achieves database generation in less than 2 hours and more than 50 QPS in a sequential search while preserving privacy.
What carries the argument
The PPPQ-ANN architecture, which layers product quantization and optimized packing inside a hybrid of fully homomorphic encryption and trusted execution environments to protect vectors during both index building and search.
If this is right
- Confidential vector collections of million-scale size become usable for nearest-neighbor tasks without exposing raw embeddings.
- Database construction finishes in less than two hours, making repeated indexing of private data feasible.
- Sequential search sustains more than 50 queries per second, supporting moderate real-time workloads.
- The security-performance trade-off is shifted enough to support deployment in regulated domains that handle sensitive similarity data.
Where Pith is reading between the lines
- The same hybrid pattern could be applied to other vector operations such as secure clustering or private recommendation retrieval.
- Hardware improvements in trusted execution environments would directly increase the achievable query rate without altering the encryption layer.
- Further packing refinements might reduce the number of FHE operations still further, extending the approach to billion-scale collections.
Load-bearing premise
The hybrid FHE and TEE architecture together with product quantization and optimized packing actually supplies both the stated privacy guarantees and the measured performance without hidden attack surfaces or unstated slowdowns.
What would settle it
A concrete embedding-inversion or membership-inference attack that succeeds against the deployed PPPQ-ANN system, or a timing measurement showing database generation exceeding two hours or search throughput dropping below 50 QPS on a million-scale collection, would falsify the central performance and security claims.
Figures
read the original abstract
A nearest-neighbor framework is a fundamental tool for various applications involving Large Language Models (LLMs) and Visual Language Models (VLMs). Vectors used for nearest-neighbor searches have richer information for similarity searches. This information leads to security risks, such as embedding inversion and membership attacks. Therefore, Privacy-Preserving Approximate Nearest-Neighbor (PP-ANN) approaches are necessary for highly confidential data. However, conventional PP-ANN approaches based on a Trusted Execution Environment (TEE) or Fully Homomorphic Encryption (FHE) do not achieve practical security or performance. Additionally, conventional approaches focus on the search process rather than database generation for nearest-neighbor. To address these issues, we propose a Privacy-Preserving Product-Quantization Approximate Nearest Neighbor (PPPQ-ANN) framework. PPPQ-ANN provides a multi-layered security structure for vectors based on a hybrid of FHE and TEE. Additionally, PPPQ-ANN minimizes FHE ciphertext computations by combining Product-Quantization (PQ) with optimized data packing. We demonstrate the performance of PPPQ-ANN on million-scale datasets. As a result, PPPQ-ANN achieves database generation in less than 2 hours and more than 50 QPS in a sequential search while preserving privacy. Therefore, PPPQ-ANN optimizes the trade-off between security and performance by utilizing a hybrid of FHE and TEE, achieving practical performance while preserving privacy.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes PPPQ-ANN, a hybrid privacy-preserving approximate nearest-neighbor search framework that integrates product quantization (PQ) with a combination of fully homomorphic encryption (FHE) and trusted execution environments (TEE). It minimizes FHE operations via optimized packing of PQ codes and claims to deliver practical performance on million-scale datasets: database generation in under 2 hours and sequential search exceeding 50 QPS, while protecting against embedding inversion and membership attacks.
Significance. If the performance numbers and security guarantees are substantiated, the hybrid FHE+TEE+PQ design would offer a concrete improvement over pure-FHE or pure-TEE baselines for vector search in sensitive LLM/VLM pipelines. The emphasis on database-generation cost rather than only query latency is a useful shift, and the packing optimizations could be reusable in other encrypted vector workloads.
major comments (3)
- [Abstract, §4] Abstract and §4 (Experimental Evaluation): the headline claims (<2 h database generation, >50 QPS sequential search on 1 M-scale data) are stated without any description of hardware platform, dataset statistics, baseline implementations (e.g., plain TEE, pure FHE, or prior PP-ANN systems), or ablation results on the packing scheme. Without these, it is impossible to judge whether the reported numbers reflect a genuine advance or depend on unstated implementation advantages.
- [Security Analysis] Security Analysis section (presumably §3 or §6): the multi-layered security argument rests on the assumption that TEE isolation is perfect and that the chosen FHE packing of PQ sub-vectors leaks nothing usable beyond the final ANN result. No reduction to standard FHE/TEE assumptions, no formal leakage analysis, and no empirical attack evaluations (membership inference, codebook recovery, or embedding inversion) are described. These omissions make the privacy claims unverifiable at the reported scale.
- [§3] §3 (Proposed Method): the description of how PQ codebooks and packed ciphertexts are generated inside the hybrid architecture is high-level only. It is unclear whether the codebook itself is treated as public or secret, how the TEE-FHE boundary is crossed for sub-vector operations, and whether any side-channel leakage arises from the interleaving of FHE calls and TEE offloads. These details are load-bearing for both the security and performance claims.
minor comments (2)
- [Abstract] The abstract contains minor grammatical issues (e.g., “A nearest-neighbor framework is a fundamental tool” should be “Nearest-neighbor search is a fundamental tool”).
- [§3] Notation for the hybrid architecture (FHE ciphertext sizes, PQ code lengths, packing factors) is introduced without a consolidated table or consistent symbols across sections.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point-by-point below, indicating where we will revise the manuscript to improve clarity, reproducibility, and rigor while preserving the core contributions of the hybrid FHE+TEE+PQ design.
read point-by-point responses
-
Referee: [Abstract, §4] Abstract and §4 (Experimental Evaluation): the headline claims (<2 h database generation, >50 QPS sequential search on 1 M-scale data) are stated without any description of hardware platform, dataset statistics, baseline implementations (e.g., plain TEE, pure FHE, or prior PP-ANN systems), or ablation results on the packing scheme. Without these, it is impossible to judge whether the reported numbers reflect a genuine advance or depend on unstated implementation advantages.
Authors: We agree that the experimental claims require explicit supporting details for proper evaluation. In the revised version we will expand §4 with: (i) hardware specifications (CPU model, TEE enclave type, FHE library and parameters), (ii) full dataset statistics (vector dimensionality, cardinality, source), (iii) performance tables comparing against plain TEE, pure FHE, and relevant prior PP-ANN systems, and (iv) ablation results isolating the contribution of the optimized PQ packing. The reported <2 h and >50 QPS figures were measured on our testbed; adding these elements will make the evaluation transparent and reproducible. revision: yes
-
Referee: [Security Analysis] Security Analysis section (presumably §3 or §6): the multi-layered security argument rests on the assumption that TEE isolation is perfect and that the chosen FHE packing of PQ sub-vectors leaks nothing usable beyond the final ANN result. No reduction to standard FHE/TEE assumptions, no formal leakage analysis, and no empirical attack evaluations (membership inference, codebook recovery, or embedding inversion) are described. These omissions make the privacy claims unverifiable at the reported scale.
Authors: The current security argument relies on the standard TEE isolation model (no enclave compromise) and FHE semantic security, with the hybrid design confining sensitive operations (codebook generation, sub-vector handling) to the TEE while using FHE only for encrypted distance computation on packed codes. We acknowledge the absence of a formal leakage function and empirical attack results. In revision we will add a leakage analysis, a sketch reducing the claims to standard assumptions, and empirical evaluations of membership inference and embedding inversion on representative subsets (full million-scale exhaustive attacks are resource-intensive but we will report proof-of-concept results). revision: partial
-
Referee: [§3] §3 (Proposed Method): the description of how PQ codebooks and packed ciphertexts are generated inside the hybrid architecture is high-level only. It is unclear whether the codebook itself is treated as public or secret, how the TEE-FHE boundary is crossed for sub-vector operations, and whether any side-channel leakage arises from the interleaving of FHE calls and TEE offloads. These details are load-bearing for both the security and performance claims.
Authors: We will revise §3 to supply the missing operational details: codebooks are generated and stored exclusively inside the TEE and never leave it; we will include a precise description (with pseudocode and a diagram) of the TEE-FHE boundary crossings for sub-vector packing and distance computation; and we will discuss potential side-channel vectors (timing, memory access patterns) together with the mitigations employed. These additions will make the architecture concrete and directly support the security and performance arguments. revision: yes
Circularity Check
No circularity; empirical performance claims without derivation chain
full rationale
The paper presents an engineering framework for privacy-preserving ANN using a hybrid of FHE and TEE combined with product quantization, evaluated empirically on million-scale datasets for database generation time (<2 hours) and query throughput (>50 QPS). No equations, mathematical derivations, fitted parameters, predictions, or uniqueness theorems appear in the provided text or abstract. Central claims rest on reported runtime measurements rather than any reduction to self-referential definitions or self-citation chains. The work is therefore self-contained as an empirical contribution with no load-bearing circular steps.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Retrieval-augmented generation for knowledge-intensive nlp tasks,
P. Lewis, E. Perez, A. Piktus, F. Petroni, V . Karpukhin, N. Goyal, H. K¨uttler, M. Lewis, W.-t. Yih, T. Rockt¨aschel, S. Riedel, and D. Kiela, “Retrieval-augmented generation for knowledge-intensive nlp tasks,” inAdvances in Neural Information Processing Systems(H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, eds.), vol. 33, pp. 9459–9474, ...
work page 2020
-
[2]
K. Sawarkar, A. Mangal, and S. R. Solanki, “Blended rag: Improving rag (retriever-augmented generation) accuracy with semantic search and hybrid query-based retrievers,” in2024 IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR), pp. 155–161, 2024. 13 PPPQ-ANNA PREPRINT
work page 2024
-
[3]
Learning transferable visual models from natural language supervision,
A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, and I. Sutskever, “Learning transferable visual models from natural language supervision,” 2021
work page 2021
-
[4]
Gemini embedding: Generalizable embeddings from gemini,
J. Lee, F. Chen, S. Dua, D. Cer, M. Shanbhogue, I. Naim, G. H. ´Abrego, Z. Li, K. Chen, H. S. Vera, X. Ren, S. Zhang, D. Salz, M. Boratko, J. Han, B. Chen, S. Huang, V . Rao, P. Suganthan, F. Han, A. Doumanoglou, N. Gupta, F. Moiseev, C. Yip, A. Jain, S. Baumgartner, S. Shahi, F. P. Gomez, S. Mariserla, M. Choi, P. Shah, S. Goenka, K. Chen, Y . Xia, K. Ch...
work page 2025
-
[5]
H. Li, M. Xu, and Y . Song, “Sentence embedding leaks more information than you expect: Generative embedding inversion attack to recover the whole sentence,” inFindings of the Association for Computational Linguistics: ACL 2023(A. Rogers, J. Boyd-Graber, and N. Okazaki, eds.), (Toronto, Canada), pp. 14022–14040, Association for Computational Linguistics, ...
work page 2023
-
[6]
Universal zero-shot embedding inversion,
C. Zhang, J. X. Morris, and V . Shmatikov, “Universal zero-shot embedding inversion,” 2025
work page 2025
-
[7]
How private are dna embeddings? inverting foundation model representa- tions of genomic sequences,
S. Ouaari, J. Kreuer, and N. Pfeifer, “How private are dna embeddings? inverting foundation model representa- tions of genomic sequences,” 2026
work page 2026
-
[8]
Membership inference attacks against machine learning models via prediction sensitivity,
L. Liu, Y . Wang, G. Liu, K. Peng, and C. Wang, “Membership inference attacks against machine learning models via prediction sensitivity,”IEEE Transactions on Dependable and Secure Computing, vol. 20, no. 3, pp. 2341– 2347, 2023
work page 2023
-
[9]
Are attribute inference attacks just imputation?,
B. Jayaraman and D. Evans, “Are attribute inference attacks just imputation?,” inProceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, CCS ’22, (New York, NY , USA), p. 1569–1582, Association for Computing Machinery, 2022
work page 2022
-
[10]
Information leakage in embedding models,
C. Song and A. Raghunathan, “Information leakage in embedding models,” inProceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, CCS ’20, (New York, NY , USA), p. 377–390, Association for Computing Machinery, 2020
work page 2020
-
[11]
Y .-H. Huang, Y . Tsai, H. Hsiao, H.-Y . Lin, and S.-D. Lin, “Transferable embedding inversion attack: Uncovering privacy risks in text embeddings without model queries,” inProceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)(L.-W. Ku, A. Martins, and V . Srikumar, eds.), (Bangkok, Thailand), pp. ...
work page 2024
-
[12]
Y . Wakai, T. Shibahara, R. Okada, T. Miura, and H. Kinoshita, “Understanding privacy risks of large language models in japanese based on training data extraction attacks,” inProceedings of the Workshop on Privacy in Large Language Models (LLM) and Natural Language Processing (NLP) 2025, LM-SHIELD ’25, (New York, NY , USA), p. 25–31, Association for Compu...
work page 2025
-
[13]
Secure multi-party computation for machine learning: A survey,
I. Zhou, F. Tofigh, M. Piccardi, M. Abolhasan, D. Franklin, and J. Lipman, “Secure multi-party computation for machine learning: A survey,”IEEE Access, vol. 12, pp. 53881–53899, 2024
work page 2024
- [14]
-
[15]
Trusted execution environment: What it is, and what it is not,
M. Sabt, M. Achemlal, and A. Bouabdallah, “Trusted execution environment: What it is, and what it is not,” in 2015 IEEE Trustcom/BigDataSE/ISPA, vol. 1, pp. 57–64, 2015
work page 2015
-
[16]
Tfhe: fast fully homomorphic encryption over the torus,
I. Chillotti, N. Gama, M. Georgieva, and M. Izabach `ene, “Tfhe: fast fully homomorphic encryption over the torus,”Journal of Cryptology, vol. 33, no. 1, pp. 34–91, 2020
work page 2020
-
[17]
Homomorphic encryption for arithmetic of approximate numbers,
J. H. Cheon, A. Kim, M. Kim, and Y . Song, “Homomorphic encryption for arithmetic of approximate numbers,” inAdvances in Cryptology – ASIACRYPT 2017(T. Takagi and T. Peyrin, eds.), (Cham), pp. 409–437, Springer International Publishing, 2017
work page 2017
-
[18]
SANNS: Scaling up secure approximate k-Nearest neighbors search,
H. Chen, I. Chillotti, Y . Dong, O. Poburinnaya, I. Razenshteyn, and M. S. Riazi, “SANNS: Scaling up secure approximate k-Nearest neighbors search,” in29th USENIX Security Symposium (USENIX Security 20), pp. 2111– 2128, USENIX Association, Aug. 2020
work page 2020
-
[19]
Nearest neighbour search over encrypted data using intel sgx,
K. W. Ahmed, M. M. A. Aziz, M. N. Sadat, D. Alhadidi, and N. Mohammed, “Nearest neighbour search over encrypted data using intel sgx,”Journal of Information Security and Applications, vol. 54, p. 102579, 2020
work page 2020
-
[20]
A. Boldyreva and T. Tang, “Privacy-preserving approximate k-nearest-neighbors search that hides access, query and volume patterns,”Proceedings on Privacy Enhancing Technologies, vol. 2021, no. 4, pp. 549–574, 2021
work page 2021
-
[21]
GraSS: Graph-based similarity search on encrypted query
D. Kim, Y . Nam, W. Wang, H. Gong, I. Bhati, R. Cammarota, T. S. Rosing, M. Tepper, and T. L. Willke, “GraSS: Graph-based similarity search on encrypted query.” Cryptology ePrint Archive, Paper 2024/2012, 2024
work page 2024
-
[22]
A survey on the (in)security of trusted execution environments,
A. Mu ˜noz, R. R´ıos, R. Rom´an, and J. L ´opez, “A survey on the (in)security of trusted execution environments,” Computers & Security, vol. 129, p. 103180, 2023. 14 PPPQ-ANNA PREPRINT
work page 2023
-
[23]
Crypten: Secure multi-party computation meets machine learning,
B. Knott, S. Venkataraman, A. Hannun, S. Sengupta, M. Ibrahim, and L. van der Maaten, “Crypten: Secure multi-party computation meets machine learning,” inarXiv 2109.00984, 2021
-
[24]
Gazelle: a low latency framework for secure neural network inference,
C. Juvekar, V . Vaikuntanathan, and A. Chandrakasan, “Gazelle: a low latency framework for secure neural network inference,” inProceedings of the 27th USENIX Conference on Security Symposium, SEC’18, (USA), p. 1651–1668, USENIX Association, 2018
work page 2018
-
[25]
Guarantee: Towards attestable and private ml with cca,
S. Siby, S. Abdollahi, M. Maheri, M. Kogias, and H. Haddadi, “Guarantee: Towards attestable and private ml with cca,” inProceedings of the 4th Workshop on Machine Learning and Systems, EuroMLSys ’24, (New York, NY , USA), p. 1–9, Association for Computing Machinery, 2024
work page 2024
-
[26]
F. Kato, Y . Cao, and M. Yoshikawa, “Olive: Oblivious federated learning on trusted execution environment against the risk of sparsification,”Proc. VLDB Endow., vol. 16, p. 2404–2417, June 2023
work page 2023
-
[27]
Plinius: Secure and persistent machine learning model train- ing,
P. Yuhala, P. Felber, V . Schiavoni, and A. Tchana, “Plinius: Secure and persistent machine learning model train- ing,” in2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 52–62, 2021
work page 2021
-
[28]
Fast but approximate homomorphic k-means based on masking technique,
L. Rovida, “Fast but approximate homomorphic k-means based on masking technique,”Int. J. Inf. Secur., vol. 22, p. 1605–1619, June 2023
work page 2023
-
[29]
Secure low-complexity k-mcmc for large-scale datasets with fully ho- momorphic encryption,
S. Shozo, K. Minoru, and A. Hirohisa, “Secure low-complexity k-mcmc for large-scale datasets with fully ho- momorphic encryption,” inProceedings of 2025 10th International Conference on Information and Network Technologies, pp. 47–53, IEEE, 2025
work page 2025
-
[30]
Training encrypted neural networks on encrypted data with fully homo- morphic encryption,
L. Colombo, A. Falcetta, and M. Roveri, “Training encrypted neural networks on encrypted data with fully homo- morphic encryption,” inProceedings of the 12th Workshop on Encrypted Computing & Applied Homomorphic Cryptography, W AHC ’24, (New York, NY , USA), p. 64–75, Association for Computing Machinery, 2024
work page 2024
-
[31]
Malware guard extension: abusing intel sgx to conceal cache attacks,
M. Schwarz, S. Weiser, D. Gruss, C. Maurice, and S. Mangard, “Malware guard extension: abusing intel sgx to conceal cache attacks,”Cybersecurity, vol. 3, p. 2, Jan 2020
work page 2020
-
[32]
Confidential computing and related technologies: a critical review,
M. U. Sardar and C. Fetzer, “Confidential computing and related technologies: a critical review,”Cybersecurity, vol. 6, p. 10, May 2023
work page 2023
-
[33]
Towards a practical cluster analysis over encrypted data,
J. H. Cheon, D. Kim, and J. H. Park, “Towards a practical cluster analysis over encrypted data,” inSelected Areas in Cryptography – SAC 2019: 26th International Conference, Waterloo, ON, Canada, August 12–16, 2019, Revised Selected Papers, (Berlin, Heidelberg), p. 227–249, Springer-Verlag, 2019
work page 2019
-
[34]
Fully homomorphic encryption using ideal lattices,
C. Gentry, “Fully homomorphic encryption using ideal lattices,” inProceedings of the forty-first annual ACM symposium on Theory of computing, pp. 169–178, 2009
work page 2009
-
[35]
(leveled) fully homomorphic encryption without bootstrap- ping,
Z. Brakerski, C. Gentry, and V . Vaikuntanathan, “(leveled) fully homomorphic encryption without bootstrap- ping,” inProceedings of the 3rd Innovations in Theoretical Computer Science Conference, ITCS ’12, (New York, NY , USA), p. 309–325, Association for Computing Machinery, 2012
work page 2012
-
[36]
Somewhat practical fully homomorphic encryption,
J. Fan and F. Vercauteren, “Somewhat practical fully homomorphic encryption,”Cryptology ePrint Archive, 2012
work page 2012
-
[37]
C. Gentry, A. Sahai, and B. Waters, “Homomorphic encryption from learning with errors: Conceptually-simpler, asymptotically-faster, attribute-based,” inAdvances in Cryptology – CRYPTO 2013(R. Canetti and J. A. Garay, eds.), (Berlin, Heidelberg), pp. 75–92, Springer Berlin Heidelberg, 2013
work page 2013
-
[38]
Product quantization for nearest neighbor search,
H. J ´egou, M. Douze, and C. Schmid, “Product quantization for nearest neighbor search,”IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 1, pp. 117–128, 2011
work page 2011
-
[39]
Pqk-means: Billion-scale clustering for product-quantized codes,
Y . Matsui, K. Ogaki, T. Yamasaki, and K. Aizawa, “Pqk-means: Billion-scale clustering for product-quantized codes,” inProceedings of the 25th ACM International Conference on Multimedia, MM ’17, (New York, NY , USA), p. 1725–1733, Association for Computing Machinery, 2017
work page 2017
-
[40]
Hra-secure homomorphic lattice-based proxy re-encryption with tight security,
A. Cohen, D. B. Cousins, N. Genise, E. Kline, Y . Polyakov, and S. RV , “Hra-secure homomorphic lattice-based proxy re-encryption with tight security,”IACR Cryptol. ePrint Arch., p. 681, 2024
work page 2024
-
[41]
Glove: Global vectors for word representation.,
J. Pennington, R. Socher, and C. D. Manning, “Glove: Global vectors for word representation.,” inEMNLP, vol. 14, pp. 1532–1543, 2014
work page 2014
-
[42]
Openfhe: Open-source fully homomorphic encryption library,
A. Al Badawi, J. Bates, F. Bergamaschi, D. B. Cousins, S. Erabelli, N. Genise, S. Halevi, H. Hunt, A. Kim, Y . Lee, Z. Liu, D. Micciancio, I. Quah, Y . Polyakov, S. R.V ., K. Rohloff, J. Saylor, D. Suponitsky, M. Triplett, V . Vaikuntanathan, and V . Zucca, “Openfhe: Open-source fully homomorphic encryption library,” inProceedings of the 10th Workshop on ...
work page 2022
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.