VIPIR: A Versatile GPU Framework for Integrating Private Information Retrieval Protocols
Pith reviewed 2026-06-27 09:49 UTC · model grok-4.3
The pith
VIPIR partitions PIR protocols into two complementary categories, then combines them into GPU-accelerated hybrid protocols with ExpPack compression to raise throughput while cutting communication and memory overheads.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
VIPIR establishes that state-of-the-art PIR protocols divide into two categories whose limitations in throughput, memory, and bandwidth are complementary. Two new hybrid protocols combine elements from both categories, augmented by expansion-based ring packing for data compression and tensor-core execution for database multiplication, yielding orders-of-magnitude throughput gains and lower overheads that make large-scale PIR practical.
What carries the argument
The unified analytic model that partitions PIR protocols into two categories with complementary limits, together with expansion-based ring packing (ExpPack) that enables parallel compression with minimal communication.
If this is right
- Hybrid protocols can be assembled on demand from the two categories to balance computation against memory and communication.
- ExpPack compression delivers high parallelism and low communication when mapped to GPU execution.
- Tensor-core reinterpretation of database multiplication accelerates the dominant matrix operations inside PIR.
- Memory-efficient scheduling removes large intermediate buffers and supports scaling across multiple GPUs.
- Large-scale private database services become feasible under realistic hardware constraints.
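The tensor-core point above can be made concrete with a small sketch. The source only says that database multiplication is interpreted as a mixed-integer-type GEMM; it does not say how. One common way to run a product of narrow and wide integers on narrow integer units is limb decomposition, and the toy below illustrates that idea only. The 32-bit modulus, the 8-bit limb width, and every function name are assumptions made for illustration, not details taken from the paper.

```python
"""Toy model of database multiplication as a mixed-integer-type GEMM.

Assumption: database entries fit in 8 bits and query entries are 32-bit
values modulo q = 2**32. Neither choice is stated in the source.
"""

Q_BITS = 32                      # assumed ciphertext modulus q = 2**32
LIMB_BITS = 8                    # assumed narrow operand width
NUM_LIMBS = Q_BITS // LIMB_BITS
MASK = (1 << Q_BITS) - 1


def matmul(a, b):
    """Plain integer matrix product; stands in for one narrow-integer GEMM."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]


def split_limbs(matrix):
    """Split each 32-bit entry into NUM_LIMBS unsigned 8-bit limbs, low limb first."""
    return [[[(x >> (LIMB_BITS * limb)) & 0xFF for x in row] for row in matrix]
            for limb in range(NUM_LIMBS)]


def db_multiply_limbed(db, query):
    """Compute db @ query mod 2**32 using only 8-bit by 8-bit products."""
    rows, cols = len(db), len(query[0])
    acc = [[0] * cols for _ in range(rows)]
    for index, limb in enumerate(split_limbs(query)):
        partial = matmul(db, limb)           # narrow GEMM with wide accumulation
        shift = LIMB_BITS * index            # weight of this limb in the result
        for i in range(rows):
            for j in range(cols):
                acc[i][j] = (acc[i][j] + (partial[i][j] << shift)) & MASK
    return acc


def db_multiply_reference(db, query):
    """Direct wide-integer product, used only to check the limbed version."""
    return [[x & MASK for x in row] for row in matmul(db, query)]
```

The limbed and reference versions agree because the wide product is linear in the limbs: summing the per-limb products, each shifted by its limb weight, reproduces the full product before the reduction modulo 2**32. A real GPU kernel would also have to handle accumulator overflow and operand signedness, which this sketch ignores.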
Where Pith is reading between the lines
- The category model could be used to classify and improve future PIR variants that the paper does not yet consider.
- The same hybrid-plus-GPU pattern might transfer to other privacy primitives that also face computation-memory trade-offs.
- If the two-category partition holds, protocol designers can target the remaining gaps inside each category rather than building entirely new schemes.
- Multi-GPU scaling demonstrated here suggests that even larger databases could be handled by adding more GPUs without redesigning the protocols.
Load-bearing premise
That every state-of-the-art PIR protocol fits into exactly one of the two categories and that the categories' limitations are always complementary enough for the hybrids to overcome them.
What would settle it
Measure end-to-end query throughput, communication volume, and memory footprint for a fixed large database on VIPIR versus the best prior single-category PIR system. The central claim holds only if throughput improves by at least an order of magnitude and the communication and memory overheads both drop; it fails if either condition is not met.
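The test described above can be written down as a small pass/fail check. This is a minimal sketch: the `Measurement` fields, the function name, and the default 10x threshold (one reading of "an order of magnitude") are assumptions for illustration, and any numbers fed to it would have to come from real benchmark runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One system's results on a fixed database (hypothetical record layout)."""
    throughput_qps: float        # end-to-end queries per second
    communication_bytes: float   # bytes exchanged per query
    memory_bytes: float          # peak server memory footprint


def central_claim_holds(baseline, candidate, min_speedup=10.0):
    """Return True only if throughput rises by min_speedup and both overheads drop."""
    speedup = candidate.throughput_qps / baseline.throughput_qps
    return (speedup >= min_speedup
            and candidate.communication_bytes < baseline.communication_bytes
            and candidate.memory_bytes < baseline.memory_bytes)
```

Requiring all three conditions at once mirrors the wording of the claim: a large speedup bought with higher communication or memory cost would not count as support.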
Original abstract
While private information retrieval (PIR) enables private database services by fully concealing access patterns, it simultaneously requires high computational throughput, large memory capacity, and substantial memory bandwidth. We introduce VIPIR, a versatile GPU framework that co-designs PIR protocols with GPU acceleration. We develop a unified analytic model showing that state-of-the-art PIR protocols fall into two categories with complementary limitations, and propose two protocols that flexibly combine techniques across these categories, overcoming the limitations of both classes. These protocols incorporate a GPU-friendly data compression method called expansion-based ring packing (ExpPack), which offers a high degree of parallelism and minimal communication cost. VIPIR applies further optimizations to core operations, including number-theoretic transforms (NTTs) and various matrix-matrix multiplications (GEMMs). Notably, we develop a tensor-core-based execution method for database multiplication by interpreting it as a mixed-integer-type GEMM. We also design memory-efficient scheduling methods that minimize intermediate buffers and enable multi-GPU scaling under memory capacity constraints. Overall, VIPIR achieves orders-of-magnitude higher throughput than prior PIR systems while reducing communication and memory overheads, making large-scale PIR practical.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces VIPIR, a versatile GPU framework for private information retrieval (PIR) that co-designs protocols with GPU acceleration. It develops a unified analytic model claiming that state-of-the-art PIR protocols fall into two categories with complementary limitations, proposes two hybrid protocols that combine techniques across categories, introduces expansion-based ring packing (ExpPack) for GPU-friendly data compression with high parallelism and low communication cost, and applies optimizations including tensor-core-based execution for database multiplication interpreted as mixed-integer GEMM, NTT improvements, and memory-efficient scheduling for multi-GPU scaling under capacity constraints. The central claim is that these yield orders-of-magnitude higher throughput than prior PIR systems while reducing communication and memory overheads, making large-scale PIR practical.
Significance. If the performance claims and model hold, the work could substantially advance practical large-scale PIR deployment by demonstrating effective GPU co-design and hybrid protocol construction. Strengths include the introduction of ExpPack, the tensor-core GEMM mapping, and memory scheduling methods that address real hardware constraints; these engineering contributions could be adopted independently of the model.
major comments (1)
- [unified analytic model section] The central throughput and overhead claims rest on the assertion that every relevant prior PIR protocol partitions cleanly into exactly two categories whose limitations are strictly complementary, allowing the two hybrid protocols to inherit the best properties without new bottlenecks. The manuscript must provide an explicit enumeration or table of all cited SOTA protocols, their category assignments, and verification that no protocol falls outside the partition or exhibits non-complementary limitations; without this, the generality of the hybrids and the orders-of-magnitude improvement claim cannot be assessed.
minor comments (1)
- [abstract] The abstract asserts high throughput and reduced overheads but contains no quantitative results, error bars, baseline comparisons, or validation of the analytic model; the full paper should ensure these appear early and are tied directly to the model and hybrids.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the unified analytic model. We address the major comment below and will revise the manuscript accordingly to strengthen the presentation of the model and its generality.
- Referee: [unified analytic model section] The central throughput and overhead claims rest on the assertion that every relevant prior PIR protocol partitions cleanly into exactly two categories whose limitations are strictly complementary, allowing the two hybrid protocols to inherit the best properties without new bottlenecks. The manuscript must provide an explicit enumeration or table of all cited SOTA protocols, their category assignments, and verification that no protocol falls outside the partition or exhibits non-complementary limitations; without this, the generality of the hybrids and the orders-of-magnitude improvement claim cannot be assessed.
  Authors: We agree that an explicit enumeration strengthens the verifiability of the model. In the revised manuscript, we will add a dedicated table (or subsection) in the unified analytic model section that lists every cited SOTA PIR protocol, assigns it to one of the two categories, and provides a brief justification based on the model's criteria (e.g., query generation vs. response generation bottlenecks). We will also include a short discussion confirming that no cited protocol falls outside the partition and that the limitations remain complementary, with no new bottlenecks introduced by the hybrids. This addition directly addresses the assessment concern while preserving the model's analytic focus.
  Revision: yes
Circularity Check
No circularity: claims rest on new model, hybrid protocols, and GPU mappings without reduction to self-defined inputs
The abstract and provided text introduce a unified analytic model as a developed result that partitions prior protocols, followed by new hybrid protocols, ExpPack compression, NTT/GEMM optimizations, and tensor-core scheduling. No equations, fitted parameters, or self-citations are quoted that would make any performance claim equivalent to its own inputs by construction. The derivation chain is self-contained, with the model and hybrids presented as independent contributions rather than tautological renamings or load-bearing self-references.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: State-of-the-art PIR protocols fall into two categories with complementary limitations.
invented entities (1)
- ExpPack: no independent evidence