VIPIR: A Versatile GPU Framework for Integrating Private Information Retrieval Protocols

Charles Gouert; G. Edward Suh; Hyesung Ji; Jean-Luc Watson; Jongmin Kim; Jung Ho Ahn

arxiv: 2606.11536 · v1 · pith:MYJKXOYCnew · submitted 2026-06-10 · 💻 cs.CR


Jongmin Kim, Hyesung Ji, Jean-Luc Watson, Charles Gouert, G. Edward Suh, Jung Ho Ahn

Pith reviewed 2026-06-27 09:49 UTC · model grok-4.3


keywords: private information retrieval · PIR protocols · GPU acceleration · hybrid protocols · ring packing · ExpPack · database privacy · throughput optimization


The pith

VIPIR partitions PIR protocols into two complementary categories and uses GPU hybrids plus ExpPack compression to raise throughput while cutting overheads.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a unified analytic model that places existing PIR protocols into two categories whose computational, memory, and communication bottlenecks offset each other. It then constructs two hybrid protocols that draw techniques from both categories. These hybrids incorporate expansion-based ring packing, a GPU-friendly compression method that keeps communication low while exposing high parallelism. Further GPU optimizations target number-theoretic transforms, matrix multiplications via tensor cores, and memory scheduling for multi-GPU use. The outcome is a framework that supports much larger private database services than prior systems allowed.

Core claim

VIPIR establishes that state-of-the-art PIR protocols divide into two categories whose limitations in throughput, memory, and bandwidth are complementary. Two new hybrid protocols combine elements from both categories, augmented by expansion-based ring packing for data compression and tensor-core execution for database multiplication, yielding orders-of-magnitude throughput gains and lower overheads that make large-scale PIR practical.

What carries the argument

The unified analytic model that partitions PIR protocols into two categories with complementary limits, together with expansion-based ring packing (ExpPack) that enables parallel compression with minimal communication.

If this is right

Hybrid protocols can be assembled on demand from the two categories to balance computation against memory and communication.
ExpPack compression delivers high parallelism and low communication when mapped to GPU execution.
Tensor-core reinterpretation of database multiplication accelerates the dominant matrix operations inside PIR.
Memory-efficient scheduling removes large intermediate buffers and supports scaling across multiple GPUs.
Large-scale private database services become feasible under realistic hardware constraints.
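The scheduling and multi-GPU points above can be pictured with a small NumPy sketch. It is an illustration under stated assumptions only: the function name `sharded_tiled_matvec`, the simulated devices, the tile width, and the modulus 2^32 are hypothetical choices, and the paper's actual scheduling methods are not described on this page.

```python
import numpy as np

def sharded_tiled_matvec(db, query, num_devices=2, tile_cols=4):
    """Compute db @ query mod 2^32 without materializing a widened copy of db.

    Rows are split across hypothetical devices (simulated here by a loop),
    and each device streams narrow column tiles into a small accumulator.
    """
    rows, cols = db.shape
    out = np.zeros(rows, dtype=np.uint32)
    for shard in np.array_split(np.arange(rows), num_devices):
        acc = np.zeros(len(shard), dtype=np.uint32)      # per-device output buffer
        for start in range(0, cols, tile_cols):
            # Only one tile is widened from uint8 to uint32 at a time.
            tile = db[shard, start:start + tile_cols].astype(np.uint32)
            acc += tile @ query[start:start + tile_cols]  # uint32 wraps mod 2^32
        out[shard] = acc
    return out

rng = np.random.default_rng(0)
db = rng.integers(0, 256, size=(10, 16), dtype=np.uint8)
query = rng.integers(0, 1 << 32, size=16, dtype=np.uint32)
result = sharded_tiled_matvec(db, query, num_devices=3, tile_cols=5)
```

Row sharding needs no cross-device reduction because each device owns a disjoint slice of the output; a column-sharded layout would instead require summing partial results across devices.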

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The category model could be used to classify and improve future PIR variants that the paper does not yet consider.
The same hybrid-plus-GPU pattern might transfer to other privacy primitives that also face computation-memory trade-offs.
If the two-category partition holds, protocol designers can target the remaining gaps inside each category rather than building entirely new schemes.
Multi-GPU scaling demonstrated here suggests that even larger databases could be handled by adding more GPUs without redesigning the protocols.

Load-bearing premise

That every state-of-the-art PIR protocol fits into exactly one of the two categories and that the categories' limitations are always complementary enough for the hybrids to overcome them.

What would settle it

Measure end-to-end query throughput, communication volume, and memory footprint for a fixed large database on VIPIR and on the best prior single-category PIR system. The central claim fails if throughput does not improve by at least an order of magnitude, or if communication and memory overheads do not drop.

Figures

Figures reproduced from arXiv: 2606.11536 by Charles Gouert, G. Edward Suh, Hyesung Ji, Jean-Luc Watson, Jongmin Kim, Jung Ho Ahn.

**Figure 2.** (a) Server-side $B$-batch execution of SimplePIR [31] and (b) ring packing based on Tiptoe [30]. Each cell represents an INT8/INT32 element. Red-bordered cells mark client-exchanged data, where dotted cells are not exchanged under packing.

Poly-HE encrypts an $N$-coefficient integer polynomial (plaintext) into two or more polynomials in the ring $R_q = \mathbb{Z}_q[X]/(X^N + 1)$ (ciphertext). Specifically, scalar-HE re…

**Figure 3.** Our unified PIR model. Each arrow, annotated with …

**Figure 4.** A black-box view of expansion-based ring packing

**Figure 5.** Two-GPU online computation flow for SimplePIR.

**Figure 6.** Queries per second (QPS) and peak DRAM memory usage for different …

**Figure 8.** Latency breakdown and online communication as …

Original abstract

While private information retrieval (PIR) enables private database services by fully concealing access patterns, it simultaneously requires high computational throughput, large memory capacity, and substantial memory bandwidth. We introduce VIPIR, a versatile GPU framework that co-designs PIR protocols with GPU acceleration. We develop a unified analytic model showing that state-of-the-art PIR protocols fall into two categories with complementary limitations, and propose two protocols that flexibly combine techniques across these categories, overcoming the limitations of both classes. These protocols incorporate a GPU-friendly data compression method called expansion-based ring packing (ExpPack), which offers a high degree of parallelism and minimal communication cost. VIPIR applies further optimizations to core operations, including number-theoretic transforms (NTTs) and various matrix-matrix multiplications (GEMMs). Notably, we develop a tensor-core-based execution method for database multiplication by interpreting it as a mixed-integer-type GEMM. We also design memory-efficient scheduling methods that minimize intermediate buffers and enable multi-GPU scaling under memory capacity constraints. Overall, VIPIR achieves orders-of-magnitude higher throughput than prior PIR systems while reducing communication and memory overheads, making large-scale PIR practical.
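To make the role of NTTs concrete, the sketch below multiplies two polynomials in a ring of the form Z_q[X]/(X^N + 1) twice: once with schoolbook negacyclic convolution and once through a transform. It is an illustration under assumptions, not the paper's kernel: the toy parameters N = 8, q = 17, and the 2N-th root of unity psi = 3 are chosen only so the arithmetic is easy to follow, and the transform is written as a naive O(N^2) sum rather than a fast butterfly network.

```python
def negacyclic_schoolbook(a, b, q):
    """Multiply a and b in Z_q[X]/(X^N + 1) directly; X^N wraps to -1."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c

def transform(a, omega, q):
    """Naive number-theoretic transform: evaluate a at the powers of omega."""
    n = len(a)
    return [sum(a[j] * pow(omega, i * j, q) for j in range(n)) % q for i in range(n)]

def negacyclic_ntt_mul(a, b, psi, q):
    """Same product via twist by psi^i, pointwise multiply, and inverse transform."""
    n = len(a)
    omega = psi * psi % q                      # primitive N-th root of unity
    ta = [a[i] * pow(psi, i, q) % q for i in range(n)]
    tb = [b[i] * pow(psi, i, q) % q for i in range(n)]
    fc = [x * y % q for x, y in zip(transform(ta, omega, q), transform(tb, omega, q))]
    tc = transform(fc, pow(omega, -1, q), q)   # inverse transform, up to a factor N
    inv_n, inv_psi = pow(n, -1, q), pow(psi, -1, q)
    return [tc[i] * inv_n * pow(inv_psi, i, q) % q for i in range(n)]

Q, PSI = 17, 3          # 3 has order 16 mod 17, so PSI^8 = -1 (mod 17)
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [8, 7, 6, 5, 4, 3, 2, 1]
direct = negacyclic_schoolbook(a, b, Q)
via_ntt = negacyclic_ntt_mul(a, b, PSI, Q)
```

The transform route replaces the quadratic convolution with pointwise products, which is why fast NTT kernels matter for throughput once N reaches the thousands.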

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

VIPIR gives a GPU framework with two hybrid PIR protocols, ExpPack compression, and tensor-core GEMM mappings, but the orders-of-magnitude gains rest on whether the analytic model cleanly splits every prior protocol into two complementary categories.


VIPIR introduces a GPU framework for PIR that builds two hybrid protocols from a unified analytic model splitting prior work into two categories, plus ExpPack for data compression and a tensor-core method that treats database multiplication as mixed-integer GEMM. It also adds memory scheduling for multi-GPU scaling under capacity limits.

The concrete engineering moves stand out. Interpreting the multiplication step as a GEMM operation lets the work use tensor cores directly, and the scheduling choices aim to cut intermediate buffers. ExpPack is positioned as GPU-parallel with low communication overhead. These are practical steps that address real hardware constraints in PIR deployments.
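One way to picture a mixed-integer-type GEMM is to split the wide operand into 8-bit limbs, so that every partial product is a narrow integer GEMM of the kind integer tensor-core instructions accept. The sketch below checks that identity in NumPy. It is an assumption-laden illustration: the limb decomposition, the function name `limb_gemm_mod_2_32`, and the shapes are hypothetical, and this page does not state how VIPIR actually maps the operation onto tensor cores.

```python
import numpy as np

def limb_gemm_mod_2_32(db_u8, query_u32):
    """uint8 database times uint32 queries, mod 2^32, via four 8-bit-limb GEMMs.

    Each partial product is at most n * 255 * 255, which fits a 32-bit
    accumulator whenever the inner dimension n is below roughly 33,000.
    """
    db = db_u8.astype(np.int64)
    acc = np.zeros((db_u8.shape[0], query_u32.shape[1]), dtype=np.int64)
    for k in range(4):
        # Extract byte k of every 32-bit query element.
        limb = ((query_u32 >> np.uint32(8 * k)) & np.uint32(0xFF)).astype(np.int64)
        partial = db @ limb            # stands in for an 8-bit x 8-bit GEMM
        acc += partial << (8 * k)      # re-weight the limb by 2^(8k)
    return (acc % (1 << 32)).astype(np.uint32)

rng = np.random.default_rng(2)
db = rng.integers(0, 256, size=(8, 64), dtype=np.uint8)
queries = rng.integers(0, 1 << 32, size=(64, 4), dtype=np.uint32)
limbed = limb_gemm_mod_2_32(db, queries)
```

The trade-off in this decomposition is four narrow GEMMs in place of one wide one; whether that pays off depends on how much faster the narrow integer path runs on the target GPU.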

The soft spot is the analytic model itself. It asserts that all state-of-the-art protocols fall into exactly two categories whose limitations are complementary, so the hybrids can take the best of both. If any protocol sits outside those categories or if the hybrids introduce fresh bottlenecks, the generality of the claimed throughput and overhead reductions weakens. The abstract supplies no numbers, baselines, or error bars, so the experiments must show whether the model holds and whether the measured gains are real and reproducible.

This paper targets systems and applied cryptography researchers who want to run large-scale PIR on GPUs. Readers working on privacy-preserving services will find the framework and hardware mappings useful even if they adapt only pieces of it.

The work shows clear engagement with existing PIR literature through the model and ships concrete protocol combinations plus implementation details. It deserves a serious referee so the experimental section can be checked against the model and the performance claims.

Referee Report

1 major / 1 minor

Summary. The manuscript introduces VIPIR, a versatile GPU framework for private information retrieval (PIR) that co-designs protocols with GPU acceleration. It develops a unified analytic model claiming that state-of-the-art PIR protocols fall into two categories with complementary limitations, proposes two hybrid protocols that combine techniques across categories, introduces expansion-based ring packing (ExpPack) for GPU-friendly data compression with high parallelism and low communication cost, and applies optimizations including tensor-core-based execution for database multiplication interpreted as mixed-integer GEMM, NTT improvements, and memory-efficient scheduling for multi-GPU scaling under capacity constraints. The central claim is that these yield orders-of-magnitude higher throughput than prior PIR systems while reducing communication and memory overheads, making large-scale PIR practical.

Significance. If the performance claims and model hold, the work could substantially advance practical large-scale PIR deployment by demonstrating effective GPU co-design and hybrid protocol construction. Strengths include the introduction of ExpPack, the tensor-core GEMM mapping, and memory scheduling methods that address concrete hardware constraints; these are concrete engineering contributions that could be adopted independently of the model.

major comments (1)

[unified analytic model section] The unified analytic model (described in the section introducing the model and the two categories): the central throughput and overhead claims rest on the assertion that every relevant prior PIR protocol partitions cleanly into exactly two categories whose limitations are strictly complementary, allowing the two hybrid protocols to inherit the best properties without new bottlenecks. The manuscript must provide an explicit enumeration or table of all cited SOTA protocols, their category assignments, and verification that no protocol falls outside the partition or exhibits non-complementary limitations; without this, the generality of the hybrids and the orders-of-magnitude improvement claim cannot be assessed.

minor comments (1)

[abstract] The abstract asserts high throughput and reduced overheads but contains no quantitative results, error bars, baseline comparisons, or validation of the analytic model; the full paper should ensure these appear early and are tied directly to the model and hybrids.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on the unified analytic model. We address the major comment below and will revise the manuscript accordingly to strengthen the presentation of the model and its generality.


Referee: [unified analytic model section] The unified analytic model (described in the section introducing the model and the two categories): the central throughput and overhead claims rest on the assertion that every relevant prior PIR protocol partitions cleanly into exactly two categories whose limitations are strictly complementary, allowing the two hybrid protocols to inherit the best properties without new bottlenecks. The manuscript must provide an explicit enumeration or table of all cited SOTA protocols, their category assignments, and verification that no protocol falls outside the partition or exhibits non-complementary limitations; without this, the generality of the hybrids and the orders-of-magnitude improvement claim cannot be assessed.

Authors: We agree that an explicit enumeration strengthens the verifiability of the model. In the revised manuscript, we will add a dedicated table (or subsection) in the unified analytic model section that lists every cited SOTA PIR protocol, assigns it to one of the two categories, and provides a brief justification based on the model's criteria (e.g., query generation vs. response generation bottlenecks). We will also include a short discussion confirming that no cited protocol falls outside the partition and that the limitations remain complementary, with no new bottlenecks introduced by the hybrids. This addition directly addresses the assessment concern while preserving the model's analytic focus.

Revision: yes

Circularity Check

0 steps flagged

No circularity: claims rest on new model, hybrid protocols, and GPU mappings without reduction to self-defined inputs

full rationale

The abstract and provided text introduce a unified analytic model as a developed result that partitions prior protocols, followed by new hybrid protocols, ExpPack compression, NTT/GEMM optimizations, and tensor-core scheduling. No equations, fitted parameters, or self-citations are quoted that would make any performance claim equivalent to its own inputs by construction. The derivation chain is self-contained, with the model and hybrids presented as independent contributions rather than tautological renamings or load-bearing self-references.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Based on the abstract alone, the central claims rest on the analytic model that partitions PIR protocols and on the assumption that the new hybrids overcome the identified limitations; no numerical free parameters or invented physical entities are mentioned.

axioms (1)

domain assumption: State-of-the-art PIR protocols fall into two categories with complementary limitations
This partition is invoked to justify the design of the two new hybrid protocols.

invented entities (1)

ExpPack (no independent evidence)
purpose: GPU-friendly data compression method offering high parallelism and minimal communication cost
New compression technique introduced by the paper.

pith-pipeline@v0.9.1-grok · 5743 in / 1271 out tokens · 17502 ms · 2026-06-27T09:49:34.473205+00:00 · methodology


Reference graph

Works this paper leans on

60 extracted references · 35 canonical work pages

[1]

Martin Albrecht, Melissa Chase, Hao Chen, Jintai Ding, Shafi Goldwasser, Sergey Gorbunov, Shai Halevi, Jeffrey Hoffstein, Kim Laine, Kristin Lauter, Satya Lokam, Daniele Micciancio, Dustin Moody, Travis Morrison, Amit Sahai, and Vinod Vaikuntanathan. 2021. Homomorphic Encryption Standard. In Protecting Privacy through Homomorphic Encryption. Springer, 31–6...

[2]

Martin R. Albrecht, Rachel Player, and Sam Scott. 2015. On the Concrete Hardness of Learning with Errors. Journal of Mathematical Cryptology 9, 3 (2015), 169–203. doi:10.1515/jmc-2015-0016

[3]

Asra Ali, Tancrède Lepoint, Sarvar Patel, Mariana Raykova, Phillipp Schoppmann, Karn Seth, and Kevin Yeo. 2021. Communication–Computation Trade-offs in PIR. In USENIX Security Symposium. https://www.usenix.org/conference/usenixsecurity21/presentation/ali

[4]

Sebastian Angel, Hao Chen, Kim Laine, and Srinath Setty. 2018. PIR with Compressed Queries and Amortized Query Processing. In IEEE Symposium on Security and Privacy (SP). doi:10.1109/SP.2018.00062

[5]

Apple. 2024. Combining Machine Learning and Homomorphic Encryption in the Apple Ecosystem. https://machinelearning.apple.com/research/homomorphic-encryption

[6]

Youngjin Bae, Jung Hee Cheon, Jaehyung Kim, Jai Hyun Park, and Damien Stehlé. 2023. HERMES: Efficient Ring Packing Using MLWE Ciphertexts and Application to Transciphering. In Annual International Cryptology Conference (CRYPTO). doi:10.1007/978-3-031-38551-3_2

[7]

Paul Barrett. 1986. Implementing the Rivest Shamir and Adleman Public Key Encryption Algorithm on a Standard Digital Signal Processor. In Annual International Conference on the Theory and Application of Cryptographic Techniques. doi:10.5555/36664.36688

[8]

Zvika Brakerski. 2012. Fully Homomorphic Encryption without Modulus Switching from Classical GapSVP. In Annual Cryptology Conference (CRYPTO). 868–886. doi:10.1007/978-3-642-32009-5_50

[9]

Zvika Brakerski and Vinod Vaikuntanathan. 2011. Efficient Fully Homomorphic Encryption from (Standard) LWE. In IEEE Annual Symposium on Foundations of Computer Science (FOCS). doi:10.1109/FOCS.2011.12

[10]

Brechy. 2025. Ethereum Privacy: Private Information Retrieval. https://pse.dev/blog/ethereum-privacy-pir

[11]

Alexander Burton, Samir Jordan Menon, and David J. Wu. 2024. Respire: High-Rate PIR for Databases with Small Records. In ACM Conference on Computer and Communications Security. doi:10.1145/3658644.3690328

[12]

Hao Chen, Wei Dai, Miran Kim, and Yongsoo Song. 2021. Efficient Homomorphic Conversion Between (Ring) LWE Ciphertexts. In Applied Cryptography and Network Security (ACNS). doi:10.1007/978-3-030-78372-3_18

[14]

Yue Chen and Ling Ren. 2025. OnionPIRv2: Efficient Single-Server PIR. IACR Cryptology ePrint Archive (2025). https://eprint.iacr.org/2025/1142

[15]

Zehao Chen, Zhaoyan Shen, Qian Wei, Hang Lu, and Lei Ju. 2026. Conflux: A High-Performance Keyword Private Retrieval System for Dynamic Datasets. In 2026 IEEE International Symposium on High Performance Computer Architecture (HPCA). doi:10.1109/HPCA68181.2026.11408617

[16]

Zehao Chen, Honghui You, Qian Wei, Hang Lu, Lei Ju, and Zhaoyan Shen. SmartPIR: A Private Information Retrieval System using Computational Storage Devices. In Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture (MICRO). doi:10.1145/3725843.3756060

[18]

Ilaria Chillotti, Nicolas Gama, Mariya Georgieva, and Malika Izabachène. 2017. Faster Packed Homomorphic Operations and Efficient Circuit Bootstrapping for TFHE. In International Conference on the Theory and Applications of Cryptology and Information Security (Asiacrypt). doi:10.1007/978-3-319-70694-8_14

[19]

Ilaria Chillotti, Nicolas Gama, Mariya Georgieva, and Malika Izabachène. 2020. TFHE: Fast Fully Homomorphic Encryption over the Torus. Journal of Cryptology 33, 1 (2020), 34–91. doi:10.1007/s00145-019-09319-x

[20]

Wonseok Choi, Jongmin Kim, and Jung Ho Ahn. 2026. Cheddar: A Swift Fully Homomorphic Encryption Library Designed for GPU Architectures. In ASPLOS. doi:10.1145/3760250.3762223

[21]

Jack Choquette, Olivier Giroux, and Denis Foley. 2018. Volta: Performance and Programmability. IEEE Micro 38, 2 (2018), 42–52. doi:10.1109/MM.2018.022071134

[22]

Benny Chor, Eyal Kushilevitz, Oded Goldreich, and Madhu Sudan. 1998. Private Information Retrieval. J. ACM 45, 6 (1998), 965–981. doi:10.1145/293347.293350

[23]

Ali Şah Özcan. 2024. PIRonGPU. https://github.com/Alisah-Ozcan/PIRonGPU

[24]

Ali Şah Özcan and Erkay Savaş. 2024. HEonGPU: a GPU-based Fully Homomorphic Encryption Library 1.0. IACR Cryptology ePrint Archive (2024). https://eprint.iacr.org/2024/1543

[25]

Alex Davidson, Gonçalo Pestana, and Sofía Celi. 2023. FrodoPIR: Simple, Scalable, Single-Server Private Information Retrieval. Proceedings on Privacy Enhancing Technologies (2023), 365–383. doi:10.56553/popets-2023-0022

[26]

Leo de Castro, Kevin Lewi, and Edward Suh. 2024. WhisPIR: Stateless Private Information Retrieval with Low Communication. IACR Cryptology ePrint Archive (2024). https://eprint.iacr.org/2024/266

[27]

Junfeng Fan and Frederik Vercauteren. 2012. Somewhat Practical Fully Homomorphic Encryption. IACR Cryptology ePrint Archive 144 (2012). https://eprint.iacr.org/2012/144

[28]

Shengyu Fan, Zhiwei Wang, Weizhi Xu, Rui Hou, Dan Meng, and Mingzhe Zhang. 2023. TensorFHE: Achieving Practical Computation on Encrypted Data Using GPGPU. In HPCA. doi:10.1109/HPCA56546.2023.10071017

[30]

Craig Gentry, Amit Sahai, and Brent Waters. 2013. Homomorphic Encryption from Learning with Errors: Conceptually-Simpler, Asymptotically-Faster, Attribute-Based. In Annual International Cryptology Conference (CRYPTO). doi:10.1007/978-3-642-40041-4_5

[31]

Daniel Günther, Maurice Heymann, Benny Pinkas, and Thomas Schneider. 2022. GPU-accelerated PIR with Client-Independent Preprocessing for Large-Scale Applications. In USENIX Security Symposium. https://www.usenix.org/conference/usenixsecurity22/presentation/gunther

[32]

Alexandra Henzinger, Emma Dauterman, Henry Corrigan-Gibbs, and Nickolai Zeldovich. 2023. Private Web Search with Tiptoe. In Proceedings of the 29th Symposium on Operating Systems Principles. doi:10.1145/3600006.3613134

[33]

Alexandra Henzinger, Matthew M. Hong, Henry Corrigan-Gibbs, Sarah Meiklejohn, and Vinod Vaikuntanathan. 2023. One Server for the Price of Two: Simple and Fast Single-Server Private Information Retrieval. In USENIX Security Symposium. https://www.usenix.org/conference/usenixsecurity23/presentation/henzinger

[34]

Hyesung Ji, Sangpyo Kim, Jaewan Choi, and Jung Ho Ahn. 2024. Accelerating Programmable Bootstrapping Targeting Contemporary GPU Microarchitecture. IEEE Computer Architecture Letters (2024). doi:10.1109/LCA.2024.3418448

[35]

Wonkyung Jung, Sangpyo Kim, Jung Ho Ahn, Jung Hee Cheon, and Younho Lee. 2021. Over 100x Faster Bootstrapping in Fully Homomorphic Encryption through Memory-centric Optimization with GPUs. IACR Transactions on Cryptographic Hardware and Embedded Systems 2021, 4 (2021), 114–148. doi:10.46586/tches.v2021.i4.114-148

[37]

Andrew Kerr, Duane Merrill, Julien Demouth, and John Tran. 2017. CUTLASS: Fast Linear Algebra in CUDA C++. https://developer.nvidia.com/blog/cutlass-linear-algebra-cuda

[38]

Sangpyo Kim, Hyesung Ji, Jongmin Kim, Wonseok Choi, Jaiyoung Park, and Jung Ho Ahn. 2026. IVE: An Accelerator for Single-Server Private Information Retrieval Using Versatile Processing Elements. In 2026 IEEE International Symposium on High Performance Computer Architecture (HPCA). doi:10.1109/HPCA68181.2026.11408461

[39]

E. Kushilevitz and R. Ostrovsky. 1997. Replication is not needed: single database, computationally-private information retrieval. In Proceedings 38th Annual Symposium on Foundations of Computer Science. 364–373. doi:10.1109/SFCS.1997.646125

[40]

Maximilian Lam, Jeff Johnson, Wenjie Xiong, Kiwan Maeng, Udit Gupta, Yang Li, Liangzhen Lai, Ilias Leontiadis, Minsoo Rhu, Hsien-Hsin S. Lee, Vijay Janapa Reddi, Gu-Yeon Wei, David Brooks, and G. Edward Suh. 2024. GPU-based Private Information Retrieval for On-Device Machine Learning Inference. In Proceedings of the 29th ACM International Conference on Ar... doi:10.1145/3617232.3624855

[41]

Adeline Langlois and Damien Stehlé. 2015. Worst-case to average-case reductions for module lattices. Designs, Codes and Cryptography 75, 3 (2015), 565–599. doi:10.1007/s10623-014-9938-4

[42]

Baiyu Li, Daniele Micciancio, Mariana Raykova, and Mark Schultz-Wu. 2024. Hintless Single-Server Private Information Retrieval. In Annual International Cryptology Conference (CRYPTO). 183–217. doi:10.1007/978-3-031-68400-5_6

[43]

Jilan Lin, Ling Liang, Zheng Qu, Ishtiyaque Ahmad, Liu Liu, Fengbin Tu, Trinabh Gupta, Yufei Ding, and Yuan Xie. 2022. INSPIRE: In-storage Private Information Retrieval via Protocol and Architecture Co-design. In Proceedings of the 49th Annual International Symposium on Computer Architecture (ISCA). doi:10.1145/3470496.3527433

[44]

Wen-jie Lu, Zhicong Huang, Cheng Hong, Yiping Ma, and Hunter Qu. 2021. PEGASUS: Bridging Polynomial and Non-polynomial Evaluations in Homomorphic Encryption. In IEEE Symposium on Security and Privacy (SP). doi:10.1109/SP40001.2021.00043

[45]

Ming Luo, Feng-Hao Liu, and Han Wang. 2024. Faster FHE-Based Single-Server Private Information Retrieval. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security (CCS). doi:10.1145/3658644.3690233

[46]

Vadim Lyubashevsky, Chris Peikert, and Oded Regev. 2010. On Ideal Lattices and Learning with Errors over Rings. In Annual International Conference on the Theory and Applications of Cryptographic Techniques (Eurocrypt). doi:10.1007/978-3-642-13190-5_1

[47]

Rasoul Akhavan Mahdavi, Sarvar Patel, Joon Young Seo, and Kevin Yeo. 2025. InsPIRe: Communication-Efficient PIR with Server-side Preprocessing. IACR Cryptology ePrint Archive (2025). https://eprint.iacr.org/2025/1352

[48]

Carlos Aguilar Melchor, Joris Barrier, Laurent Fousse, and Marc-Olivier Killijian. 2016. XPIR: Private information retrieval for everyone. Proceedings on Privacy Enhancing Technologies (2016), 155–174. doi:10.1515/popets-2016-0010

[51]

Samir Jordan Menon and David J. Wu. 2024. YPIR: High-Throughput Single-Server PIR with Silent Preprocessing. In USENIX Security Symposium. https://www.usenix.org/conference/usenixsecurity24/presentation/menon

[52]

Peter L. Montgomery. 1985. Modular Multiplication without Trial Division. Math. Comp. 44, 170 (1985), 519–521. doi:10.1090/S0025-5718-1985-0777282-X

[53]

Muhammad Haris Mughees, Hao Chen, and Ling Ren. 2021. OnionPIR: Response efficient single-server PIR. In ACM Conference on Computer and Communications Security. doi:10.1145/3460120.3485381

[54]

Muhammad Haris Mughees and Ling Ren. 2023. Vectorized Batch Private Information Retrieval. In IEEE Symposium on Security and Privacy (SP). doi:10.1109/SP46215.2023.10179329

[55]

NVIDIA. 2026. CUTLASS. https://github.com/NVIDIA/cutlass

[56]

NVIDIA. 2026. NVIDIA Collective Communications Library (NCCL). https://developer.nvidia.com/nccl

[57]

NVIDIA Corporation. 2026. PTX ISA. https://docs.nvidia.com/cuda/parallel-thread-execution/

[58]

Özgün Özerk, Can Elgezen, Ahmet Can Mert, Erdinç Öztürk, and Erkay Savaş. 2022. Efficient number theoretic transform implementation on GPU for homomorphic encryption. The Journal of Supercomputing 78, 2 (2022), 2840–2872. doi:10.1007/s11227-021-03980-5

[60]

Oded Regev. 2009. On Lattices, Learning with Errors, Random Linear Codes, and Cryptography. J. ACM 56, 6, Article 34 (2009), 40 pages. doi:10.1145/1568318.1568324

[61]

Zihan Wang, Lutan Zhao, Ming Luo, Zhiwei Wang, Haoqi He, Wenzhe Lv, Xuan Ding, Dan Meng, and Rui Hou. 2025. ShiftPIR: An Efficient PIR System with Gravity Shifting from Client to Server. In ACM Conference on Computer and Communications Security. 1143–1157. doi:10.1145/3719027.3765153

[62]

Mingxun Zhou, Andrew Park, Wenting Zheng, and Elaine Shi. 2024. Piano: Extremely Simple, Single-Server PIR with Sublinear Server Computation. In IEEE Symposium on Security and Privacy (SP). doi:10.1109/SP54263.2024.00055

A Cryptographic Background & Symbols

We introduce the cryptographic constructions of LWE-based scalar-HE and RLWE-based poly-HE in this ...

[1] [1]

Martin Albrecht, Melissa Chase, Hao Chen, Jintai Ding, Shafi Goldwasser, Sergey Gorbunov, Shai Halevi, Jeffrey Hoffstein, Kim Laine, Kristin Lauter, Satya Lokam, Daniele Micciancio, Dustin Moody, Travis Morrison, Amit Sahai, and Vinod Vaikuntanathan. 2021. Homomorphic Encryption Standard. InProtecting Privacy through Homomorphic Encryption. Springer, 31–6...

work page doi:10.1007/978-3-030-77287- 2021

[2] [2]

Albrecht, Rachel Player, and Sam Scott

Martin R. Albrecht, Rachel Player, and Sam Scott. 2015. On the Concrete Hardness of Learning with Errors.Journal of Mathematical Cryptology9, 3 (2015), 169–203. doi:10.1515/jmc-2015-0016

work page doi:10.1515/jmc-2015-0016 2015

[3] [3]

Asra Ali, Tancrède Lepoint, Sarvar Patel, Mariana Raykova, Phillipp Schopp- mann, Karn Seth, and Kevin Yeo. 2021. Communication–Computation Trade-offs in PIR. InUSENIX Security Symposium. https://www.usenix.org/conference/ usenixsecurity21/presentation/ali

2021

[4] [4]

Sebastian Angel, Hao Chen, Kim Laine, and Srinath Setty. 2018. PIR with Com- pressed Queries and Amortized Query Processing. InIEEE Symposium on Security and Privacy (SP). doi:10.1109/SP.2018.00062

work page doi:10.1109/sp.2018.00062 2018

[5] [5]

Apple. 2024. Combining Machine Learning and Homomorphic Encryption in the Apple Ecosystem. https://machinelearning.apple.com/research/homomorphic- encryption

2024

[6] [6]

Youngjin Bae, Jung Hee Cheon, Jaehyung Kim, Jai Hyun Park, and Damien Stehlé. 2023. HERMES: Efficient Ring Packing Using MLWE Ciphertexts and Application to Transciphering. InAnnual International Cryptology Conference (CRYPTO). doi:10.1007/978-3-031-38551-3_2

work page doi:10.1007/978-3-031-38551-3_2 2023

[7] [7]

Paul Barrett. 1986. Implementing the Rivest Shamir and Adleman Public Key Encryption Algorithm on a Standard Digital Signal Processor. InAnnual Inter- national Conference on the Theory and Application of Cryptographic Techniques. doi:10.5555/36664.36688

work page doi:10.5555/36664.36688 1986

[8] [8]

Zvika Brakerski. 2012. Fully Homomorphic Encryption without Modulus Switch- ing from Classical GapSVP. InAnnual Cryptology Conference (CRYPTO). 868–886. doi:10.1007/978-3-642-32009-5_50

work page doi:10.1007/978-3-642-32009-5_50 2012

[9] [9]

Zvika Brakerski and Vinod Vaikuntanathan. 2011. Efficient Fully Homomorphic Encryption from (Standard) LWE. InIEEE Annual Symposium on Foundations of Computer Science (FOCS). doi:10.1109/FOCS.2011.12

work page doi:10.1109/focs.2011.12 2011

[10] [10]

Brechy. 2025. Ethereum Privacy: Private Information Retrieval. https://pse.dev/blog/ethereum-privacy-pir

[11] [11]

Alexander Burton, Samir Jordan Menon, and David J. Wu. 2024. Respire: High-Rate PIR for Databases with Small Records. In ACM Conference on Computer and Communications Security. doi:10.1145/3658644.3690328

[12] [12]

Hao Chen, Wei Dai, Miran Kim, and Yongsoo Song. 2021. Efficient Homomorphic Conversion Between (Ring) LWE Ciphertexts. In Applied Cryptography and Network Security (ACNS). doi:10.1007/978-3-030-78372-3_18

[13] [14]

Yue Chen and Ling Ren. 2025. OnionPIRv2: Efficient Single-Server PIR. IACR Cryptology ePrint Archive (2025). https://eprint.iacr.org/2025/1142

[14] [15]

Zehao Chen, Zhaoyan Shen, Qian Wei, Hang Lu, and Lei Ju. 2026. Conflux: A High-Performance Keyword Private Retrieval System for Dynamic Datasets. In 2026 IEEE International Symposium on High Performance Computer Architecture (HPCA). doi:10.1109/HPCA68181.2026.11408617

[15] [16]

Zehao Chen, Honghui You, Qian Wei, Hang Lu, Lei Ju, and Zhaoyan Shen. 2025. SmartPIR: A Private Information Retrieval System using Computational Storage Devices. In Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture (MICRO). doi:10.1145/3725843.3756060

[17] [18]

Ilaria Chillotti, Nicolas Gama, Mariya Georgieva, and Malika Izabachène. 2017. Faster Packed Homomorphic Operations and Efficient Circuit Bootstrapping for TFHE. In International Conference on the Theory and Applications of Cryptology and Information Security (Asiacrypt). doi:10.1007/978-3-319-70694-8_14

[18] [19]

Ilaria Chillotti, Nicolas Gama, Mariya Georgieva, and Malika Izabachène. 2020. TFHE: Fast Fully Homomorphic Encryption over the Torus. Journal of Cryptology 33, 1 (2020), 34–91. doi:10.1007/s00145-019-09319-x

[19] [20]

Wonseok Choi, Jongmin Kim, and Jung Ho Ahn. 2026. Cheddar: A Swift Fully Homomorphic Encryption Library Designed for GPU Architectures. In ASPLOS. doi:10.1145/3760250.3762223

[20] [21]

Jack Choquette, Olivier Giroux, and Denis Foley. 2018. Volta: Performance and Programmability. IEEE Micro 38, 2 (2018), 42–52. doi:10.1109/MM.2018.022071134

[21] [22]

Benny Chor, Eyal Kushilevitz, Oded Goldreich, and Madhu Sudan. 1998. Private Information Retrieval. J. ACM 45, 6 (1998), 965–981. doi:10.1145/293347.293350

[22] [23]

Ali Şah Özcan. 2024. PIRonGPU. https://github.com/Alisah-Ozcan/PIRonGPU

[23] [24]

Ali Şah Özcan and Erkay Savaş. 2024. HEonGPU: a GPU-based Fully Homomorphic Encryption Library 1.0. IACR Cryptology ePrint Archive (2024). https://eprint.iacr.org/2024/1543

[24] [25]

Alex Davidson, Gonçalo Pestana, and Sofía Celi. 2023. FrodoPIR: Simple, Scalable, Single-Server Private Information Retrieval. Proceedings on Privacy Enhancing Technologies (2023), 365–383. doi:10.56553/popets-2023-0022

[25] [26]

Leo de Castro, Kevin Lewi, and Edward Suh. 2024. WhisPIR: Stateless Private Information Retrieval with Low Communication. IACR Cryptology ePrint Archive (2024). https://eprint.iacr.org/2024/266

[26] [27]

Junfeng Fan and Frederik Vercauteren. 2012. Somewhat Practical Fully Homomorphic Encryption. IACR Cryptology ePrint Archive 144 (2012). https://eprint.iacr.org/2012/144

[27] [28]

Shengyu Fan, Zhiwei Wang, Weizhi Xu, Rui Hou, Dan Meng, and Mingzhe Zhang. 2023. TensorFHE: Achieving Practical Computation on Encrypted Data Using GPGPU. In HPCA. doi:10.1109/HPCA56546.2023.10071017

[29] [30]

Craig Gentry, Amit Sahai, and Brent Waters. 2013. Homomorphic Encryption from Learning with Errors: Conceptually-Simpler, Asymptotically-Faster, Attribute-Based. In Annual International Cryptology Conference (CRYPTO). doi:10.1007/978-3-642-40041-4_5

[30] [31]

Daniel Günther, Maurice Heymann, Benny Pinkas, and Thomas Schneider. 2022. GPU-accelerated PIR with Client-Independent Preprocessing for Large-Scale Applications. In USENIX Security Symposium. https://www.usenix.org/conference/usenixsecurity22/presentation/gunther

[31] [32]

Alexandra Henzinger, Emma Dauterman, Henry Corrigan-Gibbs, and Nickolai Zeldovich. 2023. Private Web Search with Tiptoe. In Proceedings of the 29th Symposium on Operating Systems Principles. doi:10.1145/3600006.3613134

[32] [33]

Alexandra Henzinger, Matthew M. Hong, Henry Corrigan-Gibbs, Sarah Meiklejohn, and Vinod Vaikuntanathan. 2023. One Server for the Price of Two: Simple and Fast Single-Server Private Information Retrieval. In USENIX Security Symposium. https://www.usenix.org/conference/usenixsecurity23/presentation/henzinger

[33] [34]

Hyesung Ji, Sangpyo Kim, Jaewan Choi, and Jung Ho Ahn. 2024. Accelerating Programmable Bootstrapping Targeting Contemporary GPU Microarchitecture. IEEE Computer Architecture Letters (2024). doi:10.1109/LCA.2024.3418448

[34] [35]

Wonkyung Jung, Sangpyo Kim, Jung Ho Ahn, Jung Hee Cheon, and Younho Lee. 2021. Over 100x Faster Bootstrapping in Fully Homomorphic Encryption through Memory-centric Optimization with GPUs. IACR Transactions on Cryptographic Hardware and Embedded Systems 2021, 4 (2021), 114–148. doi:10.46586/tches.v2021.i4.114-148

[36] [37]

Andrew Kerr, Duane Merrill, Julien Demouth, and John Tran. 2017. CUTLASS: Fast Linear Algebra in CUDA C++. https://developer.nvidia.com/blog/cutlass-linear-algebra-cuda

[37] [38]

Sangpyo Kim, Hyesung Ji, Jongmin Kim, Wonseok Choi, Jaiyoung Park, and Jung Ho Ahn. 2026. IVE: An Accelerator for Single-Server Private Information Retrieval Using Versatile Processing Elements. In 2026 IEEE International Symposium on High Performance Computer Architecture (HPCA). doi:10.1109/HPCA68181.2026.11408461

[38] [39]

E. Kushilevitz and R. Ostrovsky. 1997. Replication is not needed: single database, computationally-private information retrieval. In Proceedings 38th Annual Symposium on Foundations of Computer Science. 364–373. doi:10.1109/SFCS.1997.646125

[39] [40]

Maximilian Lam, Jeff Johnson, Wenjie Xiong, Kiwan Maeng, Udit Gupta, Yang Li, Liangzhen Lai, Ilias Leontiadis, Minsoo Rhu, Hsien-Hsin S. Lee, Vijay Janapa Reddi, Gu-Yeon Wei, David Brooks, and G. Edward Suh. 2024. GPU-based Private Information Retrieval for On-Device Machine Learning Inference. In Proceedings of the 29th ACM International Conference on Ar... doi:10.1145/3617232.3624855

[40] [41]

Adeline Langlois and Damien Stehlé. 2015. Worst-case to average-case reductions for module lattices. Designs, Codes and Cryptography 75, 3 (2015), 565–599. doi:10.1007/s10623-014-9938-4

[41] [42]

Baiyu Li, Daniele Micciancio, Mariana Raykova, and Mark Schultz-Wu. 2024. Hintless Single-Server Private Information Retrieval. In Annual International Cryptology Conference (CRYPTO). 183–217. doi:10.1007/978-3-031-68400-5_6

[42] [43]

Jilan Lin, Ling Liang, Zheng Qu, Ishtiyaque Ahmad, Liu Liu, Fengbin Tu, Trinabh Gupta, Yufei Ding, and Yuan Xie. 2022. INSPIRE: In-storage Private Information Retrieval via Protocol and Architecture Co-design. In Proceedings of the 49th Annual International Symposium on Computer Architecture (ISCA). doi:10.1145/3470496.3527433

[43] [44]

Wen-jie Lu, Zhicong Huang, Cheng Hong, Yiping Ma, and Hunter Qu. 2021. PEGASUS: Bridging Polynomial and Non-polynomial Evaluations in Homomorphic Encryption. In IEEE Symposium on Security and Privacy (SP). doi:10.1109/SP40001.2021.00043

[44] [45]

Ming Luo, Feng-Hao Liu, and Han Wang. 2024. Faster FHE-Based Single-Server Private Information Retrieval. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security (CCS). doi:10.1145/3658644.3690233

[45] [46]

Vadim Lyubashevsky, Chris Peikert, and Oded Regev. 2010. On Ideal Lattices and Learning with Errors over Rings. In Annual International Conference on the Theory and Applications of Cryptographic Techniques (Eurocrypt). doi:10.1007/978-3-642-13190-5_1

[46] [47]

Rasoul Akhavan Mahdavi, Sarvar Patel, Joon Young Seo, and Kevin Yeo. 2025. InsPIRe: Communication-Efficient PIR with Server-side Preprocessing. IACR Cryptology ePrint Archive (2025). https://eprint.iacr.org/2025/1352

[47] [48]

Carlos Aguilar Melchor, Joris Barrier, Laurent Fousse, and Marc-Olivier Killijian. 2016. XPIR: Private information retrieval for everyone. Proceedings on Privacy Enhancing Technologies (2016), 155–174. doi:10.1515/popets-2016-0010

[49] [51]

Samir Jordan Menon and David J. Wu. 2024. YPIR: High-Throughput Single-Server PIR with Silent Preprocessing. In USENIX Security Symposium. https://www.usenix.org/conference/usenixsecurity24/presentation/menon

[50] [52]

Peter L. Montgomery. 1985. Modular Multiplication without Trial Division. Math. Comp. 44, 170 (1985), 519–521. doi:10.1090/S0025-5718-1985-0777282-X

[51] [53]

Muhammad Haris Mughees, Hao Chen, and Ling Ren. 2021. OnionPIR: Response efficient single-server PIR. In ACM Conference on Computer and Communications Security. doi:10.1145/3460120.3485381

[52] [54]

Muhammad Haris Mughees and Ling Ren. 2023. Vectorized Batch Private Information Retrieval. In IEEE Symposium on Security and Privacy (SP). doi:10.1109/SP46215.2023.10179329

[53] [55]

NVIDIA. 2026. CUTLASS. https://github.com/NVIDIA/cutlass

[54] [56]

NVIDIA. 2026. NVIDIA Collective Communications Library (NCCL). https://developer.nvidia.com/nccl

[55] [57]

NVIDIA Corporation. 2026. PTX ISA. https://docs.nvidia.com/cuda/parallel-thread-execution/

[56] [58]

Özgün Özerk, Can Elgezen, Ahmet Can Mert, Erdinç Öztürk, and Erkay Savaş. 2022. Efficient number theoretic transform implementation on GPU for homomorphic encryption. The Journal of Supercomputing 78, 2 (2022), 2840–2872. doi:10.1007/s11227-021-03980-5

[58] [60]

Oded Regev. 2009. On Lattices, Learning with Errors, Random Linear Codes, and Cryptography. J. ACM 56, 6, Article 34 (2009), 40 pages. doi:10.1145/1568318.1568324

[59] [61]

Zihan Wang, Lutan Zhao, Ming Luo, Zhiwei Wang, Haoqi He, Wenzhe Lv, Xuan Ding, Dan Meng, and Rui Hou. 2025. ShiftPIR: An Efficient PIR System with Gravity Shifting from Client to Server. In ACM Conference on Computer and Communications Security. 1143–1157. doi:10.1145/3719027.3765153

[60] [62]

Mingxun Zhou, Andrew Park, Wenting Zheng, and Elaine Shi. 2024. Piano: Extremely Simple, Single-Server PIR with Sublinear Server Computation. In IEEE Symposium on Security and Privacy (SP). doi:10.1109/SP54263.2024.00055

A Cryptographic Background & Symbols

We introduce the cryptographic constructions of LWE-based scalar-HE and RLWE-based poly-HE in this ...