Privacy-Preserving In-Context Learning with Differentially Private Few-Shot Generation

Andre Manoel; FatemehSadat Mireshghallah; Huseyin A. Inan; Janardhan Kulkarni; Richard Shin; Robert Sim; Sivakanth Gopi; Xinyu Tang; Zinan Lin

arxiv: 2309.11765 · v2 · pith:H5QF55QRnew · submitted 2023-09-21 · 💻 cs.LG · cs.CR

Xinyu Tang, Richard Shin, Huseyin A. Inan, Andre Manoel, Fatemehsadat Mireshghallah, Zinan Lin, Sivakanth Gopi, Janardhan Kulkarni, Robert Sim

classification 💻 cs.LG cs.CR

keywords privacy · private · algorithm · achieve · few-shot · in-context · learning · llms

We study the problem of in-context learning (ICL) with large language models (LLMs) on private datasets. This scenario poses privacy risks, as LLMs may leak or regurgitate the private examples demonstrated in the prompt. We propose a novel algorithm that generates synthetic few-shot demonstrations from the private dataset with formal differential privacy (DP) guarantees, and show empirically that it can achieve effective ICL. We conduct extensive experiments on standard benchmarks and compare our algorithm with non-private ICL and zero-shot solutions. Our results demonstrate that our algorithm can achieve competitive performance with strong privacy levels. These results open up new possibilities for ICL with privacy protection for a broad range of applications.
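The abstract does not describe how the synthetic demonstrations are generated, so the sketch below is not the paper's algorithm. It is a generic illustration, under stated assumptions, of one way a token could be chosen from private data with noise: the private examples are split into disjoint subsets, each subset yields a next-token probability vector (here supplied by hand rather than by an LLM), and Gaussian noise calibrated to the L2 sensitivity of the mean is added before taking the argmax. The names `noisy_next_token`, `per_subset_probs`, and `noise_multiplier` are hypothetical.

```python
import math
import random


def noisy_next_token(per_subset_probs, noise_multiplier, rng):
    """Return a token id chosen from the noisy mean of per-subset distributions.

    per_subset_probs: one probability vector per disjoint subset of private
    examples (assumed input; a real system would obtain these from an LLM).
    noise_multiplier: Gaussian noise scale relative to the L2 sensitivity.
    """
    num_subsets = len(per_subset_probs)
    vocab_size = len(per_subset_probs[0])
    # Two probability vectors differ by at most sqrt(2) in L2 norm, so changing
    # one subset's vector moves the mean by at most sqrt(2) / num_subsets.
    sensitivity = math.sqrt(2.0) / num_subsets
    sigma = noise_multiplier * sensitivity
    noisy_mean = []
    for token in range(vocab_size):
        mean = sum(p[token] for p in per_subset_probs) / num_subsets
        noisy_mean.append(mean + rng.gauss(0.0, sigma))
    # Taking the argmax is post-processing and does not add privacy cost.
    return max(range(vocab_size), key=lambda t: noisy_mean[t])


# Toy usage: two subsets, a three-token vocabulary.
rng = random.Random(0)
toy_probs = [[0.1, 0.7, 0.2], [0.2, 0.6, 0.2]]
print(noisy_next_token(toy_probs, noise_multiplier=1.0, rng=rng))
```

Each generated token spends privacy budget, so the overall (ε, δ) guarantee depends on the noise multiplier and the number of tokens generated. Converting these into a concrete ε requires a privacy accountant, which this sketch omits.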

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

SynBench: A Benchmark for Differentially Private Text Generation
cs.AI 2025-09 conditional novelty 7.0

SynBench benchmarks DP text generators across nine datasets and uses a new MIA to show that public pre-training on portions of private data overestimates synthetic text quality and breaks DP privacy bounds.
Pop Quiz Attack: Black-box Membership Inference Attacks Against Large Language Models
cs.CR 2026-05 unverdicted novelty 6.0

PopQuiz Attack infers LLM training data membership by turning examples into quiz questions and measuring answer accuracy, reaching 0.873 average ROC-AUC across six models and outperforming prior methods by 20.6%.
SnapAudit: Active Auditing of Differentially Private In-Context Learning via Snapshot-Based Simulation
cs.CR 2025-11 conditional novelty 6.0

SnapAudit decomposes DP-ICL into a deterministic snapshot stage and a stochastic noise stage, using bootstrap simulation to achieve 80-200x faster auditing and exposing privacy bound violations in existing Gaussian an...
InvisibleInk: High-Utility and Low-Cost Text Generation with Differential Privacy
cs.LG 2025-06 unverdicted novelty 6.0

InvisibleInk achieves high-utility differentially private long-form LLM text generation at 4-8x the cost of non-private generation by isolating and clipping sensitive logits and sampling from a small superset of top-k...
Agents That Know Too Much: A Data-Centric Survey of Privacy in LLM Agents
cs.CR 2026-06 unverdicted novelty 5.0

A data-centric survey finds that only information-flow control covers compositional and cross-session leakage in LLM agents and that no single benchmark tests an agent across all its data surfaces under one policy.
Are Large Language Models Suitable for Graph Computation? Progress and Prospects
cs.CL 2026-06 unverdicted novelty 4.0

A survey of LLMs for graph computation introduces a role-based taxonomy of executors versus planners and concludes that current models suit simple small-scale tasks but remain unreliable for large-scale exact computation.