Large language models can be strong differentially private learners
7 Pith papers cite this work.
citing papers explorer
- Defenses at Odds: Measuring and Explaining Defense Conflicts in Large Language Models
  Sequential LLM defense deployment leads to risk exacerbation in 38.9% of cases due to anti-aligned updates in shared critical layers, addressed by conflict-guided layer freezing.
- Less Random, More Private: What is the Optimal Subsampling Scheme for DP-SGD?
  Balanced Iteration Subsampling achieves stronger privacy amplification than Poisson subsampling in DP-SGD by eliminating participation variance while keeping uniform marginal participation.
- DPrivBench: Benchmarking LLMs' Reasoning for Differential Privacy
  DPrivBench is a new benchmark for evaluating LLMs on differential privacy reasoning, with results showing good performance on textbook mechanisms but substantial failures on advanced algorithms.
- DP-OPD: Differentially Private On-Policy Distillation for Language Models
  DP-OPD achieves lower perplexity than DP fine-tuning and synthesis-based private distillation under ε=2.0 by enforcing DP-SGD solely on the student during on-policy training with a frozen teacher.
- DP-SelFT: Differentially Private Selective Fine-Tuning for Large Language Models
  DP-SelFT improves the privacy-utility trade-off for LLM fine-tuning by selecting robust layer subsets via DP synthetic data and perturbation-matched evaluation.
- GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant
  GroupGPT decouples intervention timing from response generation via edge-cloud collaboration for multi-user chats, scoring 4.72/5 on the new MUIR benchmark of 2500 segments while cutting token use by up to 3x and adding privacy sanitization.
- Memory-Efficient Differentially Private Training with Gradient Random Projection
  DP-GRAPE reduces memory in differentially private neural network training by using random Gaussian projections on gradients instead of SVD, achieving comparable privacy-utility tradeoffs to DP-SGD and scaling to 6.7B parameter models.