A.; Kamath, G.; Kulkarni, J.; Lee, Y
13 Pith papers cite this work.
citation-role summary: background 2
citation-polarity summary: background 2
citing papers explorer
- DP-SelFT: Differentially Private Selective Fine-Tuning for Large Language Models
  DP-SelFT improves the privacy-utility trade-off for LLM fine-tuning by selecting robust layer subsets via DP synthetic data and perturbation-matched evaluation.
- Probing Privacy Leaks in LLM-based Code Generation via Test Generation
  A test-driven pipeline with an auto-constructed privacy feature library detects 2.56 times more confirmed privacy leaks in LLM-based code generation than existing baselines.
- Beyond Individual Mimicry: Constructing Human-Like Social network with Graph-Augmented LLM Agents
  GraphMind equips LLM agents with graph awareness to construct human-like social networks, producing botnets that substantially degrade performance of both text-based and graph-based detectors.
- GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant
  GroupGPT decouples intervention timing from response generation via edge-cloud collaboration for multi-user chats, scoring 4.72/5 on the new MUIR benchmark of 2500 segments while cutting token use by up to 3x and adding privacy sanitization.
- Forget What's Sensitive, Remember What Matters: Token-Level Differential Privacy in Memory Sculpting for Continual Learning
  PeCL applies token-level dynamic differential privacy and privacy-guided memory sculpting to achieve superior privacy-utility balance in continual learning.
- Memory-Efficient Differentially Private Training with Gradient Random Projection
  DP-GRAPE reduces memory in differentially private neural network training by using random Gaussian projections on gradients instead of SVD, achieving comparable privacy-utility tradeoffs to DP-SGD and scaling to 6.7B parameter models.
- ConfusionPrompt: Practical Private Inference for Online Large Language Models
  ConfusionPrompt enables private black-box LLM inference via prompt decomposition and pseudo-prompt mixing, claiming better privacy-utility trade-off than perturbation methods and lower memory use than open-source local models.
- The False Promise of Imitating Proprietary LLMs
  Finetuning open LMs on ChatGPT outputs creates models that mimic style and fool human raters but fail to close the performance gap to proprietary systems on tasks not well-represented in the imitation data.
- FedShield-LLM: A Secure and Scalable Federated Fine-Tuned Large Language Model
  FedShield-LLM integrates pruning and FHE on LoRA parameters to support secure, scalable federated fine-tuning of LLMs such as Llama-2.
- Towards the Anonymization of the Language Modeling
  Authors introduce MLM and CLM specialization methods that avoid memorizing identifiers in sensitive training data while aiming for a privacy-utility tradeoff on medical datasets.
- Industry Practitioners Perspectives on AI Model Quality: Perceptions, Challenges, and Solutions
  Industry AI practitioners view model quality through nine attributes with context-dependent priorities, where data imbalance is a key challenge addressed by strategies like active learning, as confirmed by interviews and a follow-up survey.
- Low-Rank Adaptation Redux for Large Models
  An overview revisits LoRA variants by categorizing advances in architectural design, efficient optimization, and applications while linking them to classical signal processing tools for principled fine-tuning.
- SLM Finetuning for Natural Language to Domain Specific Code Generation in Production
  Fine-tuned small language models outperform larger models in natural language to domain-specific code generation with improved performance, latency, and the ability to adapt to customer-specific scenarios without losing general capabilities.