Pythia releases 16 identically trained LLMs with full checkpoints and data tools to study training dynamics, scaling, memorization, and bias in language models.
Large-scale differentially private bert
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
unclear 1representative citing papers
Memorization in language models increases log-linearly with model capacity, data duplication count, and prompt context length.
Introduces MM-Privacy dataset and evaluations showing MLLMs leak sensitive data from images in various tasks, highlighting task inconsistency effects.
Shuffled DP-SGD requires σ ≥ 1/√(2 ln M) or κ ≥ (1/√8)(1 - 1/√(4π ln M)) to limit adversarial advantage, preventing strong privacy and high utility simultaneously.
citing papers explorer
-
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
Pythia releases 16 identically trained LLMs with full checkpoints and data tools to study training dynamics, scaling, memorization, and bias in language models.
-
Quantifying Memorization Across Neural Language Models
Memorization in language models increases log-linearly with model capacity, data duplication count, and prompt context length.
-
Unveiling Privacy Risks in Multi-modal Large Language Models: Task-specific Vulnerabilities and Mitigation Challenges
Introduces MM-Privacy dataset and evaluations showing MLLMs leak sensitive data from images in various tasks, highlighting task inconsistency effects.
-
Fundamental Limitations of Favorable Privacy-Utility Guarantees for DP-SGD
Shuffled DP-SGD requires σ ≥ 1/√(2 ln M) or κ ≥ (1/√8)(1 - 1/√(4π ln M)) to limit adversarial advantage, preventing strong privacy and high utility simultaneously.