Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security
12 Pith papers cite this work.
citing papers explorer
- Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
  Pythia releases 16 identically trained LLMs with full checkpoints and data tools to study training dynamics, scaling, memorization, and bias in language models.
- When Are Trade-Off Functions Testable from Finite Samples?
  Trade-off functions between two distributions are finitely testable if and only if their Neyman-Pearson rejection regions are attainable by a VC-class of sets.
- Privacy-Aware Video Anomaly Detection through Orthogonal Subspace Projection
  A new orthogonal projection module for video anomaly detection suppresses facial attributes via weak face-presence signals and cosine alignment while preserving anomaly-relevant features like pose and motion.
- Federated Cross-Client Subgraph Pattern Detection
  A per-step layer-wise embedding exchange in federated GNNs recovers centralized node representations for cross-client subgraph patterns under an extended-subgraph assumption.
- Barriers to Counterfactual Credit Attribution for Autoregressive Models
  CCA does not compose autoregressively and retrofitting requires exponential query complexity under weak optimality.
- On the (In-)Security of the Shuffling Defense in the Transformer Secure Inference
  An attack aligns differently shuffled intermediate activations from secure Transformer inference queries to recover model weights with low error using roughly one dollar of queries.
- When Determinants Are Not Enough: Private Rare Switching
  Replaces determinant growth with a generalized Rayleigh quotient for rare switching in private linear bandits to control worst-direction volume despite non-monotonic design matrices from noise.
- Position: Zeroth-Order Optimization in Deep Learning Is Underexplored, Not Underpowered
  Zeroth-order optimization is underexplored rather than underpowered in deep learning, with limitations stemming from full-space designs that can be addressed via subspace, spectral, and systems-aware approaches.
- Differentially Private Model Merging
  Post-processing via random selection or linear combination of differentially private models allows meeting arbitrary target privacy parameters without additional training.
- FedProxy: Federated Fine-Tuning of LLMs via Proxy SLMs and Heterogeneity-Aware Fusion
  FedProxy replaces weak adapters with a proxy SLM for federated LLM fine-tuning, outperforming prior methods and approaching centralized performance via compression, heterogeneity-aware aggregation, and training-free fusion.
- ALDEN: Boosting Private Data Extraction from Retrieval-Augmented Generation Systems via Active Learning and Distribution Estimation
  ALDEN boosts private data extraction rates from RAG systems by combining active learning for query diversification with dynamic estimation of the underlying knowledge-base topic distribution.
- On the Stability of Spherical Hellinger-Kantorovich Flows and Their Implications for Differential Privacy
  Establishes stability bounds for SHK flows yielding dimension-free controls on log-likelihood ratios and divergences, then applies them to time-dependent Pure-DP and Approximate-DP certificates for exponential-mechanism samplers.