Studying large language model generalization with influence functions. arXiv preprint arXiv:2308.03296
16 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- How Faithful Is Trajectory-Based Data Attribution? Error Sources, Remedies, and Practical Guidelines
The paper decomposes errors in trajectory-based data attribution into config, algorithm, and system levels, proposes AdamW-influence to fix optimizer mismatch, derives an error proxy for Taylor approximation, and unifies data selection under a K-step look-ahead framework.
- Filter-then-Weight: Online Data Selection and Reweighting for LLM Fine-Tuning
Filter-then-Weight is a two-stage optimizer-aware method that filters geometrically useful data candidates and optimizes their coefficients to shape target updates in online LLM fine-tuning.
- On the Accuracy of Newton Step and Influence Function Data Attributions
New analysis without global strong convexity yields tight scaling laws: NS error ~Θ(kd/n²) and NS-IF difference ~Θ((k+d)√(kd)/n²) for well-behaved logistic regressions.
- MIMIC: Multimodal Inversion for Model Interpretation and Conceptualization
MIMIC is a new inversion framework that recovers visual concepts from VLM internal states using joint inversion, feature alignment, and three regularizers.
- Interaction-Aware Influence Functions for Group Attribution
Extends influence functions with a second-order pairwise interaction term that improves group attribution accuracy over simple summation on multiple model-dataset pairs and instruction-tuning selection tasks.
- Correcting Influence: Unboxing LLM Outputs with Orthogonal Latent Spaces
A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions.
- Convergent Evolution: How Different Language Models Learn Similar Number Representations
Diverse language models converge on similar periodic number features with a two-tier hierarchy of Fourier sparsity and geometric separability, acquired via language co-occurrences or multi-token arithmetic.
- Representation-Guided Parameter-Efficient LLM Unlearning
REGLU guides LoRA-based unlearning via representation subspaces and orthogonal regularization to outperform prior methods on forget-retain trade-off in LLM benchmarks.
- Sketching the Readout of Large Language Models for Scalable Data Attribution and Valuation
RISE applies CountSketch to dual lexical and semantic channels derived from output-layer gradient outer products, cutting data attribution storage by up to 112x and enabling retrospective and prospective influence analysis on LLMs up to 32B parameters.
- A Human-Centric Framework for Data Attribution in Large Language Models
Introduces a parameter-driven framework for data attribution in LLMs that enables negotiation among creators, users, and intermediaries to meet stakeholder goals within the data economy.
- Efficient Estimation of Kernel Surrogate Models for Task Attribution
Kernel surrogate models with first-order gradient approximation achieve 25% higher correlation to leave-one-out ground truth for task attribution and 40% better downstream data selection than linear surrogates.
- Feature Identification via the Empirical NTK
Eigenanalysis of the empirical NTK surfaces feature directions that align with Fourier features in modular addition networks and grammatical features in Gemma-3-270M, outperforming PCA baselines on activations.
- DMin: Scalable Training Data Influence Estimation for Diffusion Models
DMin uses gradient compression to scalably estimate training data influence in billion-parameter diffusion models.
- SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation
SalUn uses gradient-based weight saliency to achieve effective machine unlearning of data, classes, or concepts in image classification and generation, narrowing the gap to exact retraining.
- Mechanistic Anomaly Detection via Functional Attribution
- Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment