How Faithful Is Trajectory-Based Data Attribution? Error Sources, Remedies, and Practical Guidelines
Pith reviewed 2026-05-20 23:08 UTC · model grok-4.3
The pith
Accounting for AdamW dynamics makes trajectory-based data attribution match ground-truth influence far more closely.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Trajectory-based attribution is unfaithful primarily because existing formulas assume SGD while modern models use AdamW; deriving influence scores that exactly mirror AdamW's update rules removes this dominant error and produces substantially higher agreement with the ground-truth effect of removing each training point.
What carries the argument
AdamW-influence, the trajectory-unrolling formula that replaces the SGD gradient step with AdamW's bias-corrected first- and second-moment updates inside the influence calculation.
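To make the optimizer mismatch concrete, the sketch below contrasts a plain SGD step with one AdamW step (bias-corrected first and second moments, decoupled weight decay, as in Loshchilov and Hutter). It only illustrates the two update rules and is not the paper's AdamW-influence formula. The function names `sgd_step` and `adamw_step`, the default hyperparameters, and the toy inputs are assumptions made for this example.

```python
import math


def sgd_step(theta, grad, lr):
    """Plain SGD update: the rule that SGD-derived influence formulas assume."""
    return [p - lr * g for p, g in zip(theta, grad)]


def adamw_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update. `t` is the 1-based step count.

    Returns the new parameters and the new first/second moment estimates.
    """
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        # Exponential moving averages of the gradient and squared gradient.
        m_i = beta1 * m_i + (1 - beta1) * g
        v_i = beta2 * v_i + (1 - beta2) * g * g
        # Bias correction for the zero-initialized moments.
        m_hat = m_i / (1 - beta1 ** t)
        v_hat = v_i / (1 - beta2 ** t)
        # Adaptive step plus decoupled weight decay.
        p = p - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * p)
        new_theta.append(p)
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


# Toy inputs (hypothetical): two parameters with very different gradient scales.
theta, grad = [1.0, -2.0], [0.5, -4.0]
print(sgd_step(theta, grad, lr=0.1))
print(adamw_step(theta, grad, [0.0, 0.0], [0.0, 0.0], t=1, lr=0.1,
                 weight_decay=0.0)[0])
```

On the first step with zero-initialized moments, each AdamW coordinate moves by roughly the learning rate regardless of gradient scale, while the SGD move is proportional to the gradient. An influence formula that differentiates the SGD rule is therefore differentiating a different update map from the one that produced the trajectory.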
Load-bearing premise
The total attribution error decomposes cleanly into independent configuration, algorithm, and system components, with optimizer mismatch as the largest configuration term.
What would settle it
Leave-one-out retraining on the same AdamW-trained models to obtain exact influence values; if Spearman correlation with AdamW-influence scores does not rise markedly above prior methods, the central claim fails.
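The settling test reduces to a rank correlation between two score lists: estimated influence and leave-one-out ground truth. A minimal sketch of that metric follows, written in plain Python with average ranks for ties. The helper names `average_ranks` and `spearman` and the sample scores are hypothetical; the paper's actual evaluation protocol is not specified here.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical scores for five training points.
estimated = [1.0, 2.0, 3.0, 4.0, 5.0]
ground_truth = [2.0, 1.0, 4.0, 3.0, 5.0]
print(spearman(estimated, ground_truth))
```

Because only ranks enter the statistic, an estimator can score well even if its influence magnitudes are miscalibrated, which is worth keeping in mind when reading the reported gains.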
Original abstract
Trajectory-based data attribution methods estimate the influence of training samples on model predictions by unrolling the training trajectory. They are widely used in applications such as data selection, data valuation, and model diagnosis, but there is a lack of comprehensive error analysis of these methods, raising concerns about method faithfulness and hindering reliable deployment. In this work, we provide the first systematic analysis of error sources in trajectory-based data attribution, together with concrete remedies to mitigate them and practical guidelines for downstream use. We organize the total error into three categories, config-level, algorithm-level, and system-level. We make three contributions. First, we identify optimizer mismatch as the dominant config-level error: existing methods derive their attribution under the assumption of SGD, even for models trained with the modern de facto optimizer AdamW. We propose AdamW-influence to fully account for AdamW's optimization dynamics, yielding improvements from 10% to over 300% in Spearman correlation between estimated and ground-truth influence across four settings spanning MLP, CNN, GPT-2, and Llama 3.2-1B. Second, we isolate the remaining algorithm-level error arising from the first-order Taylor approximation, identify the learning rate and trajectory length as factors governing the error magnitude, and derive a closed-form error proxy that can be evaluated along the original trajectory without retraining. Third, we translate these insights into practical guidelines for data selection by unifying offline and online strategies under a K-step look-ahead framework. Under this framework, online selection with a short horizon often matches or exceeds offline, and the optimal horizon can be tuned jointly with the learning rate. Together, these results turn the framework into an actionable selection recipe for practitioners.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper provides the first systematic analysis of error sources in trajectory-based data attribution methods, organizing total error into config-level (primarily optimizer mismatch between SGD assumptions and AdamW training), algorithm-level (first-order Taylor approximation), and system-level categories. It proposes AdamW-influence to correct for AdamW dynamics, yielding Spearman correlation improvements from 10% to over 300% across MLP, CNN, GPT-2, and Llama 3.2-1B settings; derives a closed-form error proxy for the Taylor approximation governed by learning rate and trajectory length; and unifies offline/online data selection under a K-step look-ahead framework with practical guidelines.
Significance. If the central claims hold, this work meaningfully advances the reliability of trajectory-based attribution for data selection, valuation, and diagnosis by quantifying and mitigating dominant error sources in modern training regimes. The empirical gains with AdamW-influence, the retraining-free error proxy, and the K-step unification provide both theoretical insight and actionable recipes; the cross-model evaluation spanning small MLPs to 1B-scale LLMs strengthens generalizability.
major comments (2)
- [§3] §3 (error decomposition): The organization of total error into independent config-level, algorithm-level, and system-level components treats these as separable so that AdamW-influence can be applied in isolation, but the training trajectory is itself generated by AdamW; any first-order Taylor proxy derived under SGD dynamics is therefore evaluated on a different path, creating potential coupling that is not analyzed when proposing independent remedies.
- [§4.2] §4.2 (closed-form error proxy): The claim that the proxy can be evaluated along the original trajectory without retraining relies on the first-order approximation remaining valid after the AdamW correction; if the optimizer change alters higher-order terms, the proxy's accuracy and the identified governing factors (learning rate, trajectory length) require explicit validation on AdamW trajectories.
minor comments (2)
- [§5] Experimental details on how ground-truth influence is computed (e.g., exact leave-one-out or retraining protocol) are referenced but not fully specified in the main text, making it difficult to assess whether the reported Spearman gains are robust to alternative ground-truth definitions.
- [Figures 3-5] Figure captions and axis labels in the correlation plots should explicitly state the number of runs and seeds used to compute the reported improvements, to clarify statistical reliability.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our error decomposition and the closed-form proxy. We address each major comment below and outline the changes we will make.
- Referee: [§3] §3 (error decomposition): The organization of total error into independent config-level, algorithm-level, and system-level components treats these as separable so that AdamW-influence can be applied in isolation, but the training trajectory is itself generated by AdamW; any first-order Taylor proxy derived under SGD dynamics is therefore evaluated on a different path, creating potential coupling that is not analyzed when proposing independent remedies.
Authors: We agree that the components are coupled through the shared AdamW-generated trajectory and that a fully independent treatment is an approximation. Our analysis isolates the dominant config-level mismatch (optimizer mismatch), which AdamW-influence corrects directly on the observed trajectory; the algorithm-level Taylor proxy is then applied to the corrected estimates. While we did not derive a joint higher-order expansion of the coupled errors, the large empirical gains (10% to >300% Spearman correlation) indicate that addressing the primary mismatch yields reliable improvements in practice. We will add an explicit discussion of this coupling and its limitations in the revised manuscript.
Revision: partial
- Referee: [§4.2] §4.2 (closed-form error proxy): The claim that the proxy can be evaluated along the original trajectory without retraining relies on the first-order approximation remaining valid after the AdamW correction; if the optimizer change alters higher-order terms, the proxy's accuracy and the identified governing factors (learning rate, trajectory length) require explicit validation on AdamW trajectories.
Authors: The error proxy is obtained from a first-order Taylor expansion whose leading terms depend on the learning rate and trajectory length; these quantities are directly observable along the original (AdamW) trajectory. Because AdamW-influence already aligns the influence function with the actual optimizer, the proxy is meant to be evaluated after this correction. To confirm that higher-order terms do not materially affect the proxy's accuracy or the identified governing factors, we will add explicit validation experiments that compare proxy predictions against measured errors on AdamW-trained trajectories in the revision.
Revision: yes
Circularity Check
Derivation chain is self-contained; no reduction to inputs by construction
The paper organizes error into config-, algorithm-, and system-level categories as an analytical framework, proposes AdamW-influence to correct optimizer mismatch, and derives a closed-form proxy for first-order Taylor error that is evaluated along the existing trajectory without retraining. These steps are validated via external Spearman correlations to ground-truth influence on held-out models (MLP through Llama 3.2-1B), not by fitting parameters that are then renamed as predictions. Self-citations to prior trajectory-based attribution work supply background but are not invoked as uniqueness theorems or load-bearing premises that force the new results; the K-step look-ahead unification follows directly from the identified factors (learning rate, horizon) and remains testable independently. No equation or claim reduces to its own inputs by definition or statistical necessity.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Training dynamics can be approximated by unrolling the optimization trajectory under the chosen optimizer.
- domain assumption: The first-order Taylor approximation error is governed primarily by learning rate and trajectory length.
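The second assumption can be probed on a toy problem. The sketch below uses full-batch gradient descent on one-dimensional quadratic losses (not AdamW, and not the paper's closed-form proxy) and compares the exact effect of removing one sample with a first-order estimate obtained by unrolling the sensitivity of the parameter to that sample's weight. All names (`train`, `first_order_removal_estimate`, `removal_error`) and the data are hypothetical.

```python
def train(a, b, weights, lr, steps, theta0=0.0):
    """Gradient descent on sum_i w_i * 0.5 * a_i * (theta - b_i) ** 2."""
    theta = theta0
    for _ in range(steps):
        grad = sum(w * ai * (theta - bi) for w, ai, bi in zip(weights, a, b))
        theta -= lr * grad
    return theta


def first_order_removal_estimate(a, b, j, lr, steps, theta0=0.0):
    """Estimate theta_T without sample j from the full-data trajectory.

    `delta` tracks d theta_t / d w_j at w_j = 1; removing the sample is
    approximated by a first-order Taylor step from w_j = 1 to w_j = 0.
    """
    theta, delta = theta0, 0.0
    hessian = sum(a)  # second derivative of the full-data loss
    for _ in range(steps):
        grad = sum(ai * (theta - bi) for ai, bi in zip(a, b))
        # Differentiate the update rule with respect to w_j (uses theta_t).
        delta = (1 - lr * hessian) * delta - lr * a[j] * (theta - b[j])
        theta -= lr * grad
    return theta - delta


def removal_error(a, b, j, lr, steps):
    """Absolute gap between exact retraining and the first-order estimate."""
    weights = [1.0] * len(a)
    weights[j] = 0.0
    exact = train(a, b, weights, lr, steps)
    approx = first_order_removal_estimate(a, b, j, lr, steps)
    return abs(exact - approx)


# Hypothetical data: three samples, remove the one at index 1.
a, b = [1.0, 2.0, 3.0], [1.0, -1.0, 2.0]
for lr in (0.01, 0.1):
    for steps in (1, 2):
        print(lr, steps, removal_error(a, b, 1, lr, steps))
```

In this toy setting the estimate is exact after one step, and after two steps the gap scales with the square of the learning rate. Whether the same scaling carries over to AdamW trajectories is exactly what the referee asks the authors to validate.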
Reference graph
Works this paper leans on
-
[1]
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929,
-
[2]
Association for Computational Linguistics. doi: 10.18653/v1/D19-5409. URL https://www.aclweb.org/anthology/D19-5409. Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, et al. The llama 3 herd of models. arXiv preprint arXiv:2407.21783,
-
[3]
Roger Grosse, Juhan Bae, Cem Anil, Nelson Elhage, Alex Tamkin, Amirhossein Tajdini, Benoit Steiner, Dustin Li, Esin Durmus, Ethan Perez, et al. Studying large language model generalization with influence functions. arXiv preprint arXiv:2308.03296,
-
[4]
Kelvin Guu, Albert Webson, Ellie Pavlick, Lucas Dixon, Ian Tenney, and Tolga Bolukbasi. Simfluence: Modeling the influence of individual training examples by simulating training runs. arXiv preprint arXiv:2303.08114,
-
[5]
Andrew Ilyas and Logan Engstrom
URL https://openreview.net/forum?id=sYK4yPDuT1. Andrew Ilyas and Logan Engstrom. Magic: Near-optimal data attribution for deep learning. arXiv preprint arXiv:2504.16430,
-
[6]
Decoupled Weight Decay Regularization
Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101,
-
[7]
Accumulative sgd influence estimation for data attribution
Yunxiao Shi, Shuo Yang, Yixin Su, Rui Zhang, and Min Xu. Accumulative sgd influence estimation for data attribution. arXiv preprint arXiv:2510.26185,
-
[8]
Shichang Zhang, Hongzhe Du, Jiaqi W Ma, and Himabindu Lakkaraju
URL https://openreview.net/forum?id=6gzPSMUAz2. Shichang Zhang, Hongzhe Du, Jiaqi W Ma, and Himabindu Lakkaraju. Who gets credit or blame? attributing accountability in modern ai systems. arXiv preprint arXiv:2506.00175,
-
[9]
diag($g_t$). (11b) $M_t$ collects the Hessian-free terms (momentum decay and weight decay), while $R_t$ collects the Hessian-mediated coupling between optimizer states. How Faithful Is Trajectory-Based Data Attribution? Error Sources, Remedies, and Practical Guidelines Deng et al. B.4 Backward recurrence for the summary matrix Analogous to SGD-influence, we ...
-
[10]
We also consider additional settings presented in Appendix E
3× D Experiment details D.1 Experimental settings for fidelity evaluation In section 3.2, we consider four experiment settings to evaluate the attribution fidelity of AdamW-influence. We also consider additional settings presented in Appendix E. • MNIST+MLP. We train a three-layer multilayer perceptron (MLP) with hidden layer sizes of 16, consisting of two...
-
[11]
on the Alpaca instruction-tuning dataset (Taori et al., 2023), using the first 512 training examples with a maximum sequence length of 512 tokens. We train for 1 epoch and evaluate attribution fidelity on 100 test samples from the SAMSum summarization task (Gliwa et al., 2019). We use the AdamW optimizer and sweep learning rates spanning 5×10^−7, 2×10^−6, ...
-
[12]
Both belong to a single second-order remainder of $f_{t,i}$ and are not independent error sources
The first term originates from the loss nonlinearity ($\nabla^3 \ell$) entering $\ddot{\hat{m}}_{t,i}$ and is present even for non-adaptive optimizers; the second originates from the quadratic dependence of $\hat{v}_t$ on gradient perturbations and is specific to Adam-family optimizers. Both belong to a single second-order remainder of $f_{t,i}$ and are not independent error sources. How Faithfu...