Understanding black-box predictions via influence functions
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Utility-Aware Data Pricing: Token-Level Quality and Empirical Training Gain for LLMs
A dynamic data valuation system for LLMs combines token entropy, influence functions, and proxy-based Shapley estimates to price data by its measured contribution to model performance, outperforming simple count-based methods in experiments across instruction, math, and code tasks.