Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions

Jingtan Wang, Xiaoqiang Lin, Rui Qiao, Chuan-Sheng Foo, Bryan Kian Hsiang Low · 2023 · arXiv 2402.02318

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

What Makes Good Instruction-Tuning Data? An In-Context Learning Perspective

cs.CL · 2026-04-28 · unverdicted · novelty 6.0

A weighted in-context influence metric selects effective instruction-tuning data, outperforming baselines while showing that harder samples have lower influence.

Cram Less to Fit More: Training Data Pruning Improves Memorization of Facts

cs.CL · 2026-04-09 · conditional · novelty 6.0

Loss-based pruning of training data to limit facts and flatten their frequency distribution enables a 110M-parameter GPT-2 model to memorize 1.3 times more entity facts than standard training, matching a 1.3B-parameter model on the full dataset.

DUET: Optimizing Training Data Mixtures via Feedback from Unseen Evaluation Tasks

cs.LG · 2025-02-01 · unverdicted · novelty 6.0

DUET is a global-to-local method that optimizes LLM training data mixtures via Bayesian optimization guided by influence-based selection and feedback from unseen evaluation tasks, with a regret bound showing convergence to the optimal mixture.

citing papers explorer

Showing 3 of 3 citing papers.

What Makes Good Instruction-Tuning Data? An In-Context Learning Perspective

cs.CL · 2026-04-28 · unverdicted · none · ref 9

Cram Less to Fit More: Training Data Pruning Improves Memorization of Facts

cs.CL · 2026-04-09 · conditional · none · ref 89

DUET: Optimizing Training Data Mixtures via Feedback from Unseen Evaluation Tasks

cs.LG · 2025-02-01 · unverdicted · none · ref 26
