hub

Less: Selecting influential data for targeted instruction tuning

Mengzhou Xia, Sadhika Malladi, Suchin Gururangan, Sanjeev Arora, Danqi Chen · 2023 · arXiv 2402.04333

14 Pith papers cite this work. Polarity classification is still indexing.

14 Pith papers citing it

read on arXiv browse 14 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 2 method 1

citation-polarity summary

background 2 use method 1

representative citing papers

Selective Contrastive Learning For Gloss Free Sign Language Translation

cs.CL · 2026-04-24 · unverdicted · novelty 7.0

A pair selection strategy based on negative similarity dynamics strengthens contrastive supervision in gloss-free sign language translation by reducing noisy negatives.

Unified Data Selection for LLM Reasoning

cs.CL · 2026-05-21 · unverdicted · novelty 6.0

High-Entropy Sum (HES) selects high-quality reasoning data for LLMs by summing entropy of the top highest-entropy tokens, matching full-dataset performance with top 20% in SFT and outperforming baselines in RFT and RL.

Let the Target Select for Itself: Data Selection via Target-Aligned Paths

cs.LG · 2026-05-10 · unverdicted · novelty 6.0

Target-aligned data selection via normalized endpoint loss drop on a validation-induced reference path achieves competitive performance with reduced computational overhead.

Cram Less to Fit More: Training Data Pruning Improves Memorization of Facts

cs.CL · 2026-04-09 · conditional · novelty 6.0

Loss-based pruning of training data to limit facts and flatten their frequency distribution enables a 110M-parameter GPT-2 model to memorize 1.3 times more entity facts than standard training, matching a 1.3B-parameter model on the full dataset.

LLM-AutoDP: Automatic Data Processing via LLM Agents for Model Fine-tuning

cs.LG · 2026-01-28 · unverdicted · novelty 6.0

LLM agents iteratively generate and optimize data processing strategies for fine-tuning, delivering over 80% win rates versus unprocessed data and 65% versus LLM-based AutoML baselines while cutting search time by up to 10x.

Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models

cs.CL · 2025-08-21 · unverdicted · novelty 6.0

Fin-PRM is a domain-specialized process reward model that supplies binary step-level and trajectory-level supervision signals for financial reasoning in LLMs and outperforms general PRMs on CFLUE and FinQA benchmarks.

DUET: Optimizing Training Data Mixtures via Feedback from Unseen Evaluation Tasks

cs.LG · 2025-02-01 · unverdicted · novelty 6.0

DUET is a global-to-local method that optimizes LLM training data mixtures via Bayesian optimization guided by influence-based selection and feedback from unseen evaluation tasks, with a regret bound showing convergence to the optimal mixture.

Toward Communication-Efficient Space Data Centers: Bottlenecks, Architectures, and New Paradigms

cs.NI · 2026-05-12 · unverdicted · novelty 5.0

Semantic communication in a multi-layer heterogeneous space data center framework can substantially reduce uplink pressure for orbital AI by sending compact representations rather than raw data.

Rigorous Interpretation Is a Form of Evaluation

cs.CY · 2026-05-06 · unverdicted · novelty 5.0

Rigorous interpretability can function as a principled form of model evaluation if its claims are falsifiable, reproducible, and predictive.

Difficulty-Based Preference Data Selection by DPO Implicit Reward Gap

cs.CL · 2025-08-06 · unverdicted · novelty 5.0

Selecting preference pairs whose DPO implicit reward gap is small yields better LLM alignment than random or baseline selection while using only 10% of the data.

Retrieval-Augmented Generation for AI-Generated Content: A Survey

cs.CV · 2024-02-29 · accept · novelty 5.0

A survey classifying RAG foundations for AIGC, summarizing enhancements, cross-modal applications, benchmarks, limitations, and future directions.

An Empirical Study on Influence-Based Pretraining Data Selection for Code Large Language Models

cs.SE · 2026-04-09 · unverdicted · novelty 4.0

Data-influence-score filtering using validation-set loss on downstream coding tasks improves Code-LLM performance, with the most beneficial training data varying significantly across different programming tasks.

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

cs.CV · 2025-02-14 · unverdicted · novelty 4.0

Step-Video-T2V describes a 30B-parameter text-to-video model with custom Video-VAE, 3D DiT, flow matching, and Video-DPO that claims state-of-the-art results on a new internal benchmark.

Data Difficulty and the Generalization--Extrapolation Tradeoff in LLM Fine-Tuning

cs.LG · 2026-05-13

citing papers explorer

Showing 14 of 14 citing papers.

Selective Contrastive Learning For Gloss Free Sign Language Translation cs.CL · 2026-04-24 · unverdicted · none · ref 13
A pair selection strategy based on negative similarity dynamics strengthens contrastive supervision in gloss-free sign language translation by reducing noisy negatives.
Unified Data Selection for LLM Reasoning cs.CL · 2026-05-21 · unverdicted · none · ref 49
High-Entropy Sum (HES) selects high-quality reasoning data for LLMs by summing entropy of the top highest-entropy tokens, matching full-dataset performance with top 20% in SFT and outperforming baselines in RFT and RL.
Let the Target Select for Itself: Data Selection via Target-Aligned Paths cs.LG · 2026-05-10 · unverdicted · none · ref 53
Target-aligned data selection via normalized endpoint loss drop on a validation-induced reference path achieves competitive performance with reduced computational overhead.
Cram Less to Fit More: Training Data Pruning Improves Memorization of Facts cs.CL · 2026-04-09 · conditional · none · ref 92
Loss-based pruning of training data to limit facts and flatten their frequency distribution enables a 110M-parameter GPT-2 model to memorize 1.3 times more entity facts than standard training, matching a 1.3B-parameter model on the full dataset.
LLM-AutoDP: Automatic Data Processing via LLM Agents for Model Fine-tuning cs.LG · 2026-01-28 · unverdicted · none · ref 48
LLM agents iteratively generate and optimize data processing strategies for fine-tuning, delivering over 80% win rates versus unprocessed data and 65% versus LLM-based AutoML baselines while cutting search time by up to 10x.
Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models cs.CL · 2025-08-21 · unverdicted · none · ref 23
Fin-PRM is a domain-specialized process reward model that supplies binary step-level and trajectory-level supervision signals for financial reasoning in LLMs and outperforms general PRMs on CFLUE and FinQA benchmarks.
DUET: Optimizing Training Data Mixtures via Feedback from Unseen Evaluation Tasks cs.LG · 2025-02-01 · unverdicted · none · ref 27
DUET is a global-to-local method that optimizes LLM training data mixtures via Bayesian optimization guided by influence-based selection and feedback from unseen evaluation tasks, with a regret bound showing convergence to the optimal mixture.
Toward Communication-Efficient Space Data Centers: Bottlenecks, Architectures, and New Paradigms cs.NI · 2026-05-12 · unverdicted · none · ref 11
Semantic communication in a multi-layer heterogeneous space data center framework can substantially reduce uplink pressure for orbital AI by sending compact representations rather than raw data.
Rigorous Interpretation Is a Form of Evaluation cs.CY · 2026-05-06 · unverdicted · none · ref 45
Rigorous interpretability can function as a principled form of model evaluation if its claims are falsifiable, reproducible, and predictive.
Difficulty-Based Preference Data Selection by DPO Implicit Reward Gap cs.CL · 2025-08-06 · unverdicted · none · ref 19
Selecting preference pairs whose DPO implicit reward gap is small yields better LLM alignment than random or baseline selection while using only 10% of the data.
Retrieval-Augmented Generation for AI-Generated Content: A Survey cs.CV · 2024-02-29 · accept · none · ref 130
A survey classifying RAG foundations for AIGC, summarizing enhancements, cross-modal applications, benchmarks, limitations, and future directions.
An Empirical Study on Influence-Based Pretraining Data Selection for Code Large Language Models cs.SE · 2026-04-09 · unverdicted · none · ref 40
Data-influence-score filtering using validation-set loss on downstream coding tasks improves Code-LLM performance, with the most beneficial training data varying significantly across different programming tasks.
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model cs.CV · 2025-02-14 · unverdicted · none · ref 223
Step-Video-T2V describes a 30B-parameter text-to-video model with custom Video-VAE, 3D DiT, flow matching, and Video-DPO that claims state-of-the-art results on a new internal benchmark.
Data Difficulty and the Generalization--Extrapolation Tradeoff in LLM Fine-Tuning cs.LG · 2026-05-13 · unreviewed · ref 14

Less: Selecting influential data for targeted instruction tuning

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer