D2 pruning: Message passing for balancing diversity and difficulty in data pruning

Adyasha Maharana, Prateek Yadav, Mohit Bansal · 2023 · arXiv 2310.07931

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Seeing Realism from Simulation: Efficient Video Transfer for Vision-Language-Action Data Augmentation

cs.CV · 2026-05-04 · unverdicted · novelty 6.0

A video transfer pipeline augments simulated VLA data into realistic videos while preserving actions, yielding consistent performance gains on robot benchmarks, including an 8% gain on RoboTwin 2.0.

Select Smarter, Not More: Prompt-Aware Evaluation Scheduling with Submodular Guarantees

cs.AI · 2026-04-13 · unverdicted · novelty 6.0

POES frames prompt evaluation as online adaptive testing and uses a provably submodular objective to pick informative examples, delivering 6.2% higher average accuracy and 35-60% token savings versus naive full-set scoring.

DOSE: Data Selection for Multi-Modal LLMs via Off-the-Shelf Models

cs.CV · 2026-04-18 · unverdicted · novelty 5.0

Off-the-shelf models assess quality and alignment to select diverse multimodal training data, letting models trained on the filtered subset match or exceed full-dataset results on standard benchmarks.

citing papers explorer

Showing 3 of 3 citing papers.

Seeing Realism from Simulation: Efficient Video Transfer for Vision-Language-Action Data Augmentation
cs.CV · 2026-05-04 · unverdicted · none · ref 18
Select Smarter, Not More: Prompt-Aware Evaluation Scheduling with Submodular Guarantees
cs.AI · 2026-04-13 · unverdicted · none · ref 34
DOSE: Data Selection for Multi-Modal LLMs via Off-the-Shelf Models
cs.CV · 2026-04-18 · unverdicted · none · ref 4
