
Title resolution pending

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

fields

cs.CL 1

years

2021 1

verdicts

UNVERDICTED 1

representative citing papers

Deduplicating Training Data Makes Language Models Better

cs.CL · 2021-07-14 · unverdicted · novelty 6.0

Deduplicating training datasets cuts verbatim memorization by language models roughly tenfold, improves training efficiency, and enables more accurate evaluation by reducing train-test overlap.

citing papers explorer

Showing 1 of 1 citing paper.

  • Deduplicating Training Data Makes Language Models Better cs.CL · 2021-07-14 · unverdicted · none · ref 46
