Title resolution pending

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

fields

cs.CL (3)

years

2026 (2), 2024 (1)

representative citing papers

Generating Pretraining Tokens from Organic Data for Data-Bound Scaling

cs.CL · 2026-05-18 · unverdicted · novelty 6.0

SynPro uses RL-optimized rephrasing and reformatting of organic data to generate synthetic pretraining tokens that deliver 3.7-5.2x the effective learning of simple repetition and can exceed training on unique data at 1.1B scale.

Scaling Diffusion Language Models via Adaptation from Autoregressive Models

cs.CL · 2024-10-23 · conditional · novelty 6.0

Adapting autoregressive models via continual pre-training yields diffusion language models from 127M to 7B parameters that outperform prior diffusion models and compete with their autoregressive counterparts on language, reasoning, and commonsense benchmarks.

citing papers explorer

  • TokAlign++: Advancing Vocabulary Adaptation via Better Token Alignment cs.CL · 2026-05-13 · unverdicted · none · ref 77

    TokAlign++ learns token alignments between LLM vocabularies from monolingual representations to enable faster adaptation, better text compression, and effective token-level distillation across 15 languages with minimal steps.

  • Generating Pretraining Tokens from Organic Data for Data-Bound Scaling cs.CL · 2026-05-18 · unverdicted · none · ref 49

  • Scaling Diffusion Language Models via Adaptation from Autoregressive Models cs.CL · 2024-10-23 · conditional · none · ref 46
