The LAMBADA dataset: Word prediction requiring a broad discourse context

Paperno, D · 2016

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

NITP: Next Implicit Token Prediction for LLM Pre-training

cs.CL · 2026-05-24 · unverdicted · novelty 6.0

NITP augments standard next-token prediction with implicit semantic prediction in representation space using shallow-layer self-supervision, reporting consistent downstream gains on 0.5B-9B models including 5.7% on MMLU-Pro for a 9B MoE.

Language Generation as Optimal Control: Closed-Loop Diffusion in Latent Control Space

cs.CL · 2026-05-14 · unverdicted · novelty 6.0 · 2 refs

Manta-LM approximates the HJB equation via flow matching in latent control space to realize closed-loop optimal control for language generation.

citing papers explorer

Showing 2 of 2 citing papers after filters.

NITP: Next Implicit Token Prediction for LLM Pre-training · cs.CL · 2026-05-24 · unverdicted · none · ref 36
Language Generation as Optimal Control: Closed-Loop Diffusion in Latent Control Space · cs.CL · 2026-05-14 · unverdicted · none · ref 35 · 2 links
