Reasoning with latent tokens in diffusion language models. arXiv preprint arXiv:2602.03769, 2026

Andre He, Sean Welleck, Daniel Fried · 2026 · arXiv 2602.03769

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

AdaState: Self-Evolving Anchors for Streaming Video Generation

cs.CV · 2026-05-28 · unverdicted · novelty 7.0

AdaState replaces the static first-frame KV anchor with an evolving hidden latent that the model denoises alongside content, treating time as relative to enable recurrence and richer dynamics in streaming video generation.

Fixed-Point Masked Generative Modeling

cs.LG · 2026-05-29 · unverdicted · novelty 6.0

FP-MGMs with consistency loss and three-state reuse (CoFRe) reduce parameters by up to 38.8% and improve low-budget perplexity and FID versus standard masked generative models on text and images.

Looped Diffusion Language Models

cs.LG · 2026-05-25 · conditional · novelty 6.0

LoopMDM loops early-middle layers in masked diffusion models to match same-size MDM performance with up to 3.3x fewer training FLOPs and outperform on reasoning tasks by up to 8.5 points on GSM8K.

citing papers explorer

AdaState: Self-Evolving Anchors for Streaming Video Generation

cs.CV · 2026-05-28 · unverdicted · none · ref 10
