pith. sign in

Attention sinks in diffusion language models.arXiv preprint arXiv:2510.15731, 2025

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

citation-role summary

background 1 baseline 1 other 1

citation-polarity summary

years

2026 7

verdicts

UNVERDICTED 7

representative citing papers

Registers Matter for Pixel-Space Diffusion Transformers

cs.CV · 2026-05-15 · unverdicted · novelty 6.0

Register tokens enhance pixel-space DiT training and output quality via cleaner high-noise feature maps, and a dual-stream design adds further gains with little overhead.

Attention Sinks in Diffusion Transformers: A Causal Analysis

cs.CV · 2026-05-10 · unverdicted · novelty 6.0 · 2 refs

Suppressing attention sinks in diffusion transformers does not degrade CLIP-T alignment at moderate levels but induces sink-specific perceptual shifts six times larger than equal-budget random masking.

citing papers explorer

Showing 7 of 7 citing papers.