FocusDiT masks non-critical query tokens before they enter the FFN in DiT models, directing capacity toward complex visual details and reporting improved text-to-image results.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CV (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
FocusDiT: Masking Queries in Diffusion Transformers for Fine-grained Image Generation