RobuQ delivers the first stable DiT image generation at W1.58A2 average bits via Hadamard-based robust activation quantization and layer-wise mixed-precision activations.
High-resolution image synthesis with latent diffusion models
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV 2
representative citing papers
PixArt-α matches commercial text-to-image quality with a diffusion transformer trained in 675 A100 GPU days through decomposed training stages, cross-attention text injection, and vision-language model dense captions.
citing papers explorer
- RobuQ: Pushing DiTs to W1.58A2 via Robust Activation Quantization