In visual tokenizers, variable codebook sizes that increase along the sequence significantly reduce generation FID for autoregressive models on ImageNet.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Taming the Entropy Cliff: Variable Codebook Size Quantization for Autoregressive Visual Generation