Autoregressive Image Generation without Vector Quantization
2 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 2
representative citing papers
NEAT achieves state-of-the-art 3D molecular generation on QM9 and GEOM-Drugs via a neighborhood-guided autoregressive set transformer that ensures atom-level permutation invariance and offers a significant speed advantage.
citing papers explorer
- Taming the Entropy Cliff: Variable Codebook Size Quantization for Autoregressive Visual Generation
  Variable codebook sizes that increase along the sequence in visual tokenizers reduce generation FID scores significantly for autoregressive models on ImageNet.
- NEAT: Neighborhood-Guided, Efficient, Autoregressive Set Transformer for 3D Molecular Generation
  NEAT achieves state-of-the-art 3D molecular generation on QM9 and GEOM-Drugs via a neighborhood-guided autoregressive set transformer that ensures atom-level permutation invariance and offers a significant speed advantage.