Spherical Leech quantization for visual tokenization and generation, 2025

Yue Zhao, Hanwen Jiang, Zhenlin Xu, Chutong Yang, Ehsan Adeli, Philipp Krähenbühl · 2025 · arXiv 2512.14697

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Structure over Pixels: Learning Variable-Length Visual Programs

cs.CV · 2026-05-26 · unverdicted · novelty 7.0

STROP learns variable-length discrete visual programs for images by training a length head against frozen DINOv3 features in a four-phase curriculum while bypassing pixel reconstruction.

IDEAL: In-DEpth ALignment Makes A Discrete Representation AutoEncoder

cs.CV · 2026-06-09 · unverdicted · novelty 6.0

IDEAL improves discrete representation autoencoders by jointly aligning quantized tokens with shallow and deep VFM features, reporting 0.61 rFID on ImageNet and 1.89 gFID for autoregressive image generation.

citing papers explorer

Structure over Pixels: Learning Variable-Length Visual Programs
cs.CV · 2026-05-26 · unverdicted · none · ref 26

IDEAL: In-DEpth ALignment Makes A Discrete Representation AutoEncoder
cs.CV · 2026-06-09 · unverdicted · none · ref 70
