STREAM applies stochastic Riemannian flow matching on VFM-derived unit hypersphere latents with a novel anisotropic decoder to achieve SOTA reconstruction and generation on breast and colorectal cancer histopathology datasets.
Image tokenizer needs post-training. arXiv preprint arXiv:2509.12474
4 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (4)
years: 2026 (4)
verdicts: UNVERDICTED (4)
roles: background (1)
polarities: unclear (1)
representative citing papers
Projecting VAE latents to a fixed spherical radius and replacing linear interpolation with spherical linear interpolation improves class-conditional ImageNet-256 FID while leaving the diffusion architecture unchanged.
Qwen-Image-VAE-2.0 achieves state-of-the-art high-compression image reconstruction and superior diffusability for diffusion models, with a new text-rich document benchmark.
VibeToken enables autoregressive image generation at arbitrary resolutions using 64 tokens for 1024x1024 images with 3.94 gFID, constant 179G FLOPs, and better efficiency than diffusion or fixed AR baselines.
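The first summary in this list mentions projecting VAE latents to a fixed spherical radius and interpolating with spherical linear interpolation (slerp) instead of linear interpolation. As a rough illustration of that general idea only, here is a minimal NumPy sketch. It is not taken from the citing paper: the function names `project_to_sphere` and `slerp`, the default `radius`, and the `eps` thresholds are all assumptions made for this example.

```python
import numpy as np


def project_to_sphere(z, radius=1.0, eps=1e-8):
    """Rescale each latent vector (last axis) to a fixed L2 norm `radius`.

    `radius` and `eps` are illustrative values, not taken from any paper.
    """
    norm = np.linalg.norm(z, axis=-1, keepdims=True)
    return radius * z / np.maximum(norm, eps)


def slerp(z0, z1, t, eps=1e-7):
    """Spherical linear interpolation between two vectors of equal norm.

    Assumes z0 and z1 already lie on the same sphere, for example after
    `project_to_sphere`. The result keeps the norm of z0.
    """
    r = np.linalg.norm(z0, axis=-1, keepdims=True)
    u0 = z0 / r
    u1 = z1 / np.linalg.norm(z1, axis=-1, keepdims=True)

    # Angle between the two unit vectors, clipped for numerical safety.
    cos_omega = np.clip(np.sum(u0 * u1, axis=-1, keepdims=True), -1.0, 1.0)
    omega = np.arccos(cos_omega)
    sin_omega = np.sin(omega)

    # When sin(omega) is close to zero (nearly parallel or antiparallel
    # vectors), the slerp weights are ill-defined, so fall back to plain
    # linear weights in that case.
    near = sin_omega < eps
    safe = np.where(near, 1.0, sin_omega)
    w0 = np.where(near, 1.0 - t, np.sin((1.0 - t) * omega) / safe)
    w1 = np.where(near, t, np.sin(t * omega) / safe)
    return r * (w0 * u0 + w1 * u1)
```

The point of the contrast: a linear midpoint of two vectors on a sphere lies inside the sphere, so its norm shrinks, whereas the slerp midpoint stays on the sphere. For two orthogonal latents projected to the same radius, `slerp(z0, z1, 0.5)` keeps that radius while `0.5 * (z0 + z1)` does not.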
citing papers explorer
VibeToken: Scaling 1D Image Tokenizers and Autoregressive Models for Dynamic Resolution Generations