A spectral vision transformer achieves equitable or superior performance with fewer parameters than standard ViTs, CNNs, and other models by using spectral projections for tokenization in limited-data medical imaging.
Denoising vision transformer autoencoder with spectral self-regularization
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CV 2years
2026 2verdicts
UNVERDICTED 2roles
background 2polarities
background 2representative citing papers
Prior-Aligned AutoEncoders shape latent manifolds with spatial coherence, local continuity, and global semantics to improve latent diffusion, achieving SOTA gFID 1.03 on ImageNet 256x256 with up to 13x faster convergence.
citing papers explorer
-
Spectral Vision Transformer for Efficient Tokenization with Limited Data
A spectral vision transformer achieves equitable or superior performance with fewer parameters than standard ViTs, CNNs, and other models by using spectral projections for tokenization in limited-data medical imaging.
-
What Matters for Diffusion-Friendly Latent Manifold? Prior-Aligned Autoencoders for Latent Diffusion
Prior-Aligned AutoEncoders shape latent manifolds with spatial coherence, local continuity, and global semantics to improve latent diffusion, achieving SOTA gFID 1.03 on ImageNet 256x256 with up to 13x faster convergence.