Decouples Sphere Encoder into fixed pretrained encoder and spherical latent denoiser, yielding higher quality and faster inference than the joint original on Animal-Faces, Oxford-Flowers and ImageNet-1K.
Image generation with a sphere encoder
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 3verdicts
UNVERDICTED 3roles
background 1polarities
background 1representative citing papers
Projecting VAE latents to a fixed spherical radius and replacing linear interpolation with spherical linear interpolation improves class-conditional ImageNet-256 FID while leaving the diffusion architecture unchanged.
Latent diffusability is quantified by decomposing the MMSE rate along diffusion trajectories into Fisher Information and Fisher Information Rate, with three geometric penalties (dimensional compression, tangential distortion, curvature injection) identified as sources of failure.
citing papers explorer
-
Efficient Image Synthesis with Sphere Latent Encoder
Decouples Sphere Encoder into fixed pretrained encoder and spherical latent denoiser, yielding higher quality and faster inference than the joint original on Animal-Faces, Oxford-Flowers and ImageNet-1K.
-
Aligning Latent Geometry for Spherical Flow Matching in Image Generation
Projecting VAE latents to a fixed spherical radius and replacing linear interpolation with spherical linear interpolation improves class-conditional ImageNet-256 FID while leaving the diffusion architecture unchanged.
-
Understanding Latent Diffusability via Fisher Geometry
Latent diffusability is quantified by decomposing the MMSE rate along diffusion trajectories into Fisher Information and Fisher Information Rate, with three geometric penalties (dimensional compression, tangential distortion, curvature injection) identified as sources of failure.