Continuous language diffusion works by entering high-margin decoder basins where frozen T5 embeddings recover 93-96% of native decisions and linear readouts reach 97.9% agreement, implying models should be evaluated as representation-decoder systems.
Diffuse and disperse: Image generation with representation regularization
9 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 9representative citing papers
STREAM applies stochastic Riemannian flow matching on VFM-derived unit hypersphere latents with a novel anisotropic decoder to achieve SOTA reconstruction and generation on breast and colorectal cancer histopathology datasets.
Matching in semantic SSL feature space via Sinkhorn divergence enables effective one-step generation on ImageNet by inducing compact geometry for distribution matching, with training and evaluation features best kept distinct.
RAE v2 reaches gFID 1.06 on ImageNet-256 in 80 epochs by combining multi-layer encoder sums, complementary REPA targets, and free guidance via output reparameterization.
A semantic progress signal from SSL discrepancy slope enables three stage-aware mechanisms that improve training efficiency and performance in audio diffusion models over static baselines.
Continuous adversarial flow models replace MSE in flow matching with adversarial training via a discriminator, improving guidance-free FID on ImageNet from 8.26 to 3.63 for SiT and similar gains for JiT and text-to-image benchmarks.
MPDiT uses a hierarchical multi-patch design in transformers to lower computation in diffusion models by handling coarse global features first then fine local details, plus faster-converging embeddings.
Premier learns user-specific embeddings to modulate text-to-image generation, outperforming prior methods on preference alignment, text consistency, and expert ratings even with limited history.
Med-DisSeg uses a dispersive loss on batch representations plus adaptive multi-scale decoding to achieve state-of-the-art fine-grained segmentation on five medical imaging datasets.
citing papers explorer
-
Continuous Language Diffusion as a Decoder-Interface Problem
Continuous language diffusion works by entering high-margin decoder basins where frozen T5 embeddings recover 93-96% of native decisions and linear readouts reach 97.9% agreement, implying models should be evaluated as representation-decoder systems.