Forty-first International Conference on Machine Learning , year=

Scaling rectified flow transformers for high-resolution image synthesis , author=

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

browse 6 citing papers

representative citing papers

cs.LG · 2026-04-23 · unverdicted · novelty 8.0

Quotient-space diffusion models generate correct symmetric distributions by removing redundancy on the quotient space, simplifying learning and improving results on small molecules and proteins under SE(3) symmetry.

AnyBand-Diff: A Unified Remote Sensing Image Generation and Band Repair Framework with Spectral Priors

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

AnyBand-Diff is a spectral-prior-guided diffusion model that unifies remote sensing image generation and band repair while maintaining radiometric fidelity through physics-guided sampling and multi-scale losses.

ElasticDiT: Efficient Diffusion Transformers via Elastic Architecture and Sparse Attention for High-Resolution Image Generation on Mobile Devices

cs.CV · 2026-05-15 · unverdicted · novelty 6.0

ElasticDiT introduces an elastic DiT architecture with adjustable spatial compression and block depth plus Shift Sparse Block Attention and a distilled VAE to enable a single model to cover multiple fidelity-latency points for high-resolution image generation on mobile devices.

A unified perspective on fine-tuning and sampling with diffusion and flow models

stat.ML · 2026-04-30 · unverdicted · novelty 6.0

A unified framework for exponential tilting in diffusion and flow models that includes bias-variance decompositions showing finite gradient variance for some methods, norm bounds on adjoint ODEs, and adapted losses with new Crooks and Jarzynski identities.

TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models

cs.CV · 2025-02-10 · unverdicted · novelty 6.0

TripoSG generates high-fidelity 3D meshes from input images via a large-scale rectified flow transformer and hybrid-trained 3D VAE on a custom 2-million-sample dataset, claiming state-of-the-art fidelity and generalization.

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

cs.CV · 2024-08-12 · unverdicted · novelty 6.0

CogVideoX generates coherent 10-second text-to-video outputs at high resolution using a 3D VAE, expert adaptive LayerNorm transformer, progressive training, and a custom data pipeline, claiming state-of-the-art results.

citing papers explorer

Showing 6 of 6 citing papers.

Quotient-Space Diffusion Models cs.LG · 2026-04-23 · unverdicted · none · ref 67
Quotient-space diffusion models generate correct symmetric distributions by removing redundancy on the quotient space, simplifying learning and improving results on small molecules and proteins under SE(3) symmetry.
AnyBand-Diff: A Unified Remote Sensing Image Generation and Band Repair Framework with Spectral Priors cs.CV · 2026-05-14 · unverdicted · none · ref 105
AnyBand-Diff is a spectral-prior-guided diffusion model that unifies remote sensing image generation and band repair while maintaining radiometric fidelity through physics-guided sampling and multi-scale losses.
ElasticDiT: Efficient Diffusion Transformers via Elastic Architecture and Sparse Attention for High-Resolution Image Generation on Mobile Devices cs.CV · 2026-05-15 · unverdicted · none · ref 30
ElasticDiT introduces an elastic DiT architecture with adjustable spatial compression and block depth plus Shift Sparse Block Attention and a distilled VAE to enable a single model to cover multiple fidelity-latency points for high-resolution image generation on mobile devices.
A unified perspective on fine-tuning and sampling with diffusion and flow models stat.ML · 2026-04-30 · unverdicted · none · ref 48
A unified framework for exponential tilting in diffusion and flow models that includes bias-variance decompositions showing finite gradient variance for some methods, norm bounds on adjoint ODEs, and adapted losses with new Crooks and Jarzynski identities.
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models cs.CV · 2025-02-10 · unverdicted · none · ref 110
TripoSG generates high-fidelity 3D meshes from input images via a large-scale rectified flow transformer and hybrid-trained 3D VAE on a custom 2-million-sample dataset, claiming state-of-the-art fidelity and generalization.
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer cs.CV · 2024-08-12 · unverdicted · none · ref 7
CogVideoX generates coherent 10-second text-to-video outputs at high resolution using a 3D VAE, expert adaptive LayerNorm transformer, progressive training, and a custom data pipeline, claiming state-of-the-art results.

Forty-first International Conference on Machine Learning , year=

fields

years

verdicts

representative citing papers

citing papers explorer