Title resolution pending

Scalable Diffusion Models with Transformers · 2023

6 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Diffusion Domain Expansion: Learning to Coordinate Pre-trained Diffusion Models

cs.LG · 2026-05-22 · unverdicted · novelty 6.0

DDE introduces a compact coordinator network that combines denoised outputs from pre-trained diffusion models to enable generation in larger domains and complex conditioning settings.

Scaling Categorical Flow Maps

cs.LG · 2026-05-08 · unverdicted · novelty 5.0

Categorical flow matching models scale to 1.7B parameters on 2.1T tokens, enabling 4-step text generation with competitive quality and benchmark performance.

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

cs.CV · 2025-02-14 · unverdicted · novelty 4.0

Step-Video-T2V describes a 30B-parameter text-to-video model with custom Video-VAE, 3D DiT, flow matching, and Video-DPO that claims state-of-the-art results on a new internal benchmark.

HL-OutPaint: Coarse-to-Fine Video Outpainting for High-Resolution Long-Range Videos

cs.CV · 2026-05-17

Diagnosing and Correcting Concept Omission in Multimodal Diffusion Transformers

cs.CV · 2026-05-14

Spherical Flows for Sampling Categorical Data

stat.ML · 2026-05-07 · 2 refs

citing papers explorer

Showing 6 of 6 citing papers.

Diffusion Domain Expansion: Learning to Coordinate Pre-trained Diffusion Models

cs.LG · 2026-05-22 · unverdicted · none · ref 61

DDE introduces a compact coordinator network that combines denoised outputs from pre-trained diffusion models to enable generation in larger domains and complex conditioning settings.

Scaling Categorical Flow Maps

cs.LG · 2026-05-08 · unverdicted · none · ref 27

Categorical flow matching models scale to 1.7B parameters on 2.1T tokens, enabling 4-step text generation with competitive quality and benchmark performance.

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

cs.CV · 2025-02-14 · unverdicted · none · ref 288

Step-Video-T2V describes a 30B-parameter text-to-video model with custom Video-VAE, 3D DiT, flow matching, and Video-DPO that claims state-of-the-art results on a new internal benchmark.

HL-OutPaint: Coarse-to-Fine Video Outpainting for High-Resolution Long-Range Videos

cs.CV · 2026-05-17 · unreviewed · ref 32

Diagnosing and Correcting Concept Omission in Multimodal Diffusion Transformers

cs.CV · 2026-05-14 · unreviewed · ref 8

Spherical Flows for Sampling Categorical Data

stat.ML · 2026-05-07 · unreviewed · ref 62 · 2 links
