Scaling laws for diffusion transformers. CoRR, abs/2410.08184

Zhengyang Liang, Hao He, Ceyuan Yang, Bo Dai · 2024 · arXiv 2410.08184

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

citation-polarity summary: background 1

representative citing papers

Iterative Inference-time Scaling with Adaptive Frequency Steering for Image Super-Resolution

cs.CV · 2025-12-29 · unverdicted · novelty 7.0

IAFS is a training-free iterative inference-time scaling framework that uses adaptive frequency-aware particle fusion to resolve the perception-fidelity conflict in diffusion super-resolution models, outperforming prior scaling strategies.

Normalizing Flows with Iterative Denoising

cs.CV · 2026-04-21 · unverdicted · novelty 6.0

iTARFlow augments normalizing flows with diffusion-style iterative denoising during sampling while preserving end-to-end likelihood training, reaching competitive results on ImageNet 64/128/256.

Diagnosing and Improving Diffusion Models by Estimating the Optimal Loss Value

cs.LG · 2025-06-16 · conditional · novelty 6.0

Derives closed-form optimal loss for unified diffusion models, provides variance-controlled estimators, and shows improved diagnosis, training schedules, and power-law scaling after subtracting the optimal value.

Representation Gap: Explaining the Unreasonable Effectiveness of Neural Networks from a Geometric Perspective

cs.LG · 2026-05-20 · unverdicted · novelty 5.0

Derives an asymptotic equivalent for the Representation Gap in equivariant diffusion models, showing it depends primarily on the intrinsic dimension of the task.

Scaling Properties of Continuous Diffusion Spoken Language Models

cs.CL · 2026-04-27 · unverdicted · novelty 5.0

Continuous diffusion spoken language models follow scaling laws for loss and phoneme divergence and generate emotive multi-speaker speech at 16B scale, though long-form coherence stays difficult.

Turbo-GS: Accelerating 3D Gaussian Fitting for High-Quality Radiance Fields

cs.CV · 2024-12-18 · unverdicted · novelty 5.0

Turbo-GS accelerates 3D Gaussian Splatting training via dilated rendering of pixel subsets, convergence-aware Gaussian budget allocation, and combined positional-appearance error densification to enable faster 4K fitting with preserved or improved rendering quality.

Motif-Video 2B: Technical Report

cs.CV · 2026-04-14 · unverdicted · novelty 4.0 · 2 refs

Motif-Video 2B reaches 83.76% on VBench, outperforming a 14B-parameter model with 7x fewer parameters and far less training data through shared cross-attention and a three-part backbone.

citing papers explorer

Showing 7 of 7 citing papers.

Iterative Inference-time Scaling with Adaptive Frequency Steering for Image Super-Resolution · cs.CV · 2025-12-29 · unverdicted · none · ref 20
Normalizing Flows with Iterative Denoising · cs.CV · 2026-04-21 · unverdicted · none · ref 10
Diagnosing and Improving Diffusion Models by Estimating the Optimal Loss Value · cs.LG · 2025-06-16 · conditional · none · ref 32
Representation Gap: Explaining the Unreasonable Effectiveness of Neural Networks from a Geometric Perspective · cs.LG · 2026-05-20 · unverdicted · none · ref 11
Scaling Properties of Continuous Diffusion Spoken Language Models · cs.CL · 2026-04-27 · unverdicted · none · ref 81
Turbo-GS: Accelerating 3D Gaussian Fitting for High-Quality Radiance Fields · cs.CV · 2024-12-18 · unverdicted · none · ref 28
Motif-Video 2B: Technical Report · cs.CV · 2026-04-14 · unverdicted · none · ref 21 · 2 links
