Mixed citations

No other representation component is needed: Diffusion transformers can provide representation guidance by themselves

Jiang, D · 2025 · arXiv 2505.02831

Mixed citation behavior. Most common role is background (40%).

9 Pith papers citing it

Background 40% of classified citations

citation-role summary

background 3 · method 1 · other 1

citation-polarity summary

background 2 · support 1 · unclear 1 · use method 1

representative citing papers

STRIDE: Training-Free Diversity Guidance via PCA-Directed Feature Perturbation in Single-Step Diffusion Models

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

STRIDE boosts diversity in one-step diffusion models by injecting PCA-aligned pink noise into transformer features while preserving text alignment and quality.

Don't Retrain, Align: Adapting Autoregressive LMs to Diffusion LMs via Representation Alignment

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

Layer-wise representation alignment lets diffusion language models reuse semantic structures from frozen autoregressive models, accelerating training by up to 4x without architectural changes beyond the attention mask.

Improved Baselines with Representation Autoencoders

cs.CV · 2026-05-18 · conditional · novelty 6.0

RAE v2 reaches gFID 1.06 on ImageNet-256 in 80 epochs by combining multi-layer encoder sums, complementary REPA targets, and free guidance via output reparameterization.

Stage-adaptive audio diffusion modeling

cs.SD · 2026-05-06 · unverdicted · novelty 6.0

A semantic progress signal from SSL discrepancy slope enables three stage-aware mechanisms that improve training efficiency and performance in audio diffusion models over static baselines.

Realiz3D: 3D Generation Made Photorealistic via Domain-Aware Learning

cs.GR · 2026-03-25 · conditional · novelty 6.0

Realiz3D decouples visual domain from 3D controls in diffusion models via domain-aware residual adapters to enable photorealistic controllable generation.

Exploring Time Conditioning in Diffusion Generative Models from Disjoint Noisy Data Manifolds

cs.LG · 2026-04-28 · unverdicted · novelty 5.0

Aligning the DDIM forward diffusion process with flow-matching manifold evolution enables high-quality generation without time conditioning, and class-conditional synthesis is possible with an unconditional denoiser by using separate time spaces per class.

FRAMER: Frequency-Aligned Self-Distillation with Adaptive Modulation Leveraging Diffusion Priors for Real-World Image Super-Resolution

cs.CV · 2025-12-01 · unverdicted · novelty 5.0

FRAMER improves real-world super-resolution by decomposing features into low- and high-frequency bands via FFT, applying intra- and inter-contrastive losses with adaptive modulators, and using the final layer as teacher for intermediate layers during diffusion denoising.

Elucidating Representation Degradation Problem in Diffusion Model Training

cs.LG · 2026-05-11 · unverdicted · novelty 4.0

Diffusion models suffer representation degradation at high noise due to recoverability mismatch; ERD mitigates this by dynamic optimization reallocation, accelerating convergence across backbones.

D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models

cs.CV · 2026-05-06 · 2 refs

citing papers explorer

Showing 9 of 9 citing papers.

STRIDE: Training-Free Diversity Guidance via PCA-Directed Feature Perturbation in Single-Step Diffusion Models
cs.CV · 2026-05-12 · unverdicted · none · ref 21

Don't Retrain, Align: Adapting Autoregressive LMs to Diffusion LMs via Representation Alignment
cs.LG · 2026-05-07 · unverdicted · none · ref 6

Improved Baselines with Representation Autoencoders
cs.CV · 2026-05-18 · conditional · none · ref 27

Stage-adaptive audio diffusion modeling
cs.SD · 2026-05-06 · unverdicted · none · ref 5

Realiz3D: 3D Generation Made Photorealistic via Domain-Aware Learning
cs.GR · 2026-03-25 · conditional · none · ref 16

Exploring Time Conditioning in Diffusion Generative Models from Disjoint Noisy Data Manifolds
cs.LG · 2026-04-28 · unverdicted · none · ref 12

FRAMER: Frequency-Aligned Self-Distillation with Adaptive Modulation Leveraging Diffusion Priors for Real-World Image Super-Resolution
cs.CV · 2025-12-01 · unverdicted · none · ref 19

Elucidating Representation Degradation Problem in Diffusion Model Training
cs.LG · 2026-05-11 · unverdicted · none · ref 20

D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models
cs.CV · 2026-05-06 · unreviewed · ref 39 · 2 links
