Continuous Adversarial Flow Models
Pith reviewed 2026-05-10 15:22 UTC · model grok-4.3
The pith
Training continuous flow models with a learned discriminator rather than mean-squared error produces samples better aligned with the target data distribution.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Continuous adversarial flow models are continuous-time flow models trained with an adversarial objective supplied by a learned discriminator instead of the fixed mean-squared-error criterion used in flow matching. This change induces a different generalized distribution that empirically aligns better with the target data distribution. The approach is proposed primarily for post-training existing flow-matching models such as SiT and JiT, although it can also be used to train models from scratch, and is validated by substantial FID reductions on ImageNet 256px generation together with gains on GenEval and DPG for text-to-image tasks.
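As a rough illustration of the objective swap (a toy NumPy sketch, not the paper's implementation), flow matching regresses a fixed straight-line velocity target with MSE, while the adversarial variant draws its training signal from learned discriminator logits:

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_loss(v_pred, x0, x1):
    # Flow matching's fixed criterion: regress the straight-line
    # velocity target x1 - x0 with mean squared error.
    return float(np.mean((v_pred - (x1 - x0)) ** 2))

def adversarial_gen_loss(d_logits_fake):
    # Non-saturating generator loss log(1 + exp(-D(x))): the
    # training signal comes from learned discriminator logits
    # instead of a fixed regression target.
    return float(np.mean(np.log1p(np.exp(-d_logits_fake))))

x0 = rng.standard_normal(4)   # noise endpoint
x1 = rng.standard_normal(4)   # data endpoint
print(flow_matching_loss(x1 - x0, x0, x1))  # 0.0: perfect velocity prediction
```

The sketch only contrasts the two loss shapes; in the paper the adversarial signal replaces the MSE term during post-training of the full continuous-time flow model.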
What carries the argument
A learned discriminator that replaces the fixed mean-squared-error loss and supplies an adversarial training signal for the continuous-time flow model.
Load-bearing premise
The learned discriminator must supply a stable, non-collapsing training signal that genuinely improves alignment with the target distribution rather than introducing new artifacts or instability.
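The review does not specify which stability measures the paper uses. As one common guard (an assumption for illustration, not the authors' method), an R1 gradient penalty on real data can be sketched with a hypothetical linear discriminator D(x) = w @ x, whose input gradient is exactly w:

```python
import numpy as np

def r1_penalty(w):
    # R1 regularization penalizes E[||grad_x D(x)||^2] on real data;
    # for the linear D(x) = w @ x the input gradient is w, so the
    # penalty reduces to ||w||^2.
    return float(np.sum(w ** 2))

def d_step(w, reals, fakes, lr=0.1, gamma=1.0):
    # One toy discriminator update on the score loss
    # -E[D(real)] + E[D(fake)] + gamma * ||w||^2: the penalty term
    # pulls w back toward zero, damping runaway discriminator growth.
    grad = -reals.mean(axis=0) + fakes.mean(axis=0) + 2.0 * gamma * w
    return w - lr * grad
```

When real and fake batches coincide, the update contracts w by a fixed factor rather than diverging, which is the stabilizing behavior the premise requires of any such penalty.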
What would settle it
Applying the post-training procedure to a flow model such as SiT on ImageNet and observing either no improvement in guidance-free FID or outright training collapse would indicate that the adversarial objective does not deliver the claimed alignment benefit.
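The FID metric such a test hinges on is the Fréchet distance between Gaussian fits of feature statistics; a minimal 1-D version makes the formula concrete:

```python
import numpy as np

def frechet_distance_1d(m1, v1, m2, v2):
    # Frechet distance between N(m1, v1) and N(m2, v2):
    # (m1 - m2)^2 + v1 + v2 - 2*sqrt(v1 * v2).
    # FID applies the multivariate analogue of this formula to
    # Inception features of real vs. generated images.
    return float((m1 - m2) ** 2 + v1 + v2 - 2.0 * np.sqrt(v1 * v2))
```

Identical distributions give distance zero, so a genuine alignment gain must show up as a strictly lower value against the real-data statistics.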
Original abstract
We propose continuous adversarial flow models, a type of continuous-time flow model trained with an adversarial objective. Unlike flow matching, which uses a fixed mean-squared-error criterion, our approach introduces a learned discriminator to guide training. This change in objective induces a different generalized distribution, which empirically produces samples that are better aligned with the target data distribution. Our method is primarily proposed for post-training existing flow-matching models, although it can also train models from scratch. On the ImageNet 256px generation task, our post-training substantially improves the guidance-free FID of latent-space SiT from 8.26 to 3.63 and of pixel-space JiT from 7.17 to 3.57. It also improves guided generation, reducing FID from 2.06 to 1.53 for SiT and from 1.86 to 1.80 for JiT. We further evaluate our approach on text-to-image generation, where it achieves improved results on both the GenEval and DPG benchmarks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces continuous adversarial flow models, a variant of continuous-time flow models trained via an adversarial objective using a learned discriminator rather than the fixed MSE loss of flow matching. The method is primarily intended for post-training existing flow-matching models (e.g., SiT and JiT) to induce a different generalized distribution that better aligns with the target data. Empirical results on ImageNet 256px report substantial FID reductions (guidance-free: SiT 8.26→3.63, JiT 7.17→3.57; guided: SiT 2.06→1.53, JiT 1.86→1.80) and improved scores on GenEval and DPG for text-to-image generation.
Significance. If the reported FID gains are causally attributable to the adversarial objective rather than additional optimization steps, the approach could provide a practical post-training refinement technique for flow-based generative models, potentially improving sample quality without relying on classifier-free guidance. The empirical gains on standard benchmarks are notable, but the absence of controls for training duration leaves the mechanistic contribution of the discriminator unestablished.
Major comments (2)
- [Experimental results on ImageNet (post-training protocol)] The central claim that the adversarial discriminator induces a better-aligned generalized distribution (and thus the observed FID drops) is not supported by a necessary control: an ablation continuing the identical base model (SiT or JiT) for the same number of post-training steps using only the original flow-matching/MSE objective, identical optimizer, schedule, and batch size. Without this, the improvements (e.g., SiT guidance-free FID 8.26 to 3.63) could arise from extra gradient steps alone, rendering the discriminator incidental.
- [Method and training details] The manuscript provides insufficient training details and ablations for the discriminator (architecture, training schedule relative to the flow model, loss weighting, stability measures) and for the post-training procedure itself. This makes it impossible to assess whether the learned discriminator supplies a stable, non-collapsing signal or introduces new artifacts, directly undermining evaluation of the weakest assumption in the central claim.
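The matched-compute control requested above can be pinned down as a pair of configurations differing only in the training objective (the step count and hyperparameter values below are illustrative placeholders, not values from the paper):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class PostTrainConfig:
    objective: str     # "adversarial" or "flow_matching_mse"
    steps: int
    lr: float
    batch_size: int

# Treatment arm: adversarial post-training of the pretrained base model.
adversarial_arm = PostTrainConfig("adversarial", steps=50_000, lr=1e-4,
                                  batch_size=256)

# Control arm: identical budget, optimizer settings, and batch size,
# but continuing with the original flow-matching MSE objective; any
# remaining FID gap is then attributable to the discriminator.
mse_control_arm = replace(adversarial_arm, objective="flow_matching_mse")
```

Freezing every field except `objective` is what lets the comparison rule out "extra gradient steps alone" as the cause of the improvement.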
Minor comments (2)
- [Method] Notation for the continuous-time flow and adversarial objective should be clarified with explicit equations distinguishing the discriminator-augmented loss from standard flow matching.
- [Text-to-image experiments] The text-to-image results on GenEval and DPG would benefit from reporting the exact base models, guidance scales, and number of post-training steps for direct comparison.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive comments. We agree that additional controls and details are needed to strengthen the claims regarding the contribution of the adversarial objective. We address each major comment below and will revise the manuscript accordingly.
Point-by-point responses
-
Referee: [Experimental results on ImageNet (post-training protocol)] The central claim that the adversarial discriminator induces a better-aligned generalized distribution (and thus the observed FID drops) is not supported by a necessary control: an ablation continuing the identical base model (SiT or JiT) for the same number of post-training steps using only the original flow-matching/MSE objective, identical optimizer, schedule, and batch size. Without this, the improvements (e.g., SiT guidance-free FID 8.26 to 3.63) could arise from extra gradient steps alone, rendering the discriminator incidental.
Authors: We agree that this control is necessary to establish causality. In the revised manuscript, we will add results from continuing training of the base SiT and JiT models for the same number of post-training steps using only the original flow-matching MSE objective, with identical optimizer, learning rate schedule, and batch size. These results will be reported alongside the adversarial post-training outcomes to isolate the effect of the discriminator. revision: yes
-
Referee: [Method and training details] The manuscript provides insufficient training details and ablations for the discriminator (architecture, training schedule relative to the flow model, loss weighting, stability measures) and for the post-training procedure itself. This makes it impossible to assess whether the learned discriminator supplies a stable, non-collapsing signal or introduces new artifacts, directly undermining evaluation of the weakest assumption in the central claim.
Authors: We acknowledge that the current manuscript lacks sufficient implementation details. In the revision, we will expand the Methods section with the discriminator architecture, the precise training schedule (including how it interleaves with the flow model), loss weighting coefficients, and any regularization or stability techniques used. We will also include targeted ablations on these hyperparameters to demonstrate that the discriminator provides a stable training signal without introducing artifacts. revision: yes
Circularity Check
No circularity in empirical claims or derivation
Full rationale
The paper introduces an adversarial objective for continuous flow models and supports its claims through direct empirical comparisons of FID scores on ImageNet 256px and other benchmarks (e.g., SiT guidance-free FID dropping from 8.26 to 3.63). These are measured outcomes against external baselines rather than quantities derived from fitted parameters, self-referential equations, or load-bearing self-citations. No self-definitional steps, fitted-input predictions, uniqueness theorems, or ansatzes smuggled via prior work appear in the abstract or described method; the central result is an observed improvement from the changed training objective, which remains independently falsifiable via the reported metrics.