pith. machine review for the scientific record.

arxiv: 2605.09071 · v1 · submitted 2026-05-09 · 💻 cs.CV

Recognition: no theorem link

Probability-Flow Distillation: Exact Wasserstein Gradient Flow for High-Fidelity 3D Generation

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 02:21 UTC · model grok-4.3

classification 💻 cs.CV
keywords probability flow distillation · wasserstein gradient flow · score distillation sampling · text-to-3D generation · diffusion models · 3D generation · mode collapse

The pith

Probability-Flow Distillation replaces the posterior-mean shortcut with the full probability-flow trajectory, matching the Wasserstein gradient flow exactly and yielding higher-fidelity 3D models from 2D diffusion priors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Score Distillation Sampling and its inversion variant often yield over-smoothed or mode-collapsed 3D outputs because they rely on a posterior mean estimator. The paper shows this estimator equals a single-step Euler approximation of the deterministic reverse diffusion trajectory, which prevents full capture of the target distribution. Probability-Flow Distillation replaces that estimator with the complete probability-flow trajectory. The authors prove the resulting objective is identical to the Wasserstein gradient flow, which supplies principled dynamics for matching distributions. This change produces 3D assets that retain finer details and align more closely with the intended output distribution.
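The single-step-Euler diagnosis can be checked on a toy problem where the score is known in closed form. The sketch below is an illustrative stand-in, not the paper's code: a 1-D Gaussian target, an exact noise predictor, and two estimates of the clean sample — the posterior-mean estimator (one Euler jump of the reverse DDIM trajectory) versus a multi-step deterministic DDIM integration of the probability-flow ODE.

```python
import numpy as np

# Toy check of the review's diagnosis, on a 1-D Gaussian target where the
# score is known in closed form. Everything here is an illustrative stand-in
# for the paper's 3D setting, not the authors' code.

rng = np.random.default_rng(0)
mu0, var0 = 2.0, 0.25                     # target data distribution: N(mu0, var0)

def marginal(abar):
    """Mean/variance of x_t = sqrt(abar)*x0 + sqrt(1-abar)*eps under the target."""
    return np.sqrt(abar) * mu0, abar * var0 + (1.0 - abar)

def eps_hat(x, abar):
    """Exact noise prediction, eps_hat = -sqrt(1-abar) * score_t(x)."""
    m, v = marginal(abar)
    return np.sqrt(1.0 - abar) * (x - m) / v

def posterior_mean(x, abar):
    """Posterior-mean estimator: one Euler jump of the reverse DDIM trajectory."""
    return (x - np.sqrt(1.0 - abar) * eps_hat(x, abar)) / np.sqrt(abar)

def probability_flow(x, abars):
    """Multi-step deterministic DDIM: the full probability-flow trajectory."""
    for a, a_next in zip(abars[:-1], abars[1:]):
        e = eps_hat(x, a)
        x0 = (x - np.sqrt(1.0 - a) * e) / np.sqrt(a)
        x = np.sqrt(a_next) * x0 + np.sqrt(1.0 - a_next) * e
    return x

abar_start = 0.01                          # heavily noised timestep
m, v = marginal(abar_start)
x_t = m + np.sqrt(v) * rng.standard_normal(20000)

single = posterior_mean(x_t, abar_start)               # SDI-style estimate
multi = probability_flow(x_t, np.linspace(abar_start, 0.9999, 500))

print(np.var(single))   # collapses toward 0: the over-smoothing failure mode
print(np.var(multi))    # ~0.25: the target variance is recovered
```

The mode-collapse mechanism is visible directly: with exact scores, the posterior mean is a contraction toward the distribution's mean, while the full trajectory transports the noisy samples onto the target distribution.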

Core claim

Probability-Flow Distillation (PFD) corresponds exactly to a Wasserstein gradient flow, thereby inducing principled distribution-matching dynamics that address the mode collapse and incomplete sampling seen in prior score distillation methods for text-to-3D generation.

What carries the argument

Probability-Flow Distillation, the extension of score-distillation-via-inversion that substitutes the full probability-flow ODE for the posterior-mean estimator and is shown to equal the Wasserstein gradient flow.

Load-bearing premise

The mathematical equivalence between PFD and the Wasserstein gradient flow holds without additional approximations or hidden steps.

What would settle it

A direct derivation of the PFD gradient that fails to match the standard Wasserstein gradient flow equation, or a controlled comparison of 3D models generated by PFD versus SDI on identical prompts that shows no gain in detail fidelity or distribution coverage.

Figures

Figures reproduced from arXiv: 2605.09071 by A. N. Rajagopalan, Rohith Ramanan.

Figure 1. Examples of 3D objects generated using PFD. [PITH_FULL_IMAGE:figures/full_fig_p002_1.png]
Figure 2. Comparison of different distillation gradients. [PITH_FULL_IMAGE:figures/full_fig_p005_2.png]
Figure 3. Visualization of 2D particle evolution targeting a concentric circles dataset. [PITH_FULL_IMAGE:figures/full_fig_p007_3.png]
Figure 4. Illustration of the proposed text-to-3D generation pipeline based on PFD. [PITH_FULL_IMAGE:figures/full_fig_p008_4.png]
Figure 5. Qualitative comparison of 3D objects generated by SDS, SDI, VSD, CSD, and PFD. [PITH_FULL_IMAGE:figures/full_fig_p009_5.png]
Figure 6. (a, b) SDI at 64 × 64 and 512 × 512; (c, d) PFD at 64 × 64 and 512 × 512. As the CFG guidance scale increases, the generated outputs become increasingly noisy and distorted. [PITH_FULL_IMAGE:figures/full_fig_p019_6.png]
Figure 7. Larger guidance scales amplify the guidance term, making the discretized DDIM-ODE less stable during optimization; moderate scales in the range of 5 to 12 balance convergence speed and prompt alignment, and a CFG scale of 7.5 is used in all experiments. [PITH_FULL_IMAGE:figures/full_fig_p019_7.png]
Figure 8. Effect of forward and reverse CFG scales. [PITH_FULL_IMAGE:figures/full_fig_p020_8.png]
Figure 9. Effect of time annealing during late-stage optimization on generated 3D assets. [PITH_FULL_IMAGE:figures/full_fig_p020_9.png]
Figure 10. Illustration of the Janus problem: generated 3D objects exhibit inconsistent geometry. [PITH_FULL_IMAGE:figures/full_fig_p020_10.png]
read the original abstract

Score Distillation Sampling (SDS) and its variants have been widely used for text-to-3D generation by distilling 2D image diffusion priors. However, the standard SDS objective is prone to severe mode collapse, frequently yielding over-smoothed and over-saturated results. Although recent advancements, such as Score Distillation via Inversion (SDI), mitigate these artifacts and produce visually sharper models, they ultimately fail to faithfully capture the full target distribution. In this work, we show that the bottleneck limiting the sampling capacity of SDI stems from its reliance on the posterior mean estimator, which is mathematically equivalent to a single-step Euler approximation of the deterministic reverse DDIM trajectory. To address this, we propose a naturally motivated extension termed Probability-Flow Distillation (PFD). We establish that PFD corresponds exactly to a Wasserstein gradient flow, thereby inducing principled distribution-matching dynamics. Finally, we show that PFD can synthesize 3D assets with fine-grained, high-fidelity details and achieve improved quality compared to existing methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper identifies limitations in Score Distillation Sampling (SDS) and its variant Score Distillation via Inversion (SDI) for text-to-3D generation, attributing SDI's failure to capture the full target distribution to its reliance on the posterior mean estimator, which is equivalent to a single-step Euler approximation of the deterministic reverse DDIM trajectory. It introduces Probability-Flow Distillation (PFD) and claims that PFD corresponds exactly to a Wasserstein gradient flow on the space of measures, thereby inducing principled distribution-matching dynamics that yield 3D assets with fine-grained, high-fidelity details and improved quality over prior methods.

Significance. If the claimed exact equivalence between PFD and the Wasserstein gradient flow holds without unaccounted discretization or approximation residuals, the work would supply a theoretically grounded alternative to heuristic distillation objectives, potentially enabling more faithful matching to the target distribution in 3D generation tasks. This could strengthen the link between optimal transport theory and practical diffusion-prior distillation, with implications for reducing mode collapse and over-smoothing artifacts.

major comments (2)
  1. [Abstract and §3] Abstract and §3 (method derivation): The central claim that 'PFD corresponds exactly to a Wasserstein gradient flow' is load-bearing for the paper's contribution, yet the provided abstract supplies no derivation steps, no explicit velocity-field matching, and no accounting for residual terms arising from score estimation, Monte-Carlo sampling over camera poses, or discretization of the probability-flow ODE. The manuscript must exhibit the precise steps showing that the PFD objective and its gradient reproduce the WGF velocity field identically, without Euler-like truncation errors analogous to those identified for SDI.
  2. [§4] §4 (experiments) and the discrete implementation: The skeptical concern that the exact match may hold only in the continuous-time limit must be addressed by showing that the practical PFD update rule (including any multi-step flow solver) introduces no residual discretization error that would undermine the 'principled distribution-matching dynamics' assertion; otherwise, the improvement in fidelity cannot be attributed to the WGF equivalence.
minor comments (2)
  1. [Abstract] The abstract states that SDI 'ultimately fail[s] to faithfully capture the full target distribution' but provides no quantitative metrics or ablation details supporting this; the full manuscript should include such evidence to ground the motivation.
  2. [§2] Notation for the probability-flow ODE and the posterior mean estimator should be introduced with explicit equation numbers early in the method section to aid readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important aspects of our theoretical claims and their practical implications. We address each major comment below and describe the revisions we will implement.

read point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (method derivation): The central claim that 'PFD corresponds exactly to a Wasserstein gradient flow' is load-bearing for the paper's contribution, yet the provided abstract supplies no derivation steps, no explicit velocity-field matching, and no accounting for residual terms arising from score estimation, Monte-Carlo sampling over camera poses, or discretization of the probability-flow ODE. The manuscript must exhibit the precise steps showing that the PFD objective and its gradient reproduce the WGF velocity field identically, without Euler-like truncation errors analogous to those identified for SDI.

    Authors: We agree that the derivation of the exact equivalence requires greater explicitness. In the revised manuscript we will expand Section 3 with a complete step-by-step derivation: starting from the probability-flow ODE, we show that the PFD loss gradient exactly recovers the Wasserstein gradient flow velocity field on the space of measures when the expectation is taken over camera poses. We will add an explicit velocity-field matching lemma and clarify that, in the continuous-time setting, score estimation is treated as exact (standard in the diffusion literature) and that PFD avoids the single-step Euler truncation that characterizes SDI. Residual terms from finite discretization and Monte-Carlo sampling will be isolated and analyzed in a new subsection. We will also revise the abstract to reference these key steps. revision: yes

  2. Referee: [§4] §4 (experiments) and the discrete implementation: The skeptical concern that the exact match may hold only in the continuous-time limit must be addressed by showing that the practical PFD update rule (including any multi-step flow solver) introduces no residual discretization error that would undermine the 'principled distribution-matching dynamics' assertion; otherwise, the improvement in fidelity cannot be attributed to the WGF equivalence.

    Authors: We acknowledge that the continuous-time equivalence does not automatically extend to the discrete solver without further analysis. In the revision we will add to Section 4 both a theoretical error bound for the multi-step probability-flow ODE integrator and empirical measurements of discretization residuals across the step sizes used in our experiments. These additions will demonstrate that the observed fidelity gains are consistent with reduced deviation from the Wasserstein gradient flow relative to SDI. We will explicitly state that the term 'exact' applies to the continuous limit while the practical algorithm remains a high-fidelity approximation. revision: partial
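The promised residual measurement can be prototyped on a case with a closed-form probability-flow map. In the hedged sketch below (a 1-D Gaussian stand-in with exact scores, not the authors' 3D experiment), the multi-step DDIM endpoint is compared against the exact transport map between two noise levels; the residual shrinking with step count is the behavior the "exact in the continuous limit" claim requires.

```python
import numpy as np

# For a 1-D Gaussian target, the probability-flow map between two noise levels
# has a closed form, so the discretization residual of a multi-step DDIM
# solver can be measured directly. A hedged stand-in for the rebuttal's
# promised error measurements, not the authors' experiment.

rng = np.random.default_rng(2)
mu0, var0 = 2.0, 0.25

def marginal(abar):
    return np.sqrt(abar) * mu0, abar * var0 + (1.0 - abar)

def eps_hat(x, abar):
    m, v = marginal(abar)
    return np.sqrt(1.0 - abar) * (x - m) / v

def ddim(x, abars):
    """Deterministic DDIM integration of the probability-flow ODE."""
    for a, a_next in zip(abars[:-1], abars[1:]):
        e = eps_hat(x, a)
        x = (np.sqrt(a_next) * (x - np.sqrt(1.0 - a) * e) / np.sqrt(a)
             + np.sqrt(1.0 - a_next) * e)
    return x

a0, a1 = 0.01, 0.999
m0, v0 = marginal(a0)
m1, v1 = marginal(a1)
x = m0 + np.sqrt(v0) * rng.standard_normal(4096)
x_exact = m1 + np.sqrt(v1 / v0) * (x - m0)   # exact monotone transport map

errs = [np.abs(ddim(x, np.linspace(a0, a1, n + 1)) - x_exact).mean()
        for n in (2, 10, 50, 250)]
print(errs)   # residual shrinks as the number of solver steps grows
```

This is exactly the shape of evidence the referee asks for: an empirical error-versus-step-count curve, here computed where the ground-truth flow is known.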

Circularity Check

0 steps flagged

No significant circularity detected; derivation presented as independent establishment

full rationale

The provided abstract motivates PFD as a natural extension addressing the explicit single-step Euler limitation of SDI's posterior mean estimator. It then states that PFD 'corresponds exactly' to the Wasserstein gradient flow as an established result. No equations, self-citations, or definitional reductions are quoted that would make the correspondence tautological by construction. No fitted parameters are renamed as predictions, no uniqueness theorems are imported, and no ansatz is smuggled via prior work. The central claim therefore remains a non-circular derivation step within the given text.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard diffusion-model assumptions (DDIM reverse process, posterior mean estimator) and the mathematical identification of probability flow with Wasserstein gradient flow; no new free parameters or invented entities are introduced in the abstract.

axioms (2)
  • domain assumption The deterministic reverse DDIM trajectory can be approximated by its posterior mean estimator in a single Euler step.
    Invoked to identify the bottleneck in SDI.
  • standard math Wasserstein gradient flow provides the correct continuous dynamics for matching the target distribution in distillation.
    Used to establish that PFD induces principled distribution-matching.
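The second axiom can be made concrete: by the classical Jordan–Kinderlehrer–Otto result, the Wasserstein gradient flow of KL(q ∥ p) is the Fokker–Planck equation, which unadjusted Langevin dynamics simulates at the particle level. A minimal numerical sketch, where the 1-D Gaussian target and all constants are illustrative choices rather than anything from the paper:

```python
import numpy as np

# WGF of KL(q || p) realized as particle-level (unadjusted) Langevin dynamics:
# dx = score(x) dt + sqrt(2) dW. Particles initialized far from the target
# p = N(mu, var) should converge to its mean and variance.

rng = np.random.default_rng(1)
mu, var = 2.0, 0.25
score = lambda x: -(x - mu) / var          # grad log p for a Gaussian target

x = rng.standard_normal(20000)             # initial particles ~ N(0, 1)
h = 0.005                                  # step size (stationary bias is O(h))
for _ in range(2000):
    x = x + h * score(x) + np.sqrt(2.0 * h) * rng.standard_normal(x.shape)

print(x.mean(), x.var())   # ≈ 2.0 and ≈ 0.25, up to O(h) discretization bias
```

The small variance bias of the discretized chain mirrors, in miniature, the continuous-versus-discrete gap the referee raises against PFD's "exact" equivalence claim.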

pith-pipeline@v0.9.0 · 5486 in / 1373 out tokens · 61044 ms · 2026-05-12T02:21:59.997954+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

57 extracted references · 57 canonical work pages · 1 internal anchor

  1. [1]

    Denoising diffusion probabilistic models

    Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. In Advances in Neural Information Processing Systems, volume 33, pages 6840–6851, 2020

  2. [2]

    High-Resolution Image Synthesis with Latent Diffusion Models

    Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High- resolution image synthesis with latent diffusion models.CoRR, abs/2112.10752, 2021. URL https://arxiv.org/abs/2112.10752

  3. [3]

    Fleet, and Mohammad Norouzi

    Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily Denton, Seyed Kamyar Seyed Ghasemipour, Raphael Gontijo-Lopes, Burcu Karagol Ayan, Tim Salimans, Jonathan Ho, David J. Fleet, and Mohammad Norouzi. Photorealistic text-to-image diffusion models with deep language understanding. In Alice H. Oh, Alekh Agarwal, Danielle Belgrave, and Ky...

  4. [4]

    Elucidating the design space of diffusion-based generative models

    Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models. In Alice H. Oh, Alekh Agarwal, Danielle Belgrave, and Kyunghyun Cho, editors,Advances in Neural Information Processing Systems, 2022. URL https://openreview.net/forum?id=k7FuTOWMOc7

  5. [5]

    Yaron Lipman, Ricky T. Q. Chen, Heli Ben-Hamu, Maximilian Nickel, and Matthew Le. Flow matching for generative modeling. InThe Eleventh International Conference on Learning Representations, 2023. URLhttps://openreview.net/forum?id=PqvMRDCJT9t

  6. [6]

    Barron, and Ben Mildenhall

    Ben Poole, Ajay Jain, Jonathan T. Barron, and Ben Mildenhall. Dreamfusion: Text-to-3d using 2d diffusion.arXiv, 2022

  7. [7]

    Srinivasan, Matthew Tancik, Jonathan T

    Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. InECCV, 2020

  8. [8]

    URL https://doi.org/10.1145/3528223

    Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. Instant neural graphics primitives with a multiresolution hash encoding.ACM Trans. Graph., 41(4):102:1–102:15, July 2022. doi: 10.1145/3528223.3530127. URL https://doi.org/10.1145/3528223. 3530127

  9. [9]

    Deep marching tetrahedra: a hybrid representation for high-resolution 3d shape synthesis

    Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu, and Sanja Fidler. Deep marching tetrahedra: a hybrid representation for high-resolution 3d shape synthesis. InAdvances in Neural Information Processing Systems (NeurIPS), 2021

  10. [10]

    3d gaussian splatting for real-time radiance field rendering.ACM Transactions on Graphics, 42(4), July

    Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering.ACM Transactions on Graphics, 42(4), July

  11. [11]

    URLhttps://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/

  12. [12]

    2023 , url =

    Amir Hertz, Kfir Aberman, and Daniel Cohen-Or. Delta denoising score. In2023 IEEE/CVF International Conference on Computer Vision (ICCV), pages 2328–2337, 2023. doi: 10.1109/ ICCV51070.2023.00221

  13. [13]

    Noise-free score distillation

    Oren Katzir, Or Patashnik, Daniel Cohen-Or, and Dani Lischinski. Noise-free score distillation. InThe Twelfth International Conference on Learning Representations, 2024. URL https: //openreview.net/forum?id=dlIMcmlAdk

  14. [14]

    Magic3d: High-resolution text-to-3d content creation

    Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, and Tsung-Yi Lin. Magic3d: High-resolution text-to-3d content creation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 300–309, 2023

  15. [15]

    Fantasia3d: Disentangling geometry and appearance for high-quality text-to-3d content creation

    Rui Chen, Yongwei Chen, Ningxin Jiao, and Kui Jia. Fantasia3d: Disentangling geometry and appearance for high-quality text-to-3d content creation. InProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 22246–22256, October 2023

  16. [16]

    Gaussiandreamer: Fast generation from text to 3d gaussians by bridging 2d and 3d diffusion models

    Taoran Yi, Jiemin Fang, Junjie Wang, Guanjun Wu, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Qi Tian, and Xinggang Wang. Gaussiandreamer: Fast generation from text to 3d gaussians by bridging 2d and 3d diffusion models. InCVPR, 2024

  17. [17]

    Text-to-3d with classifier score distillation

    Xin Yu, Yuan-Chen Guo, Yangguang Li, Ding Liang, Song-Hai Zhang, and XIAOJUAN QI. Text-to-3d with classifier score distillation. InThe Twelfth International Conference on Learning Representations, 2024. URLhttps://openreview.net/forum?id=ktG8Tun1Cy. 10

  18. [18]

    Pro- lificdreamer: High-fidelity and diverse text-to-3d generation with variational score distillation

    Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, and Jun Zhu. Pro- lificdreamer: High-fidelity and diverse text-to-3d generation with variational score distillation. InAdvances in Neural Information Processing Systems (NeurIPS), 2023

  19. [19]

    LoRA: Low-rank adaptation of large language models

    Edward J Hu, yelong shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2022. URL https://openreview. net/forum?id=nZeVKeeFYf9

  20. [20]

    Score distillation via reparametrized ddim.Advances in Neural Information Processing Systems, 37:26011–26044, 2024

    Artem Lukoianov, Haitz S’aez de Oc’ariz Borde, Kristjan Greenewald, Vitor Guizilini, Timur Bagautdinov, Vincent Sitzmann, and Justin M Solomon. Score distillation via reparametrized ddim.Advances in Neural Information Processing Systems, 37:26011–26044, 2024

  21. [21]

    Jacobs, Alexei A

    David McAllister, Songwei Ge, Jia-Bin Huang, David W. Jacobs, Alexei A. Efros, Aleksander Holynski, and Angjoo Kanazawa. Rethinking score distillation as a bridge between image distributions. InAdvances in Neural Information Processing Systems, 2024

  22. [22]

    Dual diffusion implicit bridges for image-to-image translation

    Xuan Su, Jiaming Song, Chenlin Meng, and Stefano Ermon. Dual diffusion implicit bridges for image-to-image translation. InThe Eleventh International Conference on Learning Representa- tions, 2023. URLhttps://openreview.net/forum?id=5HLoTvVGDe

  23. [23]

    Anderson

    Brian D.O. Anderson. Reverse-time diffusion equation models.Stochastic Processes and their Applications, 12(3):313–326, 1982. ISSN 0304-4149. doi: https://doi.org/10.1016/ 0304-4149(82)90051-5. URL https://www.sciencedirect.com/science/article/ pii/0304414982900515

  24. [24]

    A connection between score matching and denoising autoencoders.Neural computation, 23(7):1661–1674, 2011

    Pascal Vincent. A connection between score matching and denoising autoencoders.Neural Computation, 23(7):1661–1674, 2011. doi: 10.1162/NECO_a_00142

  25. [25]

    Score-based generative modeling through stochastic differential equations

    Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations, 2021. URL https://openreview. net/forum?id=PxTIG12RRHS

  26. [26]

    Denoising diffusion implicit models

    Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. In International Conference on Learning Representations, 2021. URL https://openreview. net/forum?id=St1giarCHLP

  27. [27]

    Diffusion models beat GANs on image synthesis

    Prafulla Dhariwal and Alexander Quinn Nichol. Diffusion models beat GANs on image synthesis. In A. Beygelzimer, Y . Dauphin, P. Liang, and J. Wortman Vaughan, editors,Advances in Neural Information Processing Systems, 2021. URL https://openreview.net/forum? id=AAWuCvzaVt

  28. [28]

    Assran, Q

    Ron Mokady, Amir Hertz, Kfir Aberman, Yael Pritch, and Daniel Cohen-Or. Null-text inversion for editing real images using guided diffusion models. In2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 6038–6047, 2023. doi: 10.1109/ CVPR52729.2023.00585

  29. [29]

    American Mathematical Society, Providence, RI, 2003

    Cédric Villani.Topics in Optimal Transportation, volume 58 ofGraduate Studies in Mathemat- ics. American Mathematical Society, Providence, RI, 2003

  30. [30]

    An introduction to optimal transport and wasserstein gradient flows,

    Alessio Figalli. An introduction to optimal transport and wasserstein gradient flows,

  31. [31]

    Optimal Transport on Quantum Structures

    URL https://people.math.ethz.ch/~afigalli/lecture-notes-pdf/ An-introduction-to-optimal-transport-and-Wasserstein-gradient-flows. pdf. Lecture notes from the School “Optimal Transport on Quantum Structures”, Erd˝os Center, Alfréd Rényi Institute of Mathematics, September 19–23, 2022

  32. [32]

    Birkhäuser, 2008

    Luigi Ambrosio, Nicola Gigli, and Giuseppe Savaré.Gradient Flows in Metric Spaces and in the Space of Probability Measures. Birkhäuser, 2008

  33. [33]

    Langevin dynamics — Wikipedia, the free encyclopedia

    Wikipedia contributors. Langevin dynamics — Wikipedia, the free encyclopedia. https: //en.wikipedia.org/w/index.php?title=Langevin_dynamics&oldid=1348039185

  34. [34]

    Classifier-free diffusion guidance

    Jonathan Ho and Tim Salimans. Classifier-free diffusion guidance. InNeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications, 2021. URLhttps://openreview. net/forum?id=qw8AKxfYbI

  35. [35]

    Yeh, and Greg Shakhnarovich

    Haochen Wang, Xiaodan Du, Jiahao Li, Raymond A. Yeh, and Greg Shakhnarovich. Score jacobian chaining: Lifting pretrained 2d diffusion models for 3d generation. InProceedings 11 of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7047–7056, 2023

  36. [36]

    HIFA: High-fidelity text-to-3d generation with advanced diffusion guidance

    Junzhe Zhu, Peiye Zhuang, and Sanmi Koyejo. HIFA: High-fidelity text-to-3d generation with advanced diffusion guidance. InThe Twelfth International Conference on Learning Representations, 2024. URLhttps://openreview.net/forum?id=IZMPWmcS3H

  37. [37]

    Lucid- dreamer: Towards high-fidelity text-to-3d generation via interval score matching

    Yixun Liang, Xin Yang, Jiantao Lin, Haodong Li, Xiaogang Xu, and Yingcong Chen. Lucid- dreamer: Towards high-fidelity text-to-3d generation via interval score matching. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 6517– 6526, June 2024

  38. [38]

    Zero-1-to-3: Zero-shot one image to 3d object

    Ruoshi Liu, Rundi Wu, Basile Van Hoorick, Pavel Tokmakov, Sergey Zakharov, and Carl V ondrick. Zero-1-to-3: Zero-shot one image to 3d object. InProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 9298–9309, October 2023

  39. [39]

    Syncdreamer: Generating multiview-consistent images from a single-view image

    Yuan Liu, Cheng Lin, Zijiao Zeng, Xiaoxiao Long, Lingjie Liu, Taku Komura, and Wenping Wang. Syncdreamer: Generating multiview-consistent images from a single-view image. InThe Twelfth International Conference on Learning Representations, 2024. URL https: //openreview.net/forum?id=MN3yH2ovHb

  40. [40]

    In: CVPR

    Xiaoxiao Long, Yuan-Chen Guo, Cheng Lin, Yuan Liu, Zhiyang Dou, Lingjie Liu, Yuexin Ma, Song-Hai Zhang, Marc Habermann, Christian Theobalt, and Wenping Wang. Wonder3d: Single image to 3d using cross-domain diffusion. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. doi: 10.1109/CVPR52733.2024. 00951

  41. [41]

    MVDream: Multi-view diffusion for 3d generation

    Yichun Shi, Peng Wang, Jianglong Ye, Long Mai, Kejie Li, and Xiao Yang. MVDream: Multi-view diffusion for 3d generation. InThe Twelfth International Conference on Learning Representations, 2024. URLhttps://openreview.net/forum?id=FUgrjq2pbB

  42. [42]

    Re-imagine the negative prompt algorithm: Transform 2d diffusion into 3d, alleviate janus problem and beyond.arXiv preprint arXiv:2304.04968, 2023

    Mohammadreza Armandpour, Huangjie Zheng, Ali Sadeghian, Amir Sadeghian, and Mingyuan Zhou. Re-imagine the negative prompt algorithm: Transform 2d diffusion into 3d, alleviate janus problem and beyond.arXiv preprint arXiv:2304.04968, 2023

  43. [43]

    Stein variational gradient descent: A general purpose bayesian inference algorithm

    Qiang Liu and Dilin Wang. Stein variational gradient descent: A general purpose bayesian inference algorithm. InAdvances in Neural Information Processing Systems 29 (NeurIPS 2016), 2016

  44. [44]

    Chapter 10 - geometry in sampling methods: A review on man- ifold mcmc and particle-based variational inference methods

    Chang Liu and Jun Zhu. Chapter 10 - geometry in sampling methods: A review on man- ifold mcmc and particle-based variational inference methods. In Arni S.R. Srinivasa Rao, G. Alastair Young, and C.R. Rao, editors,Advancements in Bayesian Methods and Imple- mentation, volume 47 ofHandbook of Statistics, pages 239–293. Elsevier, 2022. doi: https: //doi.org/...

  45. [45]

    A unified particle- optimization framework for scalable bayesian sampling

    Changyou Chen, Ruiyi Zhang, Wenlin Wang, Bai Li, and Liqun Chen. A unified particle- optimization framework for scalable bayesian sampling. InConference on Uncertainty in Artifi- cial Intelligence, 2018. URL https://api.semanticscholar.org/CorpusID:44111731

  46. [46]

    CLIPScore: A reference-free evaluation metric for image captioning

    Jack Hessel, Ari Holtzman, Maxwell Forbes, Ronan Le Bras, and Yejin Choi. CLIPScore: A reference-free evaluation metric for image captioning. In Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih, editors,Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 7514–7528, Online and Punta Cana, D...

  47. [47]

    Exploring clip for assessing the look and feel of images

    Jianyi Wang, Kelvin CK Chan, and Chen Change Loy. Exploring clip for assessing the look and feel of images. InAAAI, 2023

  48. [48]

    Torch- metrics - measuring reproducibility in pytorch.Journal of Open Source Software, 7(70):4101,

    Nicki Skafte Detlefsen, Jiri Borovec, Justus Schock, Ananya Harsh Jha, Teddy Koker, Luca Di Liello, Daniel Stancl, Changsheng Quan, Maxim Grechkin, and William Falcon. Torch- metrics - measuring reproducibility in pytorch.Journal of Open Source Software, 7(70):4101,

  49. [49]

    URLhttps://doi.org/10.21105/joss.04101

    doi: 10.21105/joss.04101. URLhttps://doi.org/10.21105/joss.04101

  50. [50]

    Imagereward: learning and evaluating human preferences for text-to-image generation

    Jiazheng Xu, Xiao Liu, Yuchen Wu, Yuxuan Tong, Qinkai Li, Ming Ding, Jie Tang, and Yuxiao Dong. Imagereward: learning and evaluating human preferences for text-to-image generation. 12 InProceedings of the 37th International Conference on Neural Information Processing Systems, pages 15903–15935, 2023

  51. [51]

    Greiner and J

    W. Greiner and J. Reinhardt.Field Quantization. Springer, 1996. ISBN 9783540591795. URL https://books.google.co.in/books?id=VvBAvf0wSrIC

  52. [52]

    Engel and R.M

    E. Engel and R.M. Dreizler.Density Functional Theory: An Advanced Course. Theoretical and Mathematical Physics. Springer Berlin Heidelberg, 2011. ISBN 9783642140891. URL https://books.google.co.in/books?id=o9byjwEACAAJ

  53. [53]

    Society for Industrial and Applied Mathemat- ics, second edition, 2002

    Philip Hartman.Ordinary Differential Equations. Society for Industrial and Applied Mathemat- ics, second edition, 2002. doi: 10.1137/1.9780898719222. URL https://epubs.siam.org/ doi/abs/10.1137/1.9780898719222

  54. [54]

    threestudio: A uni- fied framework for 3d content generation

    Yuan-Chen Guo, Ying-Tian Liu, Ruizhi Shao, Christian Laforte, Vikram V oleti, Guan Luo, Chia-Hao Chen, Zi-Xin Zou, Chen Wang, Yan-Pei Cao, and Song-Hai Zhang. threestudio: A uni- fied framework for 3d content generation. https://github.com/threestudio-project/ threestudio, 2023

  55. [55]

    DPM-solver++: Fast solver for guided sampling of diffusion probabilistic models, 2023

    Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. DPM-solver++: Fast solver for guided sampling of diffusion probabilistic models, 2023. URL https:// openreview.net/forum?id=4vGwQqviud5

  56. [56]

    DeepFloyd IF.https://github.com/deep-floyd/IF, 2023

    StabilityAI. DeepFloyd IF.https://github.com/deep-floyd/IF, 2023

  57. [57]

    A marble bust of a mouse

    Patrick von Platen, Suraj Patil, Anton Lozhkov, Pedro Cuenca, Nathan Lambert, Kashif Rasul, Mishig Davaadorj, Dhruv Nair, Sayak Paul, William Berman, Yiyi Xu, Steven Liu, and Thomas Wolf. Diffusers: State-of-the-art diffusion models. https://github.com/huggingface/ diffusers, 2022. A Proof of Theorem 1 Assumption and Definitions.We assume that the underly...