Divergence is Uncertainty: A Closed-Form Posterior Covariance for Flow Matching
Pith reviewed 2026-05-22 10:28 UTC · model grok-4.3
The pith
Flow matching uncertainty reduces exactly to the divergence of the learned velocity field.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By extending Tweedie's formula from the denoising setting to the flow matching interpolant, we derive an exact, closed-form expression for the posterior covariance at every point along the generative trajectory. The result depends on a single quantity, namely the divergence of the learned velocity field, which can be computed post-hoc on any pre-trained flow matching model, requiring no retraining and no architectural modification.
What carries the argument
The closed-form posterior covariance obtained by extending Tweedie's formula to the flow matching interpolant, expressed solely in terms of the divergence of the learned velocity field.
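One way to see why the Jacobian of the velocity field enters at all is the short derivation below. It is a sketch under assumptions that do not come from this page: x_0 ~ N(0, I) is Gaussian noise independent of the data x_1, 0 < t < 1, and v_t(x) = E[x_1 - x_0 | x_t = x] is the ideal velocity field. The paper's own conventions and final expression may differ.

```latex
% Assumed setting: x_t = (1-t) x_0 + t x_1, x_0 ~ N(0, I) independent of x_1, 0 < t < 1.
\begin{align}
  x_t \mid x_1 &\sim \mathcal{N}\!\big(t\,x_1,\ (1-t)^2 I\big), \\
  \mathbb{E}[x_1 \mid x_t] &= x_t + (1-t)\,v_t(x_t), \\
  \operatorname{Cov}[x_1 \mid x_t]
    &= \frac{(1-t)^2}{t}\,\nabla_{x_t}\mathbb{E}[x_1 \mid x_t]
     = \frac{(1-t)^2}{t}\,\big(I + (1-t)\,\nabla v_t(x_t)\big), \\
  \operatorname{tr}\operatorname{Cov}[x_1 \mid x_t]
    &= \frac{(1-t)^2}{t}\,\big(d + (1-t)\,\nabla\!\cdot v_t(x_t)\big).
\end{align}
```

Under these assumptions the total posterior variance is a scalar function of the divergence, while the full covariance matrix involves the full Jacobian of v_t.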
If this is right
- Uncertainty can be obtained for any pre-trained flow matching model without retraining or auxiliary heads.
- One-step generators such as MeanFlow produce end-to-end generation uncertainty in a single forward pass.
- Per-pixel uncertainty maps concentrate on high-variation regions such as digit boundaries.
- A scalar uncertainty score derived from the same expression tracks actual prediction error.
- Uncertainty evaluation requires orders of magnitude less compute than ensembling or Monte Carlo dropout.
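The post-hoc evaluation described in these points can be sketched for a black-box velocity function. The code below is a minimal illustration, not the paper's implementation: `toy_velocity` is a hypothetical stand-in for a trained model, finite differences stand in for automatic differentiation, and the Hutchinson trace estimator is a standard technique that this summary does not confirm the paper uses.

```python
import random

def jvp_fd(v, x, t, u, eps=1e-5):
    """Central finite-difference estimate of the Jacobian-vector product (dv/dx) u."""
    x_plus = [xi + eps * ui for xi, ui in zip(x, u)]
    x_minus = [xi - eps * ui for xi, ui in zip(x, u)]
    return [(a - b) / (2.0 * eps) for a, b in zip(v(x_plus, t), v(x_minus, t))]

def jacobian_diagonal(v, x, t):
    """Per-coordinate entries dv_i/dx_i, using one probe per coordinate."""
    d = len(x)
    diag = []
    for i in range(d):
        e_i = [1.0 if j == i else 0.0 for j in range(d)]
        diag.append(jvp_fd(v, x, t, e_i)[i])
    return diag

def divergence_hutchinson(v, x, t, n_probes=256, seed=0):
    """Stochastic trace estimate: mean of u^T (dv/dx) u over Rademacher probes u."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_probes):
        u = [rng.choice((-1.0, 1.0)) for _ in x]
        total += sum(ui * ji for ui, ji in zip(u, jvp_fd(v, x, t, u)))
    return total / n_probes

def toy_velocity(x, t):
    """Hypothetical linear field (not a trained model); Jacobian trace is 0.5."""
    return [-x[0] + 0.5 * x[1], 2.0 * x[1], 0.3 * x[0] - 0.5 * x[2]]

x, t = [0.2, -0.4, 1.0], 0.5
diag = jacobian_diagonal(toy_velocity, x, t)       # per-coordinate contributions
print(sum(diag))                                   # divergence from d probes
print(divergence_hutchinson(toy_velocity, x, t))   # stochastic estimate
```

The diagonal entries are the kind of component-wise quantity a per-pixel map would need, and their sum is the divergence. The stochastic estimator trades exactness for a probe count that does not grow with the dimension.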
Where Pith is reading between the lines
- The same divergence-based relation may supply closed-form uncertainty for other continuous-time generative models.
- These uncertainty maps could be used to prioritize regions for refinement or to weight samples in downstream tasks.
- Direct comparison of the derived covariance against ground-truth variance on controlled synthetic data would provide a sharper test.
Load-bearing premise
The mathematical extension of Tweedie's formula from the denoising setting to the continuous flow matching interpolant holds.
What would settle it
Train a flow matching model on MNIST, compute the formula's covariance using the velocity divergence at selected points, then compare it directly to the empirical covariance measured across many independent generated samples at those same points.
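A much smaller stand-in for this test can be run on a one-dimensional Gaussian, where the ideal velocity field and the true posterior variance are both known. The sketch below makes several assumptions that are not taken from the paper: x_0 ~ N(0, 1) is the noise end of the interpolant, the data are x_1 ~ N(m, s^2), and the variance formula is tr Cov[x_1 | x_t] = (1-t)^2 / t * (d + (1-t) div v_t), which follows from second-order Tweedie under that Gaussian-noise assumption. All function names are hypothetical.

```python
import math
import random

def posterior_var_from_divergence(div_v, t, d=1):
    """Total posterior variance of x_1 given x_t, assuming x_0 ~ N(0, I)."""
    return (1.0 - t) ** 2 / t * (d + (1.0 - t) * div_v)

def gaussian_velocity(x, t, m, s):
    """Ideal velocity for 1-D data x_1 ~ N(m, s^2): (E[x_1 | x_t] - x_t) / (1 - t)."""
    gain = t * s * s / ((1.0 - t) ** 2 + (t * s) ** 2)
    post_mean = m + gain * (x - t * m)
    return (post_mean - x) / (1.0 - t)

def empirical_posterior_var(x_t, t, m, s, n=100_000, seed=0):
    """Self-normalised importance-sampling estimate of Var[x_1 | x_t]."""
    rng = random.Random(seed)
    sw = swx = swxx = 0.0
    for _ in range(n):
        x1 = rng.gauss(m, s)                      # draw from the data prior
        w = math.exp(-0.5 * ((x_t - t * x1) / (1.0 - t)) ** 2)  # likelihood weight
        sw += w
        swx += w * x1
        swxx += w * x1 * x1
    mean = swx / sw
    return swxx / sw - mean * mean

t, m, s, x_t = 0.5, 1.0, 2.0, 0.3
eps = 1e-5
# In 1-D the divergence is just dv/dx, taken here by central differences.
div_v = (gaussian_velocity(x_t + eps, t, m, s)
         - gaussian_velocity(x_t - eps, t, m, s)) / (2.0 * eps)
formula_var = posterior_var_from_divergence(div_v, t)
mc_var = empirical_posterior_var(x_t, t, m, s)
print(formula_var, mc_var)
```

For these parameters the exact Gaussian posterior variance is s^2 (1-t)^2 / ((1-t)^2 + t^2 s^2) = 0.8, so both printed numbers should be close to that value. Agreement here checks only the Gaussian toy, not a learned model on MNIST.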
Original abstract
Flow matching has become a leading framework for generative modeling, but quantifying the uncertainty of its samples remains an open problem. Existing approaches retrain the model with auxiliary variance heads, maintain costly ensembles, or propagate approximate covariance through many integration steps, trading off training cost, inference cost, or accuracy. We show that none of these trade-offs is necessary. By extending Tweedie's formula from the denoising setting to the flow matching interpolant, we derive an exact, closed-form expression for the posterior covariance at every point along the generative trajectory. The result depends on a single quantity, namely the divergence of the learned velocity field, which can be computed post-hoc on any pre-trained flow matching model, requiring no retraining and no architectural modification. For one-step generators such as MeanFlow, the same formula yields the end-to-end generation uncertainty in a single forward pass, eliminating the multi-step variance propagation required by all prior methods. Experiments on MNIST confirm that the resulting per-pixel uncertainty maps are semantically meaningful, concentrating on digit boundaries where inter-sample variation is highest, and that the scalar uncertainty score tracks actual prediction error, all at roughly $10^4 \times$ less total compute than ensembling or Monte Carlo dropout.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that by extending Tweedie's formula from the denoising setting to the flow-matching interpolant x_t = (1-t)x_0 + t x_1, one obtains an exact closed-form expression for the posterior covariance at any point along the trajectory; this expression depends only on the divergence of the learned velocity field v_t and can be evaluated post-hoc on any pre-trained flow-matching model. Experiments on MNIST are presented to show that the resulting per-pixel uncertainty maps are semantically meaningful and that a scalar uncertainty score correlates with prediction error, all at far lower compute than ensembles.
Significance. If the central derivation holds, the result would be a practical advance for uncertainty quantification in flow matching, eliminating the need for retraining, auxiliary heads, or multi-step covariance propagation. The post-hoc nature and applicability to one-step generators such as MeanFlow are attractive. The MNIST results provide initial evidence that the uncertainty is interpretable, but the overall significance is conditional on the exactness of the mathematical extension.
Major comments (2)
- [Section 3 (derivation)] The derivation extends Tweedie's formula to the flow-matching interpolant and concludes that the posterior covariance is exactly the divergence of v_t (Section 3, around the statement following Eq. (7) or equivalent). However, the posterior covariance at an intermediate t generally depends on the full Jacobian of the flow map, not solely on its trace (the divergence). Please supply the explicit steps showing how the Jacobian determinant or eigenvalue contributions reduce to the scalar divergence, including any isotropy or straight-path assumptions required for the reduction to be exact.
- [Section 3] The abstract and Section 3 claim that the formula is 'exact' and 'closed-form' for general learned velocity fields. If the reduction relies on the velocity field satisfying additional structure beyond the standard flow-matching objective, this should be stated explicitly as a modeling assumption, because the ODE marginalization otherwise retains off-diagonal covariance terms.
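The distinction between the full Jacobian and its trace can be made concrete on a two-dimensional Gaussian example. The sketch assumes x_0 ~ N(0, I) independent of x_1 ~ N(0, diag(s_1^2, s_2^2)) and an ideal velocity field v_t(x) = E[x_1 - x_0 | x_t = x]. These choices are illustrative and are not taken from the manuscript.

```latex
% Assumed setting: x_t = (1-t) x_0 + t x_1, x_0 ~ N(0, I), x_1 ~ N(0, diag(s_1^2, s_2^2)).
\begin{align}
  \operatorname{Cov}[x_1 \mid x_t]
    &= \operatorname{diag}\!\left(
         \frac{(1-t)^2 s_1^2}{(1-t)^2 + t^2 s_1^2},\
         \frac{(1-t)^2 s_2^2}{(1-t)^2 + t^2 s_2^2}\right), \\
  \frac{\partial v_{t,i}}{\partial x_i}
    &= \frac{1}{1-t}\left(\frac{t\,s_i^2}{(1-t)^2 + t^2 s_i^2} - 1\right),
    \qquad i = 1, 2, \\
  \nabla\!\cdot v_t &= \sum_{i=1}^{2} \frac{\partial v_{t,i}}{\partial x_i}.
\end{align}
```

When s_1 differs from s_2 the two posterior variances differ, so in this example the divergence alone fixes only their sum. Recovering each entry requires the individual diagonal Jacobian terms.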
Minor comments (2)
- [Experiments] The MNIST experiments are described only at a high level; adding quantitative metrics (e.g., correlation coefficients between uncertainty score and actual error) and a comparison against a simple baseline such as MC dropout would strengthen the empirical support.
- [Notation and preliminaries] Notation: ensure consistent use of subscripts for time t and clarify whether the divergence is evaluated on the learned or ground-truth velocity field in the theoretical statements.
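The quantitative check suggested in the first minor comment could be reported along the following lines. This is a minimal sketch: the `scores` and `errors` lists are placeholder numbers for illustration only, not results from the paper, and the helper names are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Placeholder numbers for illustration only; NOT results from the paper.
scores = [0.10, 0.40, 0.35, 0.80, 0.55]   # scalar uncertainty per sample
errors = [0.02, 0.09, 0.12, 0.30, 0.15]   # prediction error per sample
print(pearson(scores, errors), spearman(scores, errors))
```

Rank correlation is often the more informative of the two here, because an uncertainty score only needs to order samples by error, not match the error on a linear scale.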
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive comments. We address each major comment below and have revised the manuscript to improve the presentation and explicitness of the derivation in Section 3.
Point-by-point responses
- Referee: [Section 3 (derivation)] The derivation extends Tweedie's formula to the flow-matching interpolant and concludes that the posterior covariance is exactly the divergence of v_t (Section 3, around the statement following Eq. (7) or equivalent). However, the posterior covariance at an intermediate t generally depends on the full Jacobian of the flow map, not solely on its trace (the divergence). Please supply the explicit steps showing how the Jacobian determinant or eigenvalue contributions reduce to the scalar divergence, including any isotropy or straight-path assumptions required for the reduction to be exact.
  Authors: We thank the referee for this observation. The original manuscript presented the core extension concisely. Under the straight-path interpolant x_t = (1-t)x_0 + t x_1 that defines standard flow matching, the map from x_t to the posterior over x_0 is an affine transformation whose Jacobian is a scalar multiple of the identity. All eigenvalues are therefore identical, and the determinant and eigenvalue contributions to the posterior covariance reduce exactly to the trace of the Jacobian, which is the divergence of v_t. We have added the full step-by-step derivation, including the explicit invocation of the straight-path assumption and the resulting isotropy, to the revised Section 3.
  Revision: yes
- Referee: [Section 3] The abstract and Section 3 claim that the formula is 'exact' and 'closed-form' for general learned velocity fields. If the reduction relies on additional structure beyond the standard flow-matching objective, this should be stated explicitly as a modeling assumption, because the ODE marginalization otherwise retains off-diagonal covariance terms.
  Authors: We appreciate the referee highlighting the need for clarity. The derivation is exact for any velocity field obtained from the standard flow-matching objective on the linear interpolant; no further structure is imposed. Because the interpolant is linear, the conditional distributions along the trajectory yield a posterior covariance whose off-diagonal contributions are eliminated by the change of variables and the definition of the learned velocity as the conditional expectation, leaving only the divergence (trace) as the closed-form scalar expression. Per-pixel uncertainty maps are obtained by evaluating the relevant diagonal contributions component-wise. We have partially revised Section 3 to state these modeling choices explicitly while preserving the claim of exactness under the standard flow-matching setup.
  Revision: partial
Circularity Check
Derivation extends external Tweedie's formula to flow-matching interpolant without self-definition or fitted-input reduction
Full rationale
The paper's central result is obtained by extending Tweedie's formula (an external identity from the denoising literature) to the flow-matching interpolant x_t = (1-t)x_0 + t x_1, yielding posterior covariance expressed solely in terms of the divergence of the already-trained velocity field v_t. This quantity is computed post-hoc on a pre-trained model and is not a newly fitted parameter or a self-defined quantity. No equations in the abstract or described derivation reduce the claimed covariance back to the input velocity field by construction, nor do they rely on load-bearing self-citations, uniqueness theorems from the same authors, or smuggled ansatzes. The derivation remains self-contained once the validity of the Tweedie extension is granted; any doubt about missing Jacobian terms concerns correctness rather than circularity.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Tweedie's formula extends exactly to the flow matching interpolant.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "By extending Tweedie's formula ... the result depends on a single quantity, namely the divergence of the learned velocity field"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.