
arXiv: 1907.09987 · v1 · pith:2MJKXJQNnew · submitted 2019-07-22 · 📊 stat.ML · cs.LG · eess.IV · physics.comp-ph

Bayesian Inference with Generative Adversarial Network Priors

Pith reviewed 2026-05-24 18:16 UTC · model grok-4.3

classification 📊 stat.ML · cs.LG · eess.IV · physics.comp-ph
keywords Bayesian inference · Generative Adversarial Networks · prior distribution · uncertainty quantification · inverse problems · high-dimensional fields · heat conduction

The pith

A trained GAN generator can serve as an approximate prior for Bayesian inference on high-dimensional fields with complex distributions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Bayesian inference struggles when the field to infer has a high-dimensional discrete representation or when its prior distribution lacks a simple mathematical description. The paper trains a GAN on many samples of the field so that the generator learns to produce new samples from an approximation to that distribution. The generator then takes the place of the usual explicit prior in the Bayesian update that incorporates noisy measurements. The approach is demonstrated by recovering an initial temperature field from a noisy temperature measurement taken at a later time in a heat conduction model. A sympathetic reader would care because the method removes the need to write down an explicit prior while still delivering posterior samples for uncertainty estimates.

Core claim

The paper shows that once a GAN is trained on samples of a field, its generator can be used directly as the prior in a Bayesian inference procedure. The generator approximates the distribution of the field of interest and allows the update to be performed even when the field's discrete representation is high-dimensional and the prior is complex, as illustrated in the heat conduction example where the initial temperature is inferred from later noisy temperature data.

What carries the argument

The argument rests on the GAN generator, which maps a low-dimensional latent vector with iid components to samples from an approximation of the distribution of the high-dimensional field of interest, and which serves as the prior in the Bayesian update.
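
One way to write down the role of the generator is the following sketch. It assumes a standard normal latent density p_Z, additive Gaussian measurement noise with variance σ², a generator g, and a forward map f; these symbols are illustrative and may differ from the paper's own notation.

```latex
\begin{aligned}
p(z \mid y) &\propto p(y \mid z)\, p_Z(z)
  \propto \exp\!\Big(-\tfrac{1}{2\sigma^2}\,\lVert y - f(g(z))\rVert^2\Big)\,
          \exp\!\Big(-\tfrac{1}{2}\,\lVert z\rVert^2\Big), \\
\mathbb{E}\big[\ell(x) \mid y\big] &\approx \frac{1}{N}\sum_{i=1}^{N} \ell\big(g(z_i)\big),
  \qquad z_i \sim p(z \mid y).
\end{aligned}
```

Under these assumptions, sampling happens in the low-dimensional latent space, and any statistic ℓ of the field is obtained by pushing the latent samples through g.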

Load-bearing premise

The samples generated by the trained GAN are distributed closely enough to the true prior that the Bayesian posterior computed with them remains a good approximation to the true posterior.

What would settle it

Compare the posterior mean and variance obtained using the GAN prior against those from an exact Bayesian update on a test problem where the true prior distribution is known and easy to sample from.
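
A minimal sketch of such a check, under strong simplifying assumptions that are not taken from the paper: the "generator" is a linear map `A` (so the prior is Gaussian and the exact posterior has a closed form), the forward operator `H` is a random matrix, and a perturbed matrix `A_bad` stands in for an imperfectly trained generator. All names and sizes are hypothetical.

```python
import numpy as np

def field_space_posterior(C, H, y, sigma):
    """Exact Gaussian posterior for x ~ N(0, C), y = H x + N(0, sigma^2 I)."""
    S = H @ C @ H.T + sigma**2 * np.eye(H.shape[0])
    K = np.linalg.solve(S, H @ C).T          # K = C H^T S^{-1} (S, C symmetric)
    return K @ y, C - K @ H @ C

def latent_space_posterior(A, H, y, sigma):
    """Posterior for x = A z with z ~ N(0, I), pushed forward to field space."""
    G = H @ A
    cov_z = np.linalg.inv(np.eye(A.shape[1]) + G.T @ G / sigma**2)
    mean_z = cov_z @ G.T @ y / sigma**2
    return A @ mean_z, A @ cov_z @ A.T

rng = np.random.default_rng(0)
n_field, n_latent, n_obs, sigma = 40, 5, 15, 0.1   # hypothetical sizes

A_true = rng.standard_normal((n_field, n_latent))  # defines the true prior N(0, A A^T)
H = rng.standard_normal((n_obs, n_field))          # toy linear forward operator
x_true = A_true @ rng.standard_normal(n_latent)
y = H @ x_true + sigma * rng.standard_normal(n_obs)

# Reference: exact update with the true prior covariance.
mean_ref, cov_ref = field_space_posterior(A_true @ A_true.T, H, y, sigma)

# "Perfect generator": the latent-space update reproduces the reference.
mean_ok, cov_ok = latent_space_posterior(A_true, H, y, sigma)

# "Imperfect generator": a perturbed map gives a different posterior.
A_bad = A_true + 0.3 * rng.standard_normal(A_true.shape)
mean_bad, cov_bad = latent_space_posterior(A_bad, H, y, sigma)

err_ok = np.linalg.norm(mean_ok - mean_ref)
err_bad = np.linalg.norm(mean_bad - mean_ref)
print(f"posterior-mean error, exact generator:     {err_ok:.2e}")
print(f"posterior-mean error, perturbed generator: {err_bad:.2e}")
```

The linear case only shows the shape of the test. For a real GAN the latent posterior is not Gaussian, so the latent-space side would have to be sampled (for example with MCMC) rather than computed in closed form.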

Figures

Figures reproduced from arXiv: 1907.09987 by Assad A Oberai, Dhruv Patel.

Figure 1: Schematic diagram of a GAN. The other component of a GAN is a discriminator, which is also composed of successive non-linear transformations. However, these transformations are designed to down-scale the original input. The final few layers of the discriminator are fully connected neural networks which lead to a simple classifier (like a soft-max, for example). The discriminator maps an input field, x, to …
Figure 2: Eigenvalues of the discretized forward operator.
Figure 3: Sample images from set S used to train the GAN. We train a Wasserstein GAN (WGAN) with gradient penalty term on the set S to create a generator to produce synthetic images of the initial temperature field. The detailed architecture of the generator (g) and the discriminator (d) are shown in Appendix A. The generator consists of 3 residual blocks (see [59]) and 4 convolutional layers and the discriminator c…
Figure 4: Sample images produced by trained GAN. Next we generate the target field that we wish to infer and the corresponding measurement. As shown in Figure 5a, this comprises a square patch with edge = L/2 centered on the center of the total domain. This field is passed through the forward map to generate the noise-free version of the measured field, which is shown in Figure 5b. Thereafter, iid Gaussian noise wi…
Figure 5: The target field and the measurement. Once the generator of the GAN is trained and the measured field has been computed, we apply the algorithms developed in the previous section to probe the posterior distribution. We first use these to determine the MAP estimate for the posterior distribution of the latent vector (denoted by z_map). In order to evaluate this estimate we use a gradient-based algorithm (B…
Figure 6: Comparison of the true field with the inferred fields.
Figure 7: MCMC estimate of point-wise standard deviation.
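
The eigenvalue plot named in the Figure 2 caption is not reproduced here. As a rough illustration of why such a spectrum matters, the sketch below computes the eigenvalues of a heat-type forward operator for a 1D finite-difference Laplacian with zero boundary values. The grid size, the product of conductivity and time (`kappa_t`), and the 1D setting are all assumptions made for brevity and are not the paper's discretization.

```python
import numpy as np

n = 64            # number of interior grid points (hypothetical)
kappa_t = 0.01    # conductivity times elapsed time (hypothetical)
h = 1.0 / (n + 1)

# Symmetric positive-definite finite-difference Laplacian on the unit interval.
lap = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

# eigvalsh returns the Laplacian eigenvalues in ascending order.
lap_eigs = np.linalg.eigvalsh(lap)

# The forward operator exp(-kappa_t * lap) shares the eigenvectors of lap,
# so its eigenvalues are exp(-kappa_t * lambda), in descending order.
forward_eigs = np.exp(-kappa_t * lap_eigs)

print("largest eigenvalue: ", forward_eigs[0])
print("smallest eigenvalue:", forward_eigs[-1])
```

The rapid decay means that fine-scale components of the initial field are almost erased by the forward map, which is why the prior carries so much of the weight in this kind of inverse problem.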
Original abstract

Bayesian inference is used extensively to infer and to quantify the uncertainty in a field of interest from a measurement of a related field when the two are linked by a physical model. Despite its many applications, Bayesian inference faces challenges when inferring fields that have discrete representations of large dimension, and/or have prior distributions that are difficult to represent mathematically. In this manuscript we consider the use of Generative Adversarial Networks (GANs) in addressing these challenges. A GAN is a type of deep neural network equipped with the ability to learn the distribution implied by multiple samples of a given field. Once trained on these samples, the generator component of a GAN maps the iid components of a low-dimensional latent vector to an approximation of the distribution of the field of interest. In this work we demonstrate how this approximate distribution may be used as a prior in a Bayesian update, and how it addresses the challenges associated with characterizing complex prior distributions and the large dimension of the inferred field. We demonstrate the efficacy of this approach by applying it to the problem of inferring and quantifying uncertainty in the initial temperature field in a heat conduction problem from a noisy measurement of the temperature at later time.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript proposes using a trained GAN generator as an approximate prior for Bayesian inference on high-dimensional fields with complex distributions. The generator maps low-dimensional latent vectors to field samples; Bayesian updating is performed via MCMC sampling in latent space. The approach is demonstrated on inferring the initial temperature field in a 2D heat conduction problem from noisy temperature measurements at a later time, yielding plausible posterior fields and uncertainty estimates.

Significance. If the GAN distribution is sufficiently close to the true prior, the method addresses a practical barrier in Bayesian inference by replacing hand-specified priors with data-driven ones while reducing the effective dimension of the sampling problem. The use of standard MCMC in latent space (rather than custom samplers) and the reporting of plausible posterior fields are concrete strengths that support feasibility for similar inverse problems.

minor comments (3)
  1. The abstract states that the GAN 'addresses the challenges associated with characterizing complex prior distributions,' but the manuscript should explicitly note (e.g., in the discussion or conclusions) that this holds only to the extent that the trained generator matches the empirical distribution of the training samples; no quantitative distance metric between GAN samples and held-out prior samples is reported.
  2. Notation for the latent-space posterior (p(z | data)) versus the induced field posterior should be clarified in §3 or §4 to avoid ambiguity when the generator is non-invertible.
  3. Figure captions for the heat-conduction results should include the number of MCMC samples retained after burn-in and the acceptance rate, as these directly affect the reliability of the reported posterior means and variances.
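
The quantities requested in comments 2 and 3 can be illustrated with a short sketch: a random-walk Metropolis chain over the latent vector, with the retained sample count and acceptance rate reported, and the field posterior obtained by pushing latent samples through the generator. Everything here is a stand-in chosen for illustration: `generator` is a fixed random two-layer map rather than a trained GAN, the forward map is a 1D heat smoother, and the latent prior is assumed standard normal. The sampler is plain Metropolis and is not claimed to be the one used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_field, n_latent, sigma = 32, 2, 0.05   # hypothetical sizes and noise level

# Hypothetical stand-in for a trained generator g: latent vector -> field.
W1 = rng.standard_normal((16, n_latent))
W2 = rng.standard_normal((n_field, 16)) / 4.0

def generator(z):
    return np.tanh(W2 @ np.tanh(W1 @ z))

# Toy forward map exp(-kappa_t * lap) built from a 1D Laplacian (kappa_t = 1e-3).
h = 1.0 / (n_field + 1)
lap = (2.0 * np.eye(n_field) - np.eye(n_field, k=1) - np.eye(n_field, k=-1)) / h**2
lam, V = np.linalg.eigh(lap)
forward = (V * np.exp(-1e-3 * lam)) @ V.T

# Synthetic measurement from a known latent vector.
z_true = rng.standard_normal(n_latent)
y = forward @ generator(z_true) + sigma * rng.standard_normal(n_field)

def log_post(z):
    """Unnormalized log p(z | y): Gaussian likelihood plus N(0, I) latent prior."""
    resid = y - forward @ generator(z)
    return -0.5 * resid @ resid / sigma**2 - 0.5 * z @ z

def metropolis(log_p, z0, n_steps, step, rng):
    """Random-walk Metropolis; returns the chain and per-step accept flags."""
    z, lp = z0.copy(), log_p(z0)
    chain = np.empty((n_steps, z0.size))
    accepted = np.zeros(n_steps, dtype=bool)
    for i in range(n_steps):
        prop = z + step * rng.standard_normal(z.size)
        lp_prop = log_p(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            z, lp = prop, lp_prop
            accepted[i] = True
        chain[i] = z
    return chain, accepted

n_steps, burn_in = 4000, 1000
chain, accepted = metropolis(log_post, np.zeros(n_latent), n_steps, 0.05, rng)

kept = chain[burn_in:]                           # latent-space posterior samples
acc_rate = accepted[burn_in:].mean()             # acceptance rate after burn-in
fields = np.array([generator(z) for z in kept])  # induced field-space samples
post_mean, post_std = fields.mean(axis=0), fields.std(axis=0)

print(f"retained {len(kept)} samples, acceptance rate {acc_rate:.2f}")
```

The distinction raised in comment 2 shows up in the last lines: `kept` holds samples of p(z | data), while `fields` holds samples of the induced field posterior, and the point-wise mean and standard deviation are computed on the latter.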

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our work and the recommendation of minor revision. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The manuscript trains a GAN generator on external samples to approximate a complex high-dimensional prior, then performs standard MCMC sampling in the latent space of that fixed generator to obtain the posterior. No equation in the provided text defines a quantity in terms of itself, renames a fitted parameter as a prediction, or relies on a self-citation chain for a uniqueness claim. The Bayesian update step uses the trained generator as an external black-box map; the heat-conduction demonstration reports posterior statistics without claiming that any derived quantity is forced by the training data alone. The derivation chain therefore remains independent of its own inputs.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on the domain assumption that a trained GAN can serve as a usable prior and on standard assumptions about neural network training and Bayesian updating.

free parameters (1)
  • GAN latent dimension
    Chosen to control the generator mapping; value not stated in abstract.
axioms (2)
  • domain assumption: A GAN trained on samples can approximate the distribution of the field of interest sufficiently well for use as a prior.
    Invoked when the abstract states that the generator maps latent vectors to an approximation of the distribution.
  • standard math: The standard Bayesian update remains valid when the prior is replaced by samples from a generator network.
    Implicit in the claim that the GAN distribution may be used as a prior in a Bayesian update.

pith-pipeline@v0.9.0 · 5739 in / 1247 out tokens · 19706 ms · 2026-05-24T18:16:16.715784+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Tag legend
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

65 extracted references · 65 canonical work pages · 2 internal anchors

  1. [1] J. Kaipio, E. Somersalo, Statistical and computational inverse problems, volume 160, Springer Science & Business Media, 2006

  2. [2] M. Dashti, A. M. Stuart, The Bayesian approach to inverse problems, Handbook of Uncertainty Quantification (2016) 1–118

  3. [3] A. Polpo, J. Stern, F. Louzada, R. Izbicki, H. Takada (Eds.), Bayesian Inference and Maximum Entropy Methods in Science and Engineering, volume 239 of Springer Proceedings in Mathematics & Statistics, Springer International Publishing, Cham, 2018

  4. [4] W. P. Gouveia, J. A. Scales, Resolution of seismic waveform inversion: Bayes versus Occam, Inverse Problems 13 (1997) 323–349

  5. [5] A. Malinverno, Parsimonious Bayesian Markov chain Monte Carlo inversion in a nonlinear geophysical problem, Geophysical Journal International 151 (2002) 675–688

  6. [6] J. Martin, L. C. Wilcox, C. Burstedde, O. Ghattas, A Stochastic Newton MCMC Method for Large-Scale Statistical Inverse Problems with Application to Seismic Inversion, SIAM Journal on Scientific Computing 34 (2012) A1460–A1487

  7. [7] T. Isaac, N. Petra, G. Stadler, O. Ghattas, Scalable and efficient algorithms for the propagation of uncertainty from data through inference to prediction for large-scale problems, with application to flow of the Antarctic ice sheet, Journal of Computational Physics 296 (2015) 348–368

  8. [8] C. Jackson, M. K. Sen, P. L. Stoffa, An Efficient Stochastic Bayesian Approach to Optimal Parameter and Uncertainty Estimation for Climate Model Predictions, Journal of Climate 17 (2004) 2828–2841

  9. [9] H. N. Najm, B. J. Debusschere, Y. M. Marzouk, S. Widmer, O. P. Le Maître, Uncertainty quantification in chemical systems, International Journal for Numerical Methods in Engineering 80 (2009) 789–814

  10. [10] J. Wang, N. Zabaras, Hierarchical Bayesian models for inverse problems in heat conduction, Inverse Problems 21 (2004) 183–206

  11. [11] T. J. Loredo, From Laplace to Supernova SN 1987A: Bayesian Inference in Astrophysics, in: Maximum Entropy and Bayesian Methods, Springer Netherlands, Dordrecht, 1990, pp. 81–142

  12. [12] A. Asensio Ramos, M. J. Martínez González, J. A. Rubiño-Martín, Bayesian inversion of Stokes profiles, Astronomy & Astrophysics 476 (2007) 959–970

  13. [13] T. J. Sabin, C. A. L. Bailer-Jones, P. J. Withers, Accelerated learning using Gaussian process models to predict static recrystallization in an Al-Mg alloy, Modelling and Simulation in Materials Science and Engineering 8 (2000) 687–706

  14. [14] S. Siltanen, V. Kolehmainen, S. Järvenpää, J. P. Kaipio, P. Koistinen, M. Lassas, J. Pirttilä, E. Somersalo, Statistical inversion for medical x-ray tomography with few radiographs: I. General theory, Physics in Medicine and Biology 48 (2003) 1437–1463

  15. [15] V. Kolehmainen, A. Vanne, S. Siltanen, S. Jarvenpaa, J. Kaipio, M. Lassas, M. Kalke, Parallelized Bayesian inversion for three-dimensional dental X-ray imaging, IEEE Transactions on Medical Imaging 25 (2006) 218–228

  16. [16] A. Tarantola, Inverse problem theory and methods for model parameter estimation, volume 89, SIAM, 2005

  17. [17] L. Fahrmeir, S. Lang, Bayesian inference for generalized additive mixed models based on Markov random field priors, Journal of the Royal Statistical Society: Series C (Applied Statistics) 50 (2001) 201–220

  18. [18] A. M. Stuart, Inverse problems: A Bayesian perspective, Acta Numerica 19 (2010) 451–559

  19. [19] Y. M. Marzouk, H. N. Najm, Dimensionality reduction and polynomial chaos acceleration of Bayesian inference in inverse problems, Journal of Computational Physics 228 (2009) 1862–1902

  20. [20] D. Calvetti, E. Somersalo, Hypermodels in the Bayesian imaging framework, Inverse Problems 24 (2008) 034013

  21. [21] T. Bui-Thanh, C. Burstedde, O. Ghattas, J. Martin, G. Stadler, L. C. Wilcox, Extreme-scale UQ for Bayesian inverse problems governed by PDEs, in: 2012 International Conference for High Performance Computing, Networking, Storage and Analysis, IEEE, 2012, pp. 1–11

  22. [22] C. Han, B. P. Carlin, Markov chain Monte Carlo methods for computing Bayes factors: A comparative review, Journal of the American Statistical Association 96 (2001) 1122–1132

  23. [23] M. D. Parno, Y. M. Marzouk, Transport map accelerated Markov chain Monte Carlo, SIAM/ASA Journal on Uncertainty Quantification 6 (2018) 645–682

  24. [24] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial nets, in: Advances in Neural Information Processing Systems, pp. 2672–2680

  25. [25] M. Arjovsky, S. Chintala, L. Bottou, Wasserstein GAN (2017)

  26. [26] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, A. C. Courville, Improved training of Wasserstein GANs, in: Advances in Neural Information Processing Systems, pp. 5767–5777

  27. [27] A. Makhzani, J. Shlens, N. Jaitly, I. Goodfellow, B. Frey, Adversarial Autoencoders (2015)

  28. [28] V. Dumoulin, I. Belghazi, B. Poole, O. Mastropietro, A. Lamb, M. Arjovsky, A. Courville, Adversarially Learned Inference (2016)

  29. [29] L. Mescheder, S. Nowozin, A. Geiger, Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks (2017)

  30. [30] A. Brock, J. Donahue, K. Simonyan, Large Scale GAN Training for High Fidelity Natural Image Synthesis (2018)

  31. [31] T. Karras, T. Aila, S. Laine, J. Lehtinen, Progressive Growing of GANs for Improved Quality, Stability, and Variation (2017)

  32. [32] W. Fedus, I. Goodfellow, A. M. Dai, MaskGAN: Better text generation via filling in the ______, arXiv preprint arXiv:1801.07736 (2018)

  33. [33] S. Tulyakov, M.-Y. Liu, X. Yang, J. Kautz, MoCoGAN: Decomposing motion and content for video generation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1526–1535

  34. [34] L. Ma, X. Jia, Q. Sun, B. Schiele, T. Tuytelaars, L. Van Gool, Pose Guided Person Image Generation (2017)

  35. [35] T.-C. Wang, M.-Y. Liu, J.-Y. Zhu, A. Tao, J. Kautz, B. Catanzaro, High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs (2017)

  36. [36] M. Vauhkonen, J. P. Kaipio, E. Somersalo, P. A. Karjalainen, Electrical impedance tomography with basis constraints, Inverse Problems 13 (1997) 523–530

  37. [37] D. Calvetti, E. Somersalo, Priorconditioners for linear systems, Inverse Problems 21 (2005) 1397–1418

  38. [38] C. Lieberman, K. Willcox, O. Ghattas, Parameter and State Model Reduction for Large-Scale Statistical Inverse Problems, SIAM Journal on Scientific Computing 32 (2010) 2523–2542

  39. [39] J. Adler, O. Öktem, Solving ill-posed inverse problems using iterative deep neural networks, Inverse Problems 33 (2017) 124007

  40. [40] K. H. Jin, M. T. McCann, E. Froustey, M. Unser, Deep Convolutional Neural Network for Inverse Problems in Imaging, IEEE Transactions on Image Processing 26 (2017) 4509–4522

  41. [41] D. Patel, R. Tibrewala, A. Vega, L. Dong, N. Hugenberg, A. A. Oberai, Circumventing the solution of inverse problems in mechanics through deep learning: Application to elasticity imaging, Computer Methods in Applied Mechanics and Engineering 353 (2019) 448–466

  42. [42] J. H. R. Chang, C.-L. Li, B. Póczos, B. V. K. V. Kumar, A. C. Sankaranarayanan, One Network to Solve Them All: Solving Linear Inverse Problems using Deep Projection Models, Technical Report, ????

  43. [43] O. Kupyn, V. Budzan, M. Mykhailych, D. Mishkin, J. Matas, DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks, 2018

  44. [44] Q. Yang, P. Yan, Y. Zhang, H. Yu, Y. Shi, X. Mou, M. K. Kalra, Y. Zhang, L. Sun, G. Wang, Low-Dose CT Image Denoising Using a Generative Adversarial Network With Wasserstein Distance and Perceptual Loss, IEEE Transactions on Medical Imaging 37 (2018) 1348–1357

  45. [45] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, W. Shi, Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, Technical Report, ????

  46. [46] R. Anirudh, J. J. Thiagarajan, B. Kailkhura, T. Bremer, An Unsupervised Approach to Solving Inverse Problems using Generative Adversarial Networks (2018)

  47. [47] P. Isola, J.-Y. Zhu, T. Zhou, A. A. Efros, Image-to-image translation with conditional adversarial networks, arXiv (2016)

  48. [48] J.-Y. Zhu, T. Park, P. Isola, A. A. Efros, Unpaired image-to-image translation using cycle-consistent adversarial networks, in: Computer Vision (ICCV), 2017 IEEE International Conference on

  49. [49] T. Kim, M. Cha, H. Kim, J. K. Lee, J. Kim, Learning to Discover Cross-Domain Relations with Generative Adversarial Networks (2017)

  50. [50] S. Lunz, O. Öktem, C.-B. Schönlieb, Adversarial regularizers in inverse problems, in: Advances in Neural Information Processing Systems, pp. 8507–8516

  51. [51] A. Bora, A. Jalal, E. Price, A. G. Dimakis, Compressed sensing using generative models, in: Proceedings of the 34th International Conference on Machine Learning, Volume 70, JMLR.org, pp. 537–546

  52. [52] A. Bora, E. Price, A. G. Dimakis, AmbientGAN: Generative models from lossy measurements, ICLR 2 (2018) 5

  53. [53] M. Kabkab, P. Samangouei, R. Chellappa, Task-aware compressed sensing with generative adversarial networks, in: Thirty-Second AAAI Conference on Artificial Intelligence

  54. [54] Y. Wu, M. Rosca, T. Lillicrap, Deep compressed sensing, arXiv preprint arXiv:1905.06723 (2019)

  55. [55] V. Shah, C. Hegde, Solving linear inverse problems using GAN priors: An algorithm with provable guarantees, in: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 4609–4613

  56. [56] J. Adler, O. Öktem, Deep Bayesian inversion, arXiv preprint arXiv:1811.05910 (2018)

  57. [57] Y. Gal, Uncertainty in deep learning, Ph.D. thesis, University of Cambridge, 2016

  58. [58] Y. Gal, Z. Ghahramani, Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning (2015)

  59. [59] K. He, X. Zhang, S. Ren, J. Sun, Deep Residual Learning for Image Recognition (2015)

  60. [60] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanho...

  61. [61] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, E. Teller, Equation of State Calculations by Fast Computing Machines, The Journal of Chemical Physics 21 (1953) 1087–1092

  62. [62] W. K. Hastings, Monte Carlo Sampling Methods Using Markov Chains and Their Applications, Biometrika 57 (1970) 97

  63. [63] Y. F. Atchadé, An Adaptive Version for the Metropolis Adjusted Langevin Algorithm with a Truncated Drift, Methodology and Computing in Applied Probability 8 (2006) 235–254

  64. [64] M. D. Hoffman, A. Gelman, The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo, Technical Report, 2014

  65. [65] S. Brooks, A. Gelman, G. Jones, X.-L. Meng, R. M. Neal, MCMC using Hamiltonian dynamics, Technical Report, 2012

Appendix A. Architecture details

The architecture of the generator component of the GAN is shown in Figure A.8, and the architecture of the discriminator is shown in Figure A.9. Some notes regarding the nomenclature used in these figures: • C...