pith. sign in

arxiv: 2104.03436 · v2 · submitted 2021-04-08 · 🧮 math.ST · stat.ME· stat.TH

Synthetic likelihood in misspecified models

Pith reviewed 2026-05-24 13:22 UTC · model grok-4.3

classification 🧮 math.ST stat.MEstat.TH
keywords synthetic likelihoodmodel misspecificationBayesian inferenceposterior distributionrobust inferencemultimodalityasymptotic normality
0
0 comments X

The pith

Bayesian synthetic likelihood posteriors can become multimodal or asymptotically non-Gaussian under model misspecification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines the Bayesian synthetic likelihood posterior when the assumed model differs from the true data generating process. It demonstrates that this posterior exhibits a range of non-standard behaviors, such as multimodality and failure to be asymptotically Gaussian, with the specific behavior depending on the degree of misspecification. These findings matter because synthetic likelihood is commonly applied to complex models where direct likelihood evaluation is impossible. The work shows that likelihood tempering, a standard robustness technique, does not succeed here, while recently proposed robust synthetic likelihood methods can produce reliable posterior inference even under misspecification. All illustrations rely on a simple running example.

Core claim

When the model is misspecified, the Bayesian synthetic likelihood posterior can display multimodality and asymptotic non-Gaussianity. Likelihood tempering fails for synthetic likelihood, but recently proposed robust synthetic likelihood approaches can ameliorate this behavior and deliver reliable posterior inference under model misspecification.

What carries the argument

The Bayesian synthetic likelihood posterior, constructed by approximating the likelihood via summaries of simulated data from the assumed model.

If this is right

  • The posterior may fail to concentrate around any single value as data volume grows.
  • Standard likelihood tempering cannot be relied upon to restore robustness.
  • Robust synthetic likelihood variants become necessary for stable inference.
  • Posterior behavior varies systematically with the level of misspecification.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Practitioners may need to inspect synthetic likelihood posteriors for multiple modes before trusting point estimates.
  • The results suggest that simulation-based inference methods require misspecification diagnostics tailored to the summary statistics used.
  • Similar non-standard behavior could appear in other simulation-based approaches that rely on matching simulated and observed summaries.

Load-bearing premise

The non-standard behaviors observed in the simple running example generalize to the complex models where synthetic likelihood is typically used.

What would settle it

Apply Bayesian synthetic likelihood to a complex simulation model with controlled levels of misspecification and check whether the posterior becomes multimodal or loses asymptotic normality as the misspecification increases.

Figures

Figures reproduced from arXiv: 2104.03436 by Christopher Drovandi, David J. Nott, David T. Frazier.

Figure 1
Figure 1. Figure 1: BSL posteriors for θ in the misspecified MA(1) model across fifty replicated data sets. In the remainder of this paper, we elaborate on the above behavior and formally characterize the asymptotic behavior of the BSL posterior when the model generating the simulated data is misspecified. The remainder of the paper is organized as follows. In Section two, we discuss the relevant concept of model misspecifica… view at source ↗
Figure 2
Figure 2. Figure 2: Comparison of “exact” synthetic likelihood posterior under different levels of model misspec [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Tempered BSL Posteriors for θ in the misspecified MA(1) model across ten replicated data sets. 4.2 Robustifying BSL As the running example has concretely illustrated, in cases where the model is significantly misspeci￾fied, and due to the nature of ln gn(Sn|θ), the inference problem can become ill-posed: the population nonlinear SL score equations ∂ ∂θ kΣ(θ) −1/2 {b(θ) − b0}k2 = 0, can exhibit multiple sol… view at source ↗
Figure 4
Figure 4. Figure 4: r-BSL Posteriors for θ in the misspecified MA(1) model across six different levels of model misspecification. The solid line corresponds to n = 100, the dashed line to n = 500 and the dotted line to n = 1000. 4.2.2 A Robust Adjustment Approach While Frazier and Drovandi (2021) demonstrate that the r-BSL approach delivers reliable inference even in highly-misspecified models, it requires conducting posterio… view at source ↗
Figure 5
Figure 5. Figure 5: Estimated BSL posterior densities for the summary statistic vector [PITH_FULL_IMAGE:figures/full_fig_p021_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Estimated BSL and adjusted BSL posterior densities for the summary statistic vector [PITH_FULL_IMAGE:figures/full_fig_p022_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Posterior predictive distribution of the summary statistics when applying standard and robust [PITH_FULL_IMAGE:figures/full_fig_p023_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Estimated posterior distributions for the components of [PITH_FULL_IMAGE:figures/full_fig_p024_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Estimated posterior distributions for the components of [PITH_FULL_IMAGE:figures/full_fig_p024_9.png] view at source ↗
Figure 11
Figure 11. Figure 11: Kernel estimates of posterior predictive densities for bootstrap estimated variances for [PITH_FULL_IMAGE:figures/full_fig_p042_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: This figure includes the same information as in Figure [PITH_FULL_IMAGE:figures/full_fig_p042_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: Estimated BSL posterior densities for the summary statistic vector [PITH_FULL_IMAGE:figures/full_fig_p043_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: This figure includes the same information as in Figure [PITH_FULL_IMAGE:figures/full_fig_p043_14.png] view at source ↗
read the original abstract

Bayesian synthetic likelihood is a widely used approach for conducting Bayesian analysis in complex models where evaluation of the likelihood is infeasible but simulation from the assumed model is tractable. We analyze the behaviour of the Bayesian synthetic likelihood posterior when the assumed model differs from the actual data generating process. We demonstrate that the Bayesian synthetic likelihood posterior can display a wide range of non-standard behaviours depending on the level of model misspecification, including multimodality and asymptotic non-Gaussianity. Our results suggest that likelihood tempering, a common approach for robust Bayesian inference, fails for synthetic likelihood whilst recently proposed robust synthetic likelihood approaches can ameliorate this behavior and deliver reliable posterior inference under model misspecification. All results are illustrated using a simple running example.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper analyzes the Bayesian synthetic likelihood (SL) posterior under model misspecification. It claims that this posterior can exhibit a range of non-standard behaviors, including multimodality and asymptotic non-Gaussianity, depending on the degree of misspecification. Likelihood tempering is shown to fail in this setting, while recently proposed robust SL methods succeed in delivering reliable posterior inference. All claims and illustrations are based on a single simple running example.

Significance. If the observed pathologies generalize beyond the simple example, the work would be significant for highlighting risks in standard SL and tempering under misspecification and for supporting robust alternatives in simulation-based inference. The paper does not provide machine-checked proofs or reproducible code for the claims.

major comments (2)
  1. [Abstract / running example] Abstract and running-example section: the central claims (multimodality, asymptotic non-Gaussianity, failure of tempering, success of robust SL) are demonstrated exclusively in a simple low-dimensional model where misspecification can be varied directly. No theorem, high-dimensional numerical example, or standard SL application (e.g., g-and-k or SDE) is provided to show that these behaviors persist when summary statistics are high-dimensional or the simulator is complex and only partially misspecified.
  2. [Abstract] The weakest-assumption paragraph and the abstract together indicate that the paper assumes behaviors observed in the simple example generalize to the complex models that motivate SL; this assumption is load-bearing for the practical implications but is not tested or proved.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive report. We address the major comments point by point below. Our response acknowledges the illustrative nature of the running example while clarifying the paper's scope and intent.

read point-by-point responses
  1. Referee: [Abstract / running example] Abstract and running-example section: the central claims (multimodality, asymptotic non-Gaussianity, failure of tempering, success of robust SL) are demonstrated exclusively in a simple low-dimensional model where misspecification can be varied directly. No theorem, high-dimensional numerical example, or standard SL application (e.g., g-and-k or SDE) is provided to show that these behaviors persist when summary statistics are high-dimensional or the simulator is complex and only partially misspecified.

    Authors: We agree that all demonstrations are confined to the simple low-dimensional running example. This choice was deliberate to permit direct and transparent variation of the misspecification level, enabling clear illustration of the resulting posterior pathologies without confounding factors. The manuscript does not contain a general theorem because its contribution is to identify and exhibit these non-standard behaviors rather than to establish their universality. We will add a dedicated discussion paragraph on the potential extension of these phenomena to higher-dimensional or partially misspecified settings, though we do not claim to resolve that extension here. revision: partial

  2. Referee: [Abstract] The weakest-assumption paragraph and the abstract together indicate that the paper assumes behaviors observed in the simple example generalize to the complex models that motivate SL; this assumption is load-bearing for the practical implications but is not tested or proved.

    Authors: The abstract and the weakest-assumption paragraph are written to indicate that the pathologies can arise under misspecification, thereby motivating caution with standard SL and tempering. We do not assert or prove that the behaviors always generalize. We will revise both the abstract and the relevant paragraph to state more explicitly that the results are obtained in a controlled illustrative example and that further study in complex simulators is needed to assess prevalence. revision: yes

Circularity Check

0 steps flagged

No circularity; results are direct simulation in running example

full rationale

The paper states that all results are illustrated using a simple running example and contains no claimed general derivation, theorem, or prediction that reduces by construction to fitted inputs, self-citations, or ansatzes. The demonstrations of multimodality and non-Gaussianity are obtained by direct computation on the example under controlled misspecification levels; no step equates a 'prediction' to a quantity defined from the same data or prior author work. Self-citations, if present for robust SL methods, are not load-bearing for the core observations. The chain is therefore self-contained as an illustrative study.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract only; no free parameters, axioms, or invented entities are identifiable from the provided text.

pith-pipeline@v0.9.0 · 5650 in / 1107 out tokens · 24016 ms · 2026-05-24T13:22:28.247663+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages · 1 internal anchor

  1. [1]

    Allingham, D., King, R., and Mengersen, K. (2009). B ayesian estimation of quantile distributions. Statistics and Computing , 19:189--201

  2. [2]

    J., and Drovandi, C

    An, Z., Nott, D. J., and Drovandi, C. (2020). Robust B ayesian synthetic likelihood via a semi-parametric approach. Statistics and Computing , 30(3):543--557

  3. [3]

    BSL: An R Package for Efficient Parameter Estimation for Simulation-Based Models via Bayesian Synthetic Likelihood

    An, Z., South, L. F., and Drovandi, C. (2019). BSL : An R package for efficient parameter estimation for simulation-based models via B ayesian synthetic likelihood. arXiv preprint arXiv:1907.10940

  4. [4]

    Bhattacharya, A., Pati, D., Yang, Y., et al. (2019). B ayesian fractional posteriors. The Annals of Statistics , 47(1):39--66

  5. [5]

    G., Holmes, C

    Bissiri, P. G., Holmes, C. C., and Walker, S. G. (2016). A general framework for updating belief distributions. Journal of the Royal Statistical Society. Series B, Statistical methodology , 78(5):1103

  6. [6]

    Chen, C.-F. (1985). On asymptotic normality of limiting density functions with B ayesian implications. Journal of the Royal Statistical Society: Series B (Methodological) , 47(3):540--546

  7. [7]

    and Graves, S

    Croissant, Y. and Graves, S. (2020). Ecdat: Data Sets for Econometrics . R package version 0.3-7

  8. [8]

    De Gooijer, J. (1981). An investigation of the moments of the sample autocovariances and autocorrelations for general arma processes. Journal of Statistical Computation and Simulation , 12(3-4):175--192

  9. [9]

    Drovandi, C. C. and Pettitt, A. N. (2011). Likelihood-free B ayesian estimation of multivariate quantile distributions. 55(9):2541?2556

  10. [10]

    Frazier, D. T. and Drovandi, C. (2021). Robust approximate B ayesian inference with synthetic likelihood. Journal of Computational and Graphical Statistics , pages 1--39

  11. [11]

    T., Drovandi, C., and Loaiza-Maya, R

    Frazier, D. T., Drovandi, C., and Loaiza-Maya, R. (2020a). Robust approximate B ayesian computation: An adjustment approach. arXiv preprint arXiv:2008.04099

  12. [12]

    T., Nott, D

    Frazier, D. T., Nott, D. J., Drovandi, C., and Kohn, R. (2021). B ayesian inference using synthetic likelihood: asymptotics and adjustments. arXiv preprint arXiv:1902.04827

  13. [13]

    T., Robert, C

    Frazier, D. T., Robert, C. P., and Rousseau, J. (2020b). Model misspecification in approximate B ayesian computation: consequences and diagnostics. Journal of the Royal Statistical Society: Series B (Statistical Methodology)

  14. [14]

    Gr \"u nwald, P., Van Ommen, T., et al. (2017). Inconsistency of B ayesian inference for misspecified linear models, and a proposal for repairing it. B ayesian Analysis , 12(4):1069--1103

  15. [15]

    A., MacGillivray, H., and Mengersen, K

    Haynes, M. A., MacGillivray, H., and Mengersen, K. (1997). Robustness of ranking and selection rules using generalised g-and-k distributions. Journal of Statistical Planning and Inference , 65(1):45--66

  16. [16]

    Huggins, J. H. and Miller, J. W. (2019). Using bagged posteriors for robust inference and model criticism. arXiv preprint arXiv:1912.07104

  17. [17]

    and van der Vaart, A

    Kleijn, B. and van der Vaart, A. (2012). The B ernstein-von- M ises theorem under misspecification. Electron. J. Statist. , 6:354--381

  18. [18]

    S., Robert, C

    Marin, J.-M., Pillai, N. S., Robert, C. P., and Rousseau, J. (2014). Relevant statistics for B ayesian model choice. Journal of the Royal Statistical Society: Series B: Statistical Methodology , pages 833--859

  19. [19]

    P., and Ryder, R

    Marin, J.-M., Pudlo, P., Robert, C. P., and Ryder, R. J. (2012). Approximate B ayesian computational methods. Statistics and Computing , 22(6):1167--1180

  20. [20]

    Miller, J. W. and Dunson, D. B. (2019). Robust B ayesian inference via coarsening. Journal of the American Statistical Association , 114(527):1113--1125

  21. [21]

    M\" u ller, U. K. (2013). Risk of B ayesian inference in misspecified models, and the sandwich covariance matrix. Econometrica , 81(5):1805--1849

  22. [22]

    Prangle, D. (2020). gk: An R Package for the g-and-k and generalized g-and-h distributions . The R Journal , 12(1):7--20

  23. [23]

    F., Drovandi, C

    Price, L. F., Drovandi, C. C., Lee, A., and Nott, D. J. (2018). B ayesian synthetic likelihood. Journal of Computational and Graphical Statistics , 27(1):1--11

  24. [24]

    W., Sisson, S

    Priddle, J. W., Sisson, S. A., Frazier, D. T., and Drovandi, C. (2019). Efficient B ayesian synthetic likelihood with whitening transformations. arXiv preprint arXiv:1909.04857

  25. [25]

    A., Fan, Y., and Beaumont, M

    Sisson, S. A., Fan, Y., and Beaumont, M. (2018). Handbook of Approximate B ayesian Computation . Chapman and Hall/CRC, New York

  26. [26]

    Wood, S. N. (2010). Statistical inference for noisy nonlinear ecological dynamic systems. Nature , 466(7310):1102--1104

  27. [27]

    and Jennrich, R

    Yuan, K.-H. and Jennrich, R. I. (1998). Asymptotics of estimating equations under natural conditions. Journal of Multivariate Analysis , 65(2):245--260