Plausible GMM: A Quasi-Bayesian Approach
Pith reviewed 2026-05-19 06:59 UTC · model grok-4.3
The pith
A prior over the size of moment violations lets quasi-Bayesian inference deliver concentrated posteriors and coverage for structural parameters even when the moments fail to hold exactly.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
When researchers place a prior on the degree of misspecification in the moment conditions, the quasi-posterior concentrates on the structural parameters, supports approximately optimal Bayesian decision rules under that prior structure, and satisfies a form of frequentist coverage.
What carries the argument
The quasi-posterior obtained by weighting the GMM objective with a prior distribution placed on the size of the moment-condition violation.
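To make this concrete, here is a minimal sketch of how such a quasi-posterior could be evaluated on a grid. It uses the objective Q_T(θ, μ) = −T (m̂(θ) − μ)′ Ω̂_T(θ)⁻¹ (m̂(θ) − μ) quoted further down this page. Everything else is an assumption made for illustration, not the paper's implementation: a scalar linear IV moment z·(y − x·θ), a flat prior on θ, a mean-zero Gaussian prior on μ, the sample variance of the moment function as Ω̂_T(θ), and a `scale` factor of 1/2 on Q_T in the exponent. The function names are hypothetical.

```python
import numpy as np

def q_T(theta, mu, y, x, z):
    """Q_T(theta, mu) = -T * (m_hat(theta) - mu)^2 / Omega_hat(theta), one moment.

    Assumption: the moment function is g_t = z_t * (y_t - x_t * theta) and
    Omega_hat is its sample variance.
    """
    g = z * (y - x * theta)
    omega_hat = np.var(g)
    return -g.size * (g.mean() - mu) ** 2 / omega_hat

def quasi_posterior_grid(y, x, z, thetas, mus, prior_sd_mu, scale=0.5):
    """Normalized quasi-posterior weights on a (theta, mu) grid.

    Assumptions: flat prior on theta, N(0, prior_sd_mu^2) prior on mu,
    and kernel exp(scale * Q_T) times the prior density of mu.
    """
    log_post = np.empty((thetas.size, mus.size))
    for i, theta in enumerate(thetas):
        for j, mu in enumerate(mus):
            log_prior_mu = -0.5 * (mu / prior_sd_mu) ** 2
            log_post[i, j] = scale * q_T(theta, mu, y, x, z) + log_prior_mu
    # Subtract the maximum before exponentiating for numerical stability.
    post = np.exp(log_post - log_post.max())
    return post / post.sum()
```

Summing the returned array over the μ axis (`post.sum(axis=1)`) gives grid weights for the marginal quasi-posterior of θ. A grid is only practical in very low dimensions; larger models would need a sampler.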
If this is right
- Quasi-posteriors concentrate around the true structural parameters when the maintained prior on misspecification is correct.
- Quasi-posteriors support approximately optimal Bayesian decision rules under the chosen prior structure over misspecification.
- The procedure delivers frequentist coverage results for the structural parameters.
- Informative inference remains possible while allowing substantial departures from exact moment restrictions.
Where Pith is reading between the lines
- The same prior-based relaxation could be explored for other moment-based or extremum estimators beyond GMM.
- Applied researchers could report results across several priors to show how conclusions change with different beliefs about misspecification size.
- Computational methods for drawing from the quasi-posterior could be developed for larger models where analytic forms are unavailable.
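The suggestion to report results across several priors can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: a scalar just-identified linear IV model, a flat prior on β, a N(0, τ²) prior on the moment violation μ, Ω̂ held fixed at the IV estimate, and a factor of 1/2 on Q_T in the exponent. Under those assumptions μ integrates out analytically and the marginal quasi-posterior of β is normal, centered at the IV estimate with variance (Ω̂/T + τ²)/s_zx², where s_zx is the sample mean of z·x. The names `beta_interval` and `sensitivity_table` are hypothetical.

```python
import numpy as np

def beta_interval(y, x, z, prior_sd_mu, crit=1.96):
    """Credible interval for beta from the normal marginal quasi-posterior.

    Assumptions: flat prior on beta, N(0, prior_sd_mu^2) prior on mu,
    Omega_hat evaluated at the IV estimate and treated as fixed.
    """
    T = y.size
    s_zx = np.mean(z * x)
    beta_hat = np.mean(z * y) / s_zx
    omega_hat = np.var(z * (y - x * beta_hat))
    half_width = crit * np.sqrt(omega_hat / T + prior_sd_mu ** 2) / abs(s_zx)
    return beta_hat - half_width, beta_hat + half_width

def sensitivity_table(y, x, z, prior_sds):
    """One (prior sd, lower, upper) row per prior standard deviation."""
    return [(sd, *beta_interval(y, x, z, sd)) for sd in prior_sds]

if __name__ == "__main__":
    # Synthetic data, for illustration only.
    rng = np.random.default_rng(1)
    z = rng.standard_normal(500)
    x = z + rng.standard_normal(500)
    y = x + rng.standard_normal(500)
    for sd, lo, hi in sensitivity_table(y, x, z, [0.0, 0.05, 0.1, 0.2]):
        print(f"prior sd on mu = {sd:.2f}: [{lo:.3f}, {hi:.3f}]")
```

Setting the prior standard deviation to zero recovers the usual interval that treats the moment as exactly valid, so the table shows directly how much each additional allowance for misspecification widens the interval.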
Load-bearing premise
Researchers can credibly specify a prior distribution that accurately captures their beliefs about the potential degree of misspecification in the moment conditions.
What would settle it
Monte Carlo experiments that draw the true degree of misspecification from the researcher's prior, generate data accordingly, and then check whether the resulting quasi-posterior intervals attain the stated frequentist coverage rates.
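A small version of such an experiment might look like the sketch below. The design is an assumption chosen for tractability and is not taken from the paper: a scalar linear IV model y = β₀x + γz + u with a standard normal instrument, so that the moment violation E[z(y − β₀x)] equals γ, and γ is drawn afresh from N(0, τ²) in each replication. The interval uses the closed-form normal marginal quasi-posterior that results from a flat prior on β, a N(0, τ²) prior on μ, a fixed plug-in Ω̂, and a factor of 1/2 on Q_T. All function names are hypothetical.

```python
import numpy as np

def simulate_iv(rng, T, beta0, gamma, first_stage=1.0, rho=0.5):
    """Draw one sample; gamma is the direct effect of z on y (the moment violation)."""
    z = rng.standard_normal(T)
    u = rng.standard_normal(T)
    # v is correlated with u, which makes x endogenous.
    v = rho * u + np.sqrt(1 - rho ** 2) * rng.standard_normal(T)
    x = first_stage * z + v
    y = beta0 * x + gamma * z + u
    return y, x, z

def credible_interval(y, x, z, prior_sd_mu, crit=1.96):
    """Interval from the assumed normal marginal quasi-posterior of beta."""
    T = y.size
    s_zx = np.mean(z * x)
    beta_hat = np.mean(z * y) / s_zx
    omega_hat = np.var(z * (y - x * beta_hat))
    sd = np.sqrt(omega_hat / T + prior_sd_mu ** 2) / abs(s_zx)
    return beta_hat - crit * sd, beta_hat + crit * sd

def coverage(prior_sd_dgp, prior_sd_used, T=500, reps=500, beta0=1.0, seed=0):
    """Share of replications whose interval contains beta0.

    The violation is drawn from N(0, prior_sd_dgp^2) in each replication;
    the interval is built with prior standard deviation prior_sd_used.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        gamma = rng.normal(0.0, prior_sd_dgp)
        y, x, z = simulate_iv(rng, T, beta0, gamma)
        lo, hi = credible_interval(y, x, z, prior_sd_used)
        hits += int(lo <= beta0 <= hi)
    return hits / reps

if __name__ == "__main__":
    print("prior matches DGP:     ", coverage(0.1, 0.1))
    print("misspecification ignored:", coverage(0.1, 0.0))
```

Comparing `coverage(0.1, 0.1)` with `coverage(0.1, 0.0)` contrasts intervals built with the prior that generated the violations against intervals that assume exact moment validity. Coverage here is averaged over draws of the violation from the prior, which is the sense of coverage the proposed check targets; it says nothing about coverage at a fixed violation outside the prior's effective support.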
Original abstract
Structural estimation in economics often makes use of models formulated in terms of moment conditions. While these moment conditions are generally well-motivated, it is often unknown whether the moment restrictions hold exactly. We consider a framework where researchers model their belief about the potential degree of misspecification via a prior distribution and adopt a quasi-Bayesian approach for performing inference on structural parameters. We provide quasi-posterior concentration results, verify that quasi-posteriors can be used to obtain approximately optimal Bayesian decision rules under the maintained prior structure over misspecification, and provide a form of frequentist coverage results. We illustrate the approach through empirical examples where we obtain informative inference for structural objects allowing for substantial relaxations of the requirement that moment conditions hold exactly.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a quasi-Bayesian framework for GMM estimation in which researchers place a prior on the degree of misspecification in the moment conditions. It derives quasi-posterior concentration results around a pseudo-true parameter, shows that the resulting quasi-posteriors yield approximately optimal Bayesian decision rules under the maintained prior, and establishes a form of frequentist coverage. The approach is illustrated with empirical examples that relax the requirement of exact moment validity while still producing informative inference on structural parameters.
Significance. If the concentration, optimality, and coverage results are rigorously established, the paper offers a principled way to conduct inference in misspecified moment-condition models, a setting that is common in structural econometrics. Explicitly modeling beliefs about misspecification via a prior bridges Bayesian decision theory with frequentist guarantees and could be useful for applied work where exact validity of moments is implausible. The empirical illustrations help demonstrate practical relevance.
major comments (2)
- [theoretical results on quasi-posterior concentration] The quasi-posterior concentration results (abstract and theoretical sections) require that the prior over the misspecification parameter places positive mass in neighborhoods of the true misspecification value. If the researcher's prior support excludes or heavily downweights the actual degree of misspecification, concentration at the relevant pseudo-true parameter need not occur, which would undermine the subsequent claims on approximately optimal decision rules and frequentist coverage. This assumption should be stated explicitly as a theorem hypothesis and its implications for robustness discussed.
- [frequentist coverage results] The frequentist coverage result is described only as 'a form of' coverage. It is unclear whether this coverage is uniform over the parameter space or holds conditionally on the prior support condition; if the latter, the result is more limited than standard frequentist statements and should be qualified accordingly in the relevant theorem.
minor comments (2)
- [abstract and introduction] Clarify the precise definition of the quasi-posterior and the loss function used for the optimality result.
- [empirical illustrations] In the empirical examples, report the specific functional form and hyperparameters of the prior on misspecification and include sensitivity checks to prior choice.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major comment below and indicate the revisions we will make.
- Referee: [theoretical results on quasi-posterior concentration] The quasi-posterior concentration results (abstract and theoretical sections) require that the prior over the misspecification parameter places positive mass in neighborhoods of the true misspecification value. If the researcher's prior support excludes or heavily downweights the actual degree of misspecification, concentration at the relevant pseudo-true parameter need not occur, which would undermine the subsequent claims on approximately optimal decision rules and frequentist coverage. This assumption should be stated explicitly as a theorem hypothesis and its implications for robustness discussed.
  Authors: We agree that the quasi-posterior concentration result requires the prior on the misspecification parameter to place positive mass in neighborhoods of the true value. This is a standard condition for consistency in quasi-Bayesian settings with misspecification. In the revision we will state the condition explicitly as a hypothesis in the relevant theorem and add a discussion of its implications, including the consequences for decision rules and coverage when the prior support is misspecified relative to the true degree of misspecification.
  Revision: yes
- Referee: [frequentist coverage results] The frequentist coverage result is described only as 'a form of' coverage. It is unclear whether this coverage is uniform over the parameter space or holds conditionally on the prior support condition; if the latter, the result is more limited than standard frequentist statements and should be qualified accordingly in the relevant theorem.
  Authors: We appreciate the request for clarification. The coverage property we establish is conditional on the prior support condition for the misspecification parameter. We will revise the theorem statement and surrounding discussion to make this qualification explicit and to note that the result is not claimed to be uniform over the full parameter space.
  Revision: yes
Circularity Check
No circularity: quasi-posterior results derived from standard Bayesian asymptotics on misspecified moments
The paper constructs a quasi-Bayesian posterior by placing a prior on the degree of moment misspecification and then derives concentration, approximate decision-rule optimality, and frequentist coverage properties from this setup. These steps rely on standard large-sample arguments for quasi-likelihoods with an added prior layer; none of the reported theorems reduce by construction to a fitted parameter renamed as a prediction, a self-definitional loop, or a load-bearing self-citation whose content is merely restated. The framework therefore supplies independent analytic content relative to its inputs and external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Moment conditions hold approximately rather than exactly, with the degree of approximation governed by a researcher-specified prior.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  We consider a framework where researchers model their belief about the potential degree of misspecification via a prior distribution and adopt a quasi-Bayesian approach... $Q_T(\theta, \mu) = -T\,(\hat{m}(\theta) - \mu)'\,\hat{\Omega}_T(\theta)^{-1}\,(\hat{m}(\theta) - \mu)$
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  We provide quasi-posterior concentration results... Bernstein–von Mises type theorems
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Paper details (from the supplementary appendix title page): Plausible GMM: A Quasi-Bayesian Approach, Supplementary Appendix. Victor Chernozhukov (Department of Economics, Massachusetts Institute of Technology), Christian B. Hansen (Booth School of Business, University of Chicago), Lingwei Kong (Faculty of Economics and Business, University of Groningen), Weining Wang (Department of Economics, Un...). September 17, 2025.