A Bayesian Semiparametric Gaussian Copula Approach to a Multivariate Normality Test
Pith reviewed 2026-05-25 10:22 UTC · model grok-4.3
The pith
Placing Dirichlet process priors on the marginal distributions and linking them with a Gaussian copula yields a Bayesian test for multivariate normality based on the relative belief ratio and the energy distance.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A Bayesian semiparametric copula approach models the underlying multivariate distribution F_true by placing a Dirichlet process on each unknown marginal distribution and using a Gaussian copula to capture the dependence structure. This leads to a Bayesian multivariate normality test that combines the relative belief ratio with the energy distance. The paper derives several theoretical results and reports excellent performance on simulated examples and a real data set.
What carries the argument
A Dirichlet process on each marginal distribution of F_true, paired with a Gaussian copula for dependence and tested via the relative belief ratio and the energy distance.
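For orientation, the three ingredients named here have the following standard textbook forms. This is a sketch in assumed notation, not the paper's own: $R$ (the copula correlation matrix), $a_j$ and $H_j$ (concentration parameter and base measure of the $j$-th Dirichlet process), and $\psi$ (the quantity whose prior and posterior are compared) are symbols introduced here for illustration.

```latex
\begin{align*}
F_{true}(x_1,\dots,x_d) &= \Phi_R\bigl(\Phi^{-1}(F_1(x_1)),\dots,\Phi^{-1}(F_d(x_d))\bigr),
  \qquad F_j \sim \mathrm{DP}(a_j, H_j), \\
\mathcal{E}(F,G) &= 2\,E\lVert X-Y\rVert - E\lVert X-X'\rVert - E\lVert Y-Y'\rVert,
  \qquad X,X' \sim F,\; Y,Y' \sim G, \\
RB(\psi \mid x) &= \frac{\pi(\psi \mid x)}{\pi(\psi)}.
\end{align*}
```

The energy distance is zero exactly when $F = G$ (given finite first moments), and a relative belief ratio above 1 counts as evidence in favor of the value $\psi$. How the paper combines the two into a single decision rule is not spelled out on this page.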
If this is right
- The procedure provides a valid Bayesian test for multivariate normality.
- Nonparametric learning of marginals is possible while maintaining a parametric dependence structure.
- The test performs well in the paper's simulated examples and on one real data set.
- Theoretical results support the validity of the approach.
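The split between free-form marginals and a parametric dependence structure can be illustrated with a small sampler. The sketch below is a minimal illustration, not the paper's code: it draws from a bivariate Gaussian copula and pushes the uniforms through two different marginal quantile functions. The correlation `rho`, the exponential second marginal, and all function names are assumptions chosen for the example.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal used for the copula transform
EPS = 1e-12                # keeps uniforms strictly inside (0, 1)


def _clamp(u):
    """Guard against cdf values of exactly 0 or 1 before inverting."""
    return min(max(u, EPS), 1.0 - EPS)


def sample_gaussian_copula_pair(rho, rng):
    """Draw (u1, u2) from a bivariate Gaussian copula with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return _clamp(STD_NORMAL.cdf(z1)), _clamp(STD_NORMAL.cdf(z2))


def exponential_quantile(u, rate):
    """Inverse CDF of an exponential distribution with the given rate."""
    return -math.log1p(-u) / rate


def sample_joint(n, rho, rate, rng):
    """n points: marginal 1 is N(0, 1), marginal 2 is exponential(rate)."""
    points = []
    for _ in range(n):
        u1, u2 = sample_gaussian_copula_pair(rho, rng)
        points.append((STD_NORMAL.inv_cdf(u1), exponential_quantile(u2, rate)))
    return points


rng = random.Random(0)
sample = sample_joint(500, rho=0.6, rate=2.0, rng=rng)
```

Swapping `exponential_quantile` for any other quantile function changes the second marginal without touching the dependence structure, which is the separation the semiparametric model relies on.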
Where Pith is reading between the lines
- This method could be adapted to test for other specific multivariate distributions by selecting appropriate copulas.
- Performance in very high dimensions remains to be explored beyond the presented examples.
- Combining with other copula families might allow testing more complex dependence structures.
Load-bearing premise
The dependence structure of the true distribution is adequately captured by the Gaussian copula, even when marginals are modeled nonparametrically.
What would settle it
Simulate data from a multivariate normal distribution and check that the test retains the null hypothesis at the expected rate; then simulate data from a distribution with non-Gaussian dependence, such as normal marginals joined by a non-Gaussian copula, and check the rejection rates.
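A simulation check of this kind needs two pieces: a distance between samples and a loop that counts rejections. The sketch below is a rough outline under stated assumptions, not the paper's procedure: `energy_distance` is the plain sample (V-statistic) energy distance, and `rejection_rate` is a hypothetical harness whose `rejects` argument stands in for the relative-belief test, which is not implemented here.

```python
import math
import random


def mean_pairwise_distance(xs, ys):
    """Average Euclidean distance over all pairs (x, y), x in xs, y in ys."""
    total = 0.0
    for x in xs:
        for y in ys:
            total += math.dist(x, y)
    return total / (len(xs) * len(ys))


def energy_distance(xs, ys):
    """Sample (V-statistic) energy distance between two point clouds."""
    return (2.0 * mean_pairwise_distance(xs, ys)
            - mean_pairwise_distance(xs, xs)
            - mean_pairwise_distance(ys, ys))


def bivariate_normal(n, rho, rng):
    """n draws from a bivariate normal, unit variances, correlation rho."""
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        out.append((z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2))
    return out


def rejection_rate(simulate, rejects, n_reps):
    """Fraction of simulated data sets for which rejects(data) is true.

    `rejects` is a placeholder for whatever test is being evaluated.
    """
    return sum(bool(rejects(simulate())) for _ in range(n_reps)) / n_reps


rng = random.Random(1)
data = bivariate_normal(100, 0.5, rng)
reference = bivariate_normal(100, 0.5, rng)
d_null = energy_distance(data, reference)  # both samples share one law
d_self = energy_distance(data, data)       # identical samples give zero
```

To run the check described above, `simulate` would generate data under the null or under a chosen alternative, and `rejects` would wrap the test being studied; comparing the two resulting rates gives empirical size and power.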
Original abstract
In this paper, a Bayesian semiparametric copula approach is used to model the underlying multivariate distribution $F_{true}$. First, the Dirichlet process is constructed on the unknown marginal distributions of $F_{true}$. Then a Gaussian copula model is utilized to capture the dependence structure of $F_{true}$. As a result, a Bayesian multivariate normality test is developed by combining the relative belief ratio and the Energy distance. Several interesting theoretical results of the approach are derived. Finally, through several simulated examples and a real data set, the proposed approach reveals excellent performance.
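To make the first modeling step concrete, a random marginal distribution can be drawn from a Dirichlet process by truncated stick-breaking. This is a minimal sketch that swaps in the standard stick-breaking construction; the paper may use a different approximation. The concentration value, the N(0, 1) base measure, the truncation level, and the function names are all assumptions made for illustration.

```python
import random


def stick_breaking_dp(concentration, base_sampler, n_atoms, rng):
    """Truncated stick-breaking draw from DP(concentration, base measure).

    Returns (atoms, weights). The last weight absorbs the leftover stick,
    so the weights sum to one.
    """
    atoms, weights = [], []
    remaining = 1.0
    for _ in range(n_atoms - 1):
        v = rng.betavariate(1.0, concentration)  # stick fraction to break off
        weights.append(remaining * v)
        remaining *= 1.0 - v
        atoms.append(base_sampler(rng))
    weights.append(remaining)
    atoms.append(base_sampler(rng))
    return atoms, weights


def random_cdf(atoms, weights, x):
    """Evaluate the discrete random CDF at x."""
    return sum(w for a, w in zip(atoms, weights) if a <= x)


rng = random.Random(2)
atoms, weights = stick_breaking_dp(
    concentration=5.0,                         # assumed value
    base_sampler=lambda r: r.gauss(0.0, 1.0),  # assumed base measure: N(0, 1)
    n_atoms=200,                               # assumed truncation level
    rng=rng,
)
```

Each call yields one random step-function CDF via `random_cdf`. A larger concentration keeps draws closer to the base measure, while a smaller one concentrates mass on a few atoms.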
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a Bayesian semiparametric model for the unknown joint distribution F_true by placing independent Dirichlet process priors on the univariate marginal distributions and linking them with a Gaussian copula to capture dependence. A test for multivariate normality is then constructed by combining the relative belief ratio with the energy distance. Several theoretical results are derived for the procedure, and its performance is illustrated on simulated examples and one real dataset, with claims of excellent results.
Significance. If the procedure were shown to control type-I error for the full multivariate normal hypothesis and to have power against general alternatives, the combination of nonparametric marginals with a copula-based dependence structure would provide a novel Bayesian goodness-of-fit test. The use of the relative belief ratio and energy distance is a reasonable choice within the model class, but the Gaussian-copula restriction limits the scope of any such guarantee.
major comments (2)
- [Abstract] Abstract and method description: the central claim that the procedure constitutes a valid test for multivariate normality is undermined by the modeling choice. The construction places independent DPs on the marginals and fixes a Gaussian copula for the joint; this family contains the MVN distributions (normal marginals plus Gaussian copula) but excludes all distributions whose copula is non-Gaussian. When data arise from normal marginals but a non-Gaussian copula, the model is misspecified and no argument is given that the relative-belief-ratio/energy-distance statistic still controls type-I error or possesses power against the correct alternative (any non-MVN). The theoretical results are derived inside the restricted model and therefore do not establish validity for the unrestricted MVN testing problem.
- [Abstract] Abstract: the statement that 'a Bayesian multivariate normality test is developed' is not supported by the subsequent construction. The test is derived under the maintained Gaussian-copula assumption; nothing in the provided description shows that the statistic reduces to a consistent test for the hypothesis that the copula is Gaussian and the marginals are normal simultaneously.
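The referee's scenario of normal marginals joined by a non-Gaussian copula is easy to produce. The sketch below uses a classic textbook construction, a random sign flip; it is an illustration of the objection, not an example taken from the paper, and the function and variable names are hypothetical.

```python
import random


def sign_flip_pair(rng):
    """Return (x, y) with x ~ N(0, 1) and y = s * x, s a fair random sign.

    By symmetry y is also standard normal, yet (x, y) is not bivariate
    normal: all mass sits on the two lines y = x and y = -x.
    """
    x = rng.gauss(0.0, 1.0)
    s = 1.0 if rng.random() < 0.5 else -1.0
    return x, s * x


rng = random.Random(3)
pairs = [sign_flip_pair(rng) for _ in range(1000)]

# Under a bivariate normal law, x + y is normal or constant, so the event
# x + y == 0 has probability 0 or 1. Here it has probability one half.
share_zero_sum = sum(1 for x, y in pairs if x + y == 0.0) / len(pairs)
share_on_diagonal = sum(1 for x, y in pairs if x == y) / len(pairs)
```

Data of this kind satisfy the marginal part of the multivariate normal hypothesis while violating the joint part, which is the case the referee argues falls outside the maintained model.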
Simulated Author's Rebuttal
We thank the referee for the careful reading and for highlighting the need to clarify the scope of the proposed procedure. The comments correctly identify that the model maintains a Gaussian copula assumption. We address each point below and will revise the manuscript to make the assumptions and limitations explicit.
Referee: [Abstract] Abstract and method description: the central claim that the procedure constitutes a valid test for multivariate normality is undermined by the modeling choice. The construction places independent DPs on the marginals and fixes a Gaussian copula for the joint; this family contains the MVN distributions (normal marginals plus Gaussian copula) but excludes all distributions whose copula is non-Gaussian. When data arise from normal marginals but a non-Gaussian copula, the model is misspecified and no argument is given that the relative-belief-ratio/energy-distance statistic still controls type-I error or possesses power against the correct alternative (any non-MVN). The theoretical results are derived inside the restricted model and therefore do not establish validity for the unrestricted MVN testing problem.
Authors: We agree that the model restricts attention to distributions with Gaussian copula dependence. The procedure tests whether the marginal distributions are normal under this maintained copula structure; the multivariate normal is included as the case of normal marginals. All theoretical results, including those on the relative belief ratio and energy distance, are derived conditional on the semiparametric Gaussian copula model being correctly specified. No claim is made that the procedure controls type-I error or has power when the true copula is non-Gaussian, as the model is then misspecified. We will revise the abstract and method description to state explicitly that the test operates under the Gaussian copula assumption and is not a general test for the unrestricted multivariate normality hypothesis. revision: yes
Referee: [Abstract] Abstract: the statement that 'a Bayesian multivariate normality test is developed' is not supported by the subsequent construction. The test is derived under the maintained Gaussian-copula assumption; nothing in the provided description shows that the statistic reduces to a consistent test for the hypothesis that the copula is Gaussian and the marginals are normal simultaneously.
Authors: The abstract statement refers to a test for multivariate normality developed within the semiparametric model that fixes the Gaussian copula and places Dirichlet process priors on the marginals. The relative belief ratio combined with energy distance is used to assess whether the marginals are normal given the assumed copula. The construction does not test the copula assumption itself, nor does it provide a joint test for both Gaussian copula and normal marginals. We will revise the abstract to clarify that the test is conducted under the maintained Gaussian copula and does not claim consistency against alternatives that violate this assumption. revision: yes
Circularity Check
No significant circularity detected
The paper specifies an explicit modeling choice (Dirichlet process on marginals + Gaussian copula for dependence) and then applies the relative belief ratio together with energy distance to construct the test. No quoted equations or derivation steps reduce the test statistic, posterior quantities, or theoretical results to fitted inputs by construction, nor do they rely on self-citation chains that substitute for independent justification. The Gaussian copula is an openly stated modeling assumption rather than an ansatz smuggled via prior work or a uniqueness theorem imported from the authors. Performance claims rest on external simulations and data rather than internal tautologies. The derivation chain therefore remains self-contained against the listed circularity patterns.
Axiom & Free-Parameter Ledger
free parameters (2)
- Dirichlet process concentration parameter
- Gaussian copula correlation matrix
axioms (2)
- standard math: The Dirichlet process prior yields a valid random probability measure on the marginal distributions.
- domain assumption: The dependence structure of F_true is assumed to be representable by a Gaussian copula.