
arxiv: 1907.01736 · v2 · pith:H4KCUGPPnew · submitted 2019-07-03 · 📊 stat.ME · stat.AP · stat.CO

A Bayesian Semiparametric Gaussian Copula Approach to a Multivariate Normality Test

Pith reviewed 2026-05-25 10:22 UTC · model grok-4.3

classification 📊 stat.ME · stat.AP · stat.CO
keywords Bayesian test · semiparametric model · Gaussian copula · Dirichlet process · multivariate normality · relative belief ratio · energy distance

The pith

Placing a Dirichlet process prior on the marginal distributions and linking them with a Gaussian copula yields a Bayesian test for multivariate normality based on the relative belief ratio and the energy distance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper proposes a Bayesian semiparametric method for testing whether a multivariate dataset comes from a normal distribution. The approach models the unknown marginal distributions with a Dirichlet process prior and captures the dependence with a Gaussian copula. These elements are combined with the relative belief ratio and the energy distance to form the test. The authors report good performance in simulations and in an application to real data. Readers might care because the method offers a flexible way to check normality assumptions in statistical modeling without assuming specific marginal forms.

Core claim

A Bayesian semiparametric copula approach models the underlying multivariate distribution F_true by placing the Dirichlet process on the unknown marginal distributions and using a Gaussian copula model to capture the dependence structure. This leads to a Bayesian multivariate normality test that combines the relative belief ratio and the energy distance. The authors derive several theoretical results and report excellent performance in simulated examples and on a real data set.
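For readers unfamiliar with the relative belief ratio, its general definition (due to Evans) can be written as below. The symbols $\psi$ (the quantity of interest), $x$ (the observed data), and $\pi_\Psi$ (prior and posterior densities of $\psi$) are generic notation assumed here and may differ from the paper's own.

```latex
RB_\Psi(\psi \mid x) \;=\; \frac{\pi_\Psi(\psi \mid x)}{\pi_\Psi(\psi)}
```

A value above 1 means the data have increased belief in $\psi$ (evidence in favor); a value below 1 means belief has decreased (evidence against).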

What carries the argument

Dirichlet process on marginal distributions of F_true paired with Gaussian copula for dependence, tested via relative belief ratio and energy distance.
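The energy distance named above compares two distributions through expected pairwise distances: for independent X, X' from one distribution and Y, Y' from the other, it equals 2E|X−Y| − E|X−X'| − E|Y−Y'|. A minimal Python sketch of the plug-in (V-statistic) sample estimate follows. The function names and toy samples are illustrative assumptions, not the paper's code.

```python
import math
import random


def mean_pairwise_distance(xs, ys):
    """Average Euclidean distance over all pairs (x, y), x in xs and y in ys."""
    total = sum(math.dist(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def energy_distance(xs, ys):
    """Plug-in estimate of 2 E|X-Y| - E|X-X'| - E|Y-Y'|."""
    return (2.0 * mean_pairwise_distance(xs, ys)
            - mean_pairwise_distance(xs, xs)
            - mean_pairwise_distance(ys, ys))


# Toy data (assumed for illustration): two standard bivariate normal samples
# and one sample whose mean is shifted to (2, 2).
rng = random.Random(0)
normal_a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
normal_b = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
shifted = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(200)]

same = energy_distance(normal_a, normal_b)   # small: same distribution
diff = energy_distance(normal_a, shifted)    # larger: different distributions
print(f"same distribution: {same:.4f}, shifted mean: {diff:.4f}")
```

The estimate is zero when a sample is compared with itself and grows as the two samples separate, which is what makes it usable as a discrepancy measure in a goodness-of-fit setting.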

If this is right

  • The procedure provides a valid Bayesian test for multivariate normality.
  • Nonparametric learning of marginals is possible while maintaining a parametric dependence structure.
  • The test exhibits excellent performance in simulations and on real data.
  • Theoretical results support the validity of the approach.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This method could be adapted to test for other specific multivariate distributions by selecting appropriate copulas.
  • Performance in very high dimensions remains to be explored beyond the presented examples.
  • Combining with other copula families might allow testing more complex dependence structures.

Load-bearing premise

The dependence structure of the true distribution is adequately captured by the Gaussian copula, even when marginals are modeled nonparametrically.
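To make the premise concrete, the sketch below draws bivariate data whose dependence is a Gaussian copula while the marginals are not normal. Exponential marginals are used here purely as an assumed illustration. It also checks the known relation τ = (2/π) arcsin(ρ) between Kendall's τ and the copula correlation ρ. All function names are hypothetical, and this is not the paper's estimation procedure.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(n, rho, inverse_cdfs, rng):
    """Draw n bivariate points with Gaussian-copula dependence (correlation rho)
    and marginals defined by the two supplied inverse CDFs."""
    std_normal = NormalDist()
    points = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        u1, u2 = std_normal.cdf(z1), std_normal.cdf(z2)  # uniform scale
        points.append((inverse_cdfs[0](u1), inverse_cdfs[1](u2)))
    return points


def kendall_tau(points):
    """Kendall's tau from concordant and discordant pairs (ties ignored)."""
    concordant = discordant = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            s = (points[i][0] - points[j][0]) * (points[i][1] - points[j][1])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (concordant + discordant)


def exponential_inverse_cdf(u, rate=1.0):
    """Quantile function of an exponential distribution."""
    return -math.log1p(-u) / rate


rng = random.Random(1)
rho = 0.7
sample = sample_gaussian_copula(
    400, rho, (exponential_inverse_cdf, exponential_inverse_cdf), rng)
tau_hat = kendall_tau(sample)
tau_theory = 2.0 / math.pi * math.asin(rho)
print(f"empirical tau: {tau_hat:.3f}, Gaussian-copula value: {tau_theory:.3f}")
```

Because Kendall's τ depends only on ranks, it is unaffected by the choice of marginals, which is why rank-based quantities are a natural way to learn the copula correlation separately from the marginals.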

What would settle it

Simulate data from a multivariate normal distribution and verify whether the test consistently accepts the null hypothesis at the nominal level, or simulate data from a distribution with non-Gaussian dependence and check rejection rates.
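The second experiment needs data with normal marginals but non-Gaussian dependence. One textbook construction, assumed here for illustration and not taken from the paper, is Y = S·X with X standard normal and S an independent random sign. Both marginals are then standard normal, yet the pair is not bivariate normal because X + Y equals zero about half the time. The helper names below are hypothetical.

```python
import random
import statistics


def sample_sign_flip(n, rng):
    """Pairs (X, S*X): normal marginals, but not jointly normal."""
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        sign = 1.0 if rng.random() < 0.5 else -1.0
        pairs.append((x, sign * x))
    return pairs


def sample_independent_normal(n, rng):
    """Pairs of independent standard normals: a genuine bivariate normal."""
    return [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]


def fraction_zero_sum(pairs):
    """Share of pairs with X + Y exactly zero (impossible under joint normality
    unless Y = -X almost surely)."""
    return sum(1 for x, y in pairs if x + y == 0.0) / len(pairs)


rng = random.Random(2)
non_mvn = sample_sign_flip(2000, rng)
mvn = sample_independent_normal(2000, rng)

y_values = [y for _, y in non_mvn]
print("sign-flip Y mean/variance:",
      round(statistics.fmean(y_values), 3),
      round(statistics.pvariance(y_values), 3))
print("fraction with X + Y == 0:",
      fraction_zero_sum(non_mvn), "vs", fraction_zero_sum(mvn))
```

Feeding both kinds of samples to the test and comparing how often normality is rejected would show whether the procedure responds to non-Gaussian dependence or only to non-normal marginals.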

Figures

Figures reproduced from arXiv: 1907.01736 by Forough Fazeli Asl, Luai Al-Labadi, Zahra Saberi.

Figure 1: a: Boxplots of the Energy distance between …
Figure 2: Marginal densities of F_true = N_2(0_2, A_2) and its posterior-based model with n = 1000, based on Kendall's τ for a = 1. The R package rococo is used to estimate R∗ based on the Gaussian rank correlation coefficients.
Original abstract

In this paper, a Bayesian semiparametric copula approach is used to model the underlying multivariate distribution $F_{true}$. First, the Dirichlet process is constructed on the unknown marginal distributions of $F_{true}$. Then a Gaussian copula model is utilized to capture the dependence structure of $F_{true}$. As a result, a Bayesian multivariate normality test is developed by combining the relative belief ratio and the Energy distance. Several interesting theoretical results of the approach are derived. Finally, through several simulated examples and a real data set, the proposed approach reveals excellent performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript develops a Bayesian semiparametric model for the unknown joint distribution F_true by placing independent Dirichlet process priors on the univariate marginal distributions and linking them with a Gaussian copula to capture dependence. A test for multivariate normality is then constructed by combining the relative belief ratio with the energy distance. Several theoretical results are derived for the procedure, and its performance is illustrated on simulated examples and one real dataset, with claims of excellent results.

Significance. If the procedure were shown to control type-I error for the full multivariate normal hypothesis and to have power against general alternatives, the combination of nonparametric marginals with a copula-based dependence structure would provide a novel Bayesian goodness-of-fit test. The use of the relative belief ratio and energy distance is a reasonable choice within the model class, but the Gaussian-copula restriction limits the scope of any such guarantee.

major comments (2)
  1. [Abstract] Abstract and method description: the central claim that the procedure constitutes a valid test for multivariate normality is undermined by the modeling choice. The construction places independent DPs on the marginals and fixes a Gaussian copula for the joint; this family contains the MVN distributions (normal marginals plus Gaussian copula) but excludes all distributions whose copula is non-Gaussian. When data arise from normal marginals but a non-Gaussian copula, the model is misspecified and no argument is given that the relative-belief-ratio/energy-distance statistic still controls type-I error or possesses power against the correct alternative (any non-MVN). The theoretical results are derived inside the restricted model and therefore do not establish validity for the unrestricted MVN testing problem.
  2. [Abstract] Abstract: the statement that 'a Bayesian multivariate normality test is developed' is not supported by the subsequent construction. The test is derived under the maintained Gaussian-copula assumption; nothing in the provided description shows that the statistic reduces to a consistent test for the hypothesis that the copula is Gaussian and the marginals are normal simultaneously.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and for highlighting the need to clarify the scope of the proposed procedure. The comments correctly identify that the model maintains a Gaussian copula assumption. We address each point below and will revise the manuscript to make the assumptions and limitations explicit.

Point-by-point responses
  1. Referee: [Abstract] Abstract and method description: the central claim that the procedure constitutes a valid test for multivariate normality is undermined by the modeling choice. The construction places independent DPs on the marginals and fixes a Gaussian copula for the joint; this family contains the MVN distributions (normal marginals plus Gaussian copula) but excludes all distributions whose copula is non-Gaussian. When data arise from normal marginals but a non-Gaussian copula, the model is misspecified and no argument is given that the relative-belief-ratio/energy-distance statistic still controls type-I error or possesses power against the correct alternative (any non-MVN). The theoretical results are derived inside the restricted model and therefore do not establish validity for the unrestricted MVN testing problem.

    Authors: We agree that the model restricts attention to distributions with Gaussian copula dependence. The procedure tests whether the marginal distributions are normal under this maintained copula structure; the multivariate normal is included as the case of normal marginals. All theoretical results, including those on the relative belief ratio and energy distance, are derived conditional on the semiparametric Gaussian copula model being correctly specified. No claim is made that the procedure controls type-I error or has power when the true copula is non-Gaussian, as the model is then misspecified. We will revise the abstract and method description to state explicitly that the test operates under the Gaussian copula assumption and is not a general test for the unrestricted multivariate normality hypothesis. revision: yes

  2. Referee: [Abstract] Abstract: the statement that 'a Bayesian multivariate normality test is developed' is not supported by the subsequent construction. The test is derived under the maintained Gaussian-copula assumption; nothing in the provided description shows that the statistic reduces to a consistent test for the hypothesis that the copula is Gaussian and the marginals are normal simultaneously.

    Authors: The abstract statement refers to a test for multivariate normality developed within the semiparametric model that fixes the Gaussian copula and places Dirichlet process priors on the marginals. The relative belief ratio combined with energy distance is used to assess whether the marginals are normal given the assumed copula. The construction does not test the copula assumption itself, nor does it provide a joint test for both Gaussian copula and normal marginals. We will revise the abstract to clarify that the test is conducted under the maintained Gaussian copula and does not claim consistency against alternatives that violate this assumption. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper specifies an explicit modeling choice (Dirichlet process on marginals + Gaussian copula for dependence) and then applies the relative belief ratio together with energy distance to construct the test. No quoted equations or derivation steps reduce the test statistic, posterior quantities, or theoretical results to fitted inputs by construction, nor do they rely on self-citation chains that substitute for independent justification. The Gaussian copula is an openly stated modeling assumption rather than an ansatz smuggled via prior work or a uniqueness theorem imported from the authors. Performance claims rest on external simulations and data rather than internal tautologies. The derivation chain therefore remains self-contained against the listed circularity patterns.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The approach rests on the standard properties of the Dirichlet process for nonparametric marginal modeling and the Gaussian copula for dependence; no new entities are invented. Hyperparameters of the Dirichlet process and the copula correlation matrix are free parameters that must be chosen or given priors.
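The role of the concentration parameter can be seen in a truncated stick-breaking draw from a Dirichlet process. Stick-breaking is a standard construction chosen here for brevity; it is not necessarily the representation the paper uses. The base sampler, the truncation level, and all names are assumptions for illustration.

```python
import random


def stick_breaking_dp(concentration, base_sampler, truncation, rng):
    """Truncated stick-breaking draw: returns atoms and weights of one random
    discrete distribution from DP(concentration, base measure)."""
    atoms, weights = [], []
    remaining = 1.0
    for _ in range(truncation):
        v = rng.betavariate(1.0, concentration)  # fraction of the stick to break
        weights.append(remaining * v)
        atoms.append(base_sampler(rng))
        remaining *= 1.0 - v
    weights[-1] += remaining  # put leftover mass on the last atom
    return atoms, weights


def dp_cdf(atoms, weights, t):
    """CDF of the drawn random distribution evaluated at t."""
    return sum(w for a, w in zip(atoms, weights) if a <= t)


rng = random.Random(3)
# Standard normal base measure and concentration 5.0 are assumed values.
atoms, weights = stick_breaking_dp(5.0, lambda r: r.gauss(0.0, 1.0), 200, rng)
print("largest weight:", round(max(weights), 3))
print("random CDF at 0:", round(dp_cdf(atoms, weights, 0.0), 3))
```

Small concentration values put most mass on a few atoms, while large values spread mass thinly so that draws stay close to the base measure. This is the sense in which the parameter controls nonparametric flexibility.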

free parameters (2)
  • Dirichlet process concentration parameter
    Controls the degree of nonparametric flexibility in the marginal distributions; must be specified or given a hyperprior.
  • Gaussian copula correlation matrix
    Encodes the dependence structure; estimated or given a prior within the Bayesian model.
axioms (2)
  • standard math The Dirichlet process prior yields a valid random probability measure on the marginal distributions.
    Invoked when constructing the unknown marginals of F_true.
  • domain assumption The dependence structure of F_true is adequately represented by a Gaussian copula.
    Used to link the learned marginals into a joint distribution F_true.

pith-pipeline@v0.9.0 · 5631 in / 1548 out tokens · 23105 ms · 2026-05-25T10:22:48.413018+00:00 · methodology


Appendix A: Relevant notations

Table 7: Description of notations

  • c_2 := (c, c)^T, I_2 := [1 0; 0 1], A_2 := [1 0.2; 0.2 1], and B_2 := [0.25 0.2; 0.2 0.025] (matrix rows separated by semicolons)
  • E(λ): an exponential distribution with rate λ
  • t_r: a Student's t distribution with r degrees of freedom
  • B(α, β): a Beta distribution with first shape parameter α and second shape parameter β
  • χ_r: a chi-square distribution with r degrees of freedom
  • PVII(1, 1, r)⋆: a Pearson type VII (also known as Student's t) distribution with location parameter 1, scale parameter 1, and r degrees of freedom
  • F1 ⊗ F2: a bivariate distribution with two independent marginal distributions F1 and F2
  • t_r(0_2, I_2)†: a bivariate Student's t distribution with location parameter 0_2, scale parameter I_2, and r degrees of freedom
  • LN_2(0_2, B_2)‡: a bivariate lognormal distribution with mean vector 0_2 and covariance matrix B_2
  • 10.S2(LN(0, 0.25))†: a bivariate spherical distribution with lognormal distribution LN(0, 0.25) for radii
  • NMIX1†: 0.9 N_2(0_2, I_2) + 0.1 N_2(3_2, I_2)
  • NMIX2†: 0.9 N_2(0_2, A_2) + 0.1 N_2(0_2, I_2)

⋆ Required R package: PearsonDS. † Required R package: distrEllipse. ‡ Required R package: compositions.