pith. machine review for the scientific record. sign in

arxiv: 2605.13710 · v1 · submitted 2026-05-13 · 🧮 math.ST · stat.TH

Recognition: unknown

Pattern-based tests for two-dimensional copulas

Authors on Pith no claims yet

Pith reviewed 2026-05-14 17:36 UTC · model grok-4.3

classification 🧮 math.ST stat.TH
keywords copulapermutonpattern frequencyfunctional central limit theoremgoodness-of-fit testnonparametric testrank plotbootstrap
0
0 comments X

The pith

A functional central limit theorem for pattern frequencies in bivariate rank plots enables nonparametric copula tests.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves a functional central limit theorem for the frequencies with which specific patterns appear in the rank plots of i.i.d. bivariate samples. These plots correspond to discrete copulas, and their pattern counts converge in the topology used to define permutons. The limit theorem supplies the asymptotic distribution needed to build goodness-of-fit tests, two-sample tests, and symmetry tests for the underlying copula. A bootstrap method is introduced to compute critical values, and the same approach is applied to parametric families including the Farlie-Gumbel-Morgenstern class.

Core claim

In the context of two-dimensional random samples, pattern frequencies in rank plots obey a functional central limit theorem. This result serves as the basis for nonparametric goodness-of-fit tests, for two-sample tests, and for tests of symmetry. It includes a bootstrap variant for critical values and extends to parametric examples such as the Farlie-Gumbel-Morgenstern class and a family of delay copulas.

What carries the argument

Functional central limit theorem for pattern frequencies, establishing convergence of the empirical pattern process to a Gaussian limit in the permuton space.

Load-bearing premise

The two-dimensional samples consist of independent and identically distributed observations drawn from a continuous bivariate distribution.

What would settle it

Drawing repeated samples from a known continuous bivariate distribution such as the standard bivariate normal and checking whether the suitably normalized pattern frequency process converges in distribution to the claimed Gaussian process would test the theorem; failure to observe the predicted convergence would falsify the central claim.

read the original abstract

In statistics permutations typically arise in the context of rank plots for two-dimensional data. Such plots can also be interpreted as discrete copulas. In discrete mathematics, typically in the context of the description of large (non-random) objects, two-dimensional copulas appear as limits of permutations and are then known as permutons if the topology refers to the convergence of pattern frequencies. We obtain a functional central limit theorem for such pattern frequencies in the context of two-dimensional random samples. The result serves as the basis for nonparametric goodness-of-fit tests, for two-sample tests, and for tests of symmetry. This includes a suitable variant of the bootstrap for obtaining critical values. Pattern-based procedures are also of interest in a parametric context. We consider two examples, the Farlie-Gumbel-Morgenstern class and a family of delay copulas. We discuss implementation aspects of the resulting procedures and we provide a simulation study that supplements the theoretical results in the nonparametric case.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The paper derives a functional central limit theorem for the empirical frequencies of fixed-size patterns in the rank permutation induced by i.i.d. samples from a continuous bivariate distribution. These frequencies are interpreted via the permuton topology on discrete copulas. The limiting Gaussian process is then used to construct nonparametric goodness-of-fit tests, two-sample tests, and symmetry tests, with a bootstrap procedure supplied for critical values. Parametric illustrations are given for the Farlie-Gumbel-Morgenstern family and a class of delay copulas, accompanied by implementation remarks and a simulation study in the nonparametric setting.

Significance. If the FCLT is valid, the work supplies a combinatorial, pattern-based route to copula inference that directly exploits the exchangeability properties of uniform random permutations. This approach may detect localized dependence structures missed by classical rank-correlation or distance-based tests. The bootstrap construction and simulation evidence are practical strengths; the grounding in standard limit theorems for pattern counts under the i.i.d. continuous assumption is technically clean.

major comments (1)
  1. [Section 3] The functional CLT is stated for pattern frequencies in the permuton topology, yet the manuscript does not display the explicit form of the covariance kernel of the limiting process (presumably in the main theorem of Section 3). Without this kernel, it is unclear whether the bootstrap is asymptotically valid for all pattern classes or only for those with finite support; a concrete expression or reference to its derivation is needed to confirm that the tests are parameter-free under the null.
minor comments (3)
  1. [Abstract] The abstract refers to 'a suitable variant of the bootstrap' without naming the resampling scheme (e.g., permutation bootstrap versus multiplier bootstrap). This detail should appear in the abstract or the first paragraph of the introduction.
  2. [Simulation study] In the simulation study, the reported empirical sizes and powers are given only for a few pattern sizes; adding a table that varies both pattern size and sample size would make the finite-sample behavior easier to assess.
  3. [Section 2] Notation for the pattern-counting functional and the permuton metric should be introduced once in a dedicated subsection rather than piecemeal across the theoretical development.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their careful reading of the manuscript and for the constructive comment. We address the point below and will incorporate the requested clarification in the revised version.

read point-by-point responses
  1. Referee: [Section 3] The functional CLT is stated for pattern frequencies in the permuton topology, yet the manuscript does not display the explicit form of the covariance kernel of the limiting process (presumably in the main theorem of Section 3). Without this kernel, it is unclear whether the bootstrap is asymptotically valid for all pattern classes or only for those with finite support; a concrete expression or reference to its derivation is needed to confirm that the tests are parameter-free under the null.

    Authors: We agree that an explicit display of the covariance kernel would improve readability and confirm the properties of the bootstrap. In the revised manuscript we will augment the statement of the main functional CLT (Theorem 3.1) with the concrete form of the covariance kernel: for two fixed patterns A and B the limiting covariance is given by the difference between the joint probability that a uniform random permutation contains both patterns at the same locations and the product of the marginal probabilities, which follows directly from the standard multinomial structure of pattern counts in i.i.d. continuous samples. Under any fixed null copula (including independence) this kernel is completely determined by the uniform measure and therefore parameter-free. Because the permuton topology is metrized by the supremum over a finite collection of patterns, the process lives in a finite-dimensional space for any practical test; the bootstrap is therefore asymptotically valid by the standard continuous-mapping argument for finite-dimensional Gaussian limits. We will also add a short derivation sketch and a reference to the classical results on pattern statistics in random permutations. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The derivation rests on the i.i.d. continuous bivariate sampling assumption directly implying that the rank plot is distributed as a uniform random permutation, whose pattern frequencies then admit a functional CLT in the permuton topology by standard exchangeability and convergence arguments for permutation statistics. The limiting covariance is parameter-free under the null, the bootstrap is applied directly to the empirical pattern counts without refitting, and the tests for goodness-of-fit, two-sample, and symmetry follow from the same limiting process. No equation reduces a claimed prediction to a fitted input by construction, no uniqueness theorem is imported from self-citation, and no ansatz is smuggled via prior work; the central result is self-contained against external probabilistic benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The work rests on standard mathematical results for empirical processes and functional convergence; no free parameters, ad-hoc axioms, or invented entities are indicated in the abstract.

axioms (1)
  • standard math Standard assumptions for functional central limit theorems in empirical processes
    Invoked to obtain the limit theorem for pattern frequencies.

pith-pipeline@v0.9.0 · 5457 in / 1167 out tokens · 51970 ms · 2026-05-14T17:36:36.297905+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

52 extracted references · 38 canonical work pages

  1. [1]

    In: Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pp

    Bahadur, R.R. (1971).Some Limit Theorems in Statistics.SIAM, Philadelphia. https://doi.org/10.1137/1. 9781611970630.ch1

  2. [2]

    Baringhaus, L. (1994). On a modification of the Hoeffding–Blum–Kiefer–Rosenblatt independence criterion. Comm. Statist. Simulation Comput.23683–689. https://doi.org/10.1080/03610919408813193

  3. [3]

    Baringhaus, L., Franz, C. (2004). On a new multivariate two-sample test.J. Multivariate Anal.88190–206. https://doi.org/10.1016/S0047-259X(03)00079-4

  4. [4]

    Baringhaus, L., Franz, C. (2010). Rigid motion invariant two-sample tests.Statist. Sinica201333–1361. Tests for copulas 25

  5. [5]

    Baringhaus, L., Grübel, R. (2024). Random permutations generated by delay models and estimation of delay distributions.Electron. J. Stat.18167–190. https://doi.org/10.1214/23-EJS2205

  6. [6]

    Beran, R., Millar, P.W. (1986). Confidence sets for a multivariate distribution.Ann. Statist.14431–443. https: //doi.org/10.1214/aos/1176349931

  7. [7]

    Bergsma, W., Dassios, A. (2014). A consistent test of independence based on a sign covariance related to Kendall’s tau.Bernoulli201006–1028. https://doi.org/10.3150/13-BEJ514

  8. [8]

    Blum, J.R.,Kiefer, J., Rosenblatt, M. (1961). Distribution free tests of independence based on the sample distribution function.Ann. Math. Statist.32485–498. https://doi.org/10.1214/aoms/1177705055

  9. [9]

    Bo, L., Genton, M.G. (2013). Nonparametric identification of copula structures.J. Amer. Statist. Assoc.108 666–675. https://doi.org/10.1080/01621459.2013.787083

  10. [10]

    Chan, T. F. N., Král’, D., Noel, J. A., Pehova, Y., Sharifzadeh, M., Volec, J. (2020). Characterization of quasirandom permutations by a pattern sum.Random Structures Algorithms57920–939. https://doi.org/10. 1002/rsa.20956

  11. [11]

    G., Zhu, L

    Chen, F., Meintanis, S. G., Zhu, L. (2019). On some characterizations and multidimensional criteria for testing homogeneity, symmetry and independence.J. Multivariate Anal.173125–144. https://doi.org/10.1016/ j.jmva.2019.02.006

  12. [12]

    Chernoff, H. (1956). Large sample theory: Parametric case.Ann. Math. Statist.271–22. https://doi.org/10. 1214/aoms/1177728347

  13. [13]

    Csörg ˝o, S. (1986). Testing for normality in arbitrary dimension.Ann. Statist.14708–723. https://doi.org/10. 1214/aos/1176349948

  14. [14]

    (2010).On the Asymptotic Behaviour of the Estimator of Kendall’s Tau

    Dengler, B. (2010).On the Asymptotic Behaviour of the Estimator of Kendall’s Tau. Doctoral thesis, Tech- nische Universität Wien

  15. [15]

    (1998)Decoupling

    de la Peña, V., Giné, E. (1998)Decoupling. From dependence to independence.Springer, New York. https: //doi.org/10.1007/978-1-4612-0537-1

  16. [16]

    S., Dassios, A., Bergsma, W

    Dhar, S. S., Dassios, A., Bergsma, W. (2016). A study of the power and robustness of a new test for indepen- dence against contiguous alternatives.Electron. J. Stat.10330–351. https://doi.org/10.1214/16-EJS1107

  17. [17]

    Drton, M., Han, F. Shi, H. (2020). High-dimensional consistent independence testing with maxima of rank correlations.Ann. Statist.483206–3227. https://doi.org/10.1214/19-AOS1926

  18. [18]

    Even-Zohar, C., Leng, C. (2021). Counting small permutation patterns.Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA)2288–2302. https://dl.acm.org/doi/10.5555/3458064.3458200

  19. [19]

    Farlie, D. J. G. (1960). The performance of some correlation coefficients for a general bivariate distribution. Biometrika47307–323. https://doi.org/10.1093/biomet/47.3-4.307

  20. [20]

    Goodness-of-fit tests for copulas.J

    Fermanian, J.-D.(2005). Goodness-of-fit tests for copulas.J. Multivariate Anal.95119–152. https://doi.org/ 10.1016/j.jmva.2004.07.004

  21. [21]

    Fermanian, J.-D., Radulovi ´c, D., Wegkamp, M. (2004). Weak convergence of empirical copula processes. Bernoulli10847–860. https://doi.org/10.3150/bj/1099579158

  22. [22]

    (1987).Seminar on Empirical Processes.Springer, Basel

    Gaenssler, P., Stute, W. (1987).Seminar on Empirical Processes.Springer, Basel. https://doi.org/10.1007/ 978-3-0348-6269-1

  23. [23]

    Genest, C., Verret, F. (2005). Locally most powerful rank tests of independence for copula models.J. Non- parametr. Stat.17521–539. https://doi.org/10.1080/10485250500038926

  24. [24]

    M., Rasch, M

    Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B., Smola, A. (2012). A kernel two-sample test.J. Mach. Learn. Res.13723–773

  25. [25]

    Grübel, R. (2024). Ranks, copulas, and permutons.Metrika87155–182. https://doi.org/10.1007/ s00184-023-00908-2

  26. [26]

    Gumbel, E. J. (1958).Statistics of Extremes. Columbia University Press, New York. https://doi.org/10.7312/ gumb92958

  27. [27]

    Hoppen, C., Kohayakawa, Y., Moreira, C.G., Ráth, B., Sampaio, R.M. (2013). Limits of permutation se- quences.J. Combin. Theory Ser. B10393–113. https://doi.org/10.1016/j.jctb.2012.09.003

  28. [28]

    Hoeffding, W. (1948). A non-parametric test of independence.Ann. Math. Stat.19546–557. https://doi.org/ 10.1214/aoms/1177730150

  29. [29]

    Janson, S. (1984). The asymptotic distributions of incomplete𝑈-statistics.Z. Wahrsch. Verw. Gebiete66 495–505. https://doi.org/10.1007/BF00531887 26L. Baringhaus and R. Grübel

  30. [30]

    Janson, S., Nakamura, B., Zeilberger, D. (2015). On the asymptotic statistics of the number of occurrences of multiple permutation patterns.J. Comb.6117–143. https://dx.doi.org/10.4310/JOC.2015.v6.n1.a8

  31. [31]

    Kallenberg, W.C.M., Ledwina, T. (1987). On local and nonlocal measures of efficiency.Ann. Statist.15 1401–1420. https://doi.org/10.1214/aos/1176350601

  32. [32]

    Kourouklis, S. (1989). On the relation between Hodges–Lehmann efficiency and Pitman efficiency.Canad. J. Statist.17311–318. https://doi.org/10.2307/3315526

  33. [33]

    (2005).Testing Statistical Hypotheses

    Lehmann, E., Romano, J. (2005).Testing Statistical Hypotheses. Wiley, New York. https://doi.org/10.1007/ 0-387-27605-X

  34. [34]

    (2012).Large Networks and Graph Limits

    Lovász, L. (2012).Large Networks and Graph Limits. American Mathematical Society Colloquium Publica- tions 60. Amer. Math. Soc., Providence, RI. https://doi.org/10.1090/coll/060

  35. [35]

    Meintanis, S. G. (2016). A review of testing procedures based on the empirical characteristic function.South African Statist. J.50, 1–14. https://doi.org/10.37920/sasj.2016.50.1.1

  36. [36]

    Morgenstern, D. (1956). Einfache Beispiele zweidimensionaler Verteilungen.Mitt.-Bl. Math. Statistik8234– 235

  37. [37]

    Nandy, P., Weihs, L., Drton, M. (2016). Large-sample theory for the Bergsma-Dassios sign covariance.Elec- tron. J. Stat.102287–231. https://doi.org/10.1214/16-EJS1166

  38. [38]

    Nikitin, Ya. Yu. (1995).Asymptotic Efficiency of Nonparametric Tests.Cambridge University Press, Cam- bridge. https://doi.org/10.1017/CBO9780511530081

  39. [39]

    Yu, Ponikarov, E.V

    Nikitin, Ya. Yu, Ponikarov, E.V. (2001). Rough asymptotics of probabilities of Chernoff type large deviations for von Mises functionals and U-statistics.Amer. Math. Soc. Transl. Ser. 2203107–146. https://doi.org/10.1090/ trans2/203/04

  40. [40]

    Pinelis, I. (2023). Asymptotic relative efficiency of the Kendall and Spearman correlation statistics.Theory Probab. Appl.68111–122. https://doi.org/10.1137/S0040585X97T991313

  41. [41]

    Rémillard, B., Scaillet, O. (2009). Testing for equality between two copulas.J. Multivariate Anal.100377–

  42. [42]

    https://doi.org/10.1016/j.jmva.2008.05.004

  43. [43]

    Shi, H., Drton, M., Han, F. (2022). On the power of Chatterjee’s rank correlation.Biometrika109317–333. https://doi.org/10.1093/biomet/asab028

  44. [44]

    2004, InterStat, 5 Sz´ ekely, G

    Székely, G. J., Rizzo, M. L. (2013). Energy statistics: a class of statistics based on distances.J. Statist. Plann. Inference1431249–1272. https://doi.org/10.1016/j.jspi.2013.03.018

  45. [45]

    Tsirelson, V.S. (1975). The density of the distribution function of the maximum of a Gaussian process.Theory Probab. Appl.20847–856. https://doi.org/10.1137/1120092

  46. [46]

    (1998).Asymptotic Statistics

    van der Vaart, A.W. (1998).Asymptotic Statistics. Cambridge University Press, Cambridge. https://doi.org/ 10.1017/CBO9780511802256

  47. [47]

    Springer, New York

    van der Vaart, A.W., Wellner, J.A. (1996).Weak Convergence and Empirical Processes.Springer, New York. https://doi.org/10.1007/978-1-4757-2545-2

  48. [48]

    (1981).Probability Distributions on Linear Spaces.North Holland, New York

    Vakhania, N.N. (1981).Probability Distributions on Linear Spaces.North Holland, New York

  49. [49]

    Weihs, L., Drton, M., Meinshausen, N. (2018). Symmetric rank covariances: a generalized framework for nonparametric measures of dependence.Biometrika105547–562. https://doi.org/10.1093/biomet/asy021

  50. [50]

    Wieand, H. S. (1976). A condition under which the Pitman and Bahadur approaches to efficiency coincide. Ann. Statist.41003–1011. https://doi.org/10.1214/aos/1176343600

  51. [51]

    (1995).Mathematische Statistik II.Teubner, Stuttgart

    Witting, H., Müller–Funk, U. (1995).Mathematische Statistik II.Teubner, Stuttgart. https://doi.org/10.1007/ 978-3-322-90152-1

  52. [52]

    Yanagimoto, T. (1970). On measures of association and a related problem.Ann. Inst. Statist. Math.2257–63. https://doi.org/10.1007/BF02506323