
arxiv: 2605.22062 · v1 · pith:NWE3MJRYnew · submitted 2026-05-21 · 🧮 math.ST · stat.TH

A Circular Chatterjee's Correlation Coefficient

Pith reviewed 2026-05-22 03:00 UTC · model grok-4.3

classification 🧮 math.ST stat.TH
keywords circular data · rank correlation · Chatterjee coefficient · directional statistics · functional dependence · independence testing · cyclic ranks

The pith

A circular Chatterjee coefficient detects functional dependence on circles by averaging over all response cuts in rank space.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a directed measure of association for circular data that extends Chatterjee's original rank correlation. The construction averages the usual coefficient over every possible way to cut the response circle into a line, so the result depends only on the intrinsic circular ordering rather than any fixed origin. This keeps the original interpretations: the coefficient is exactly zero under independence and exactly one when the response is a measurable function of the predictor. A sympathetic reader would care because many directional datasets involve angles or cycles where choosing an arbitrary cut changes the numerical value and standard circular correlations lose power on multi-winding patterns.

Core claim

Under non-atomic circular marginals, the proposed coefficient is zero exactly under independence and one exactly when the circular response is a measurable function of the circular predictor. The population version averages over response cuts in circular rank space; the finite-sample version averages over sample cut gaps and reduces to a simple statistic based only on cyclic ranks. The coefficient is consistent and has a distribution-free null distribution under independence.
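
For orientation, the linear coefficient being averaged can be written out. The first two displays below state Chatterjee's original sample statistic and population functional for tie-free real-valued data (Chatterjee, 2021). The third display is only a schematic reading of "averaging over response cuts": the symbols \xi^{\circ}, c and Y^{(c)} are introduced here for illustration and are not taken from the paper, whose exact definition may differ.

```latex
% Chatterjee's sample coefficient: sort the pairs so that X_{(1)} < ... < X_{(n)},
% and let r_i be the rank of the response paired with X_{(i)} (no ties).
\xi_n(X, Y) = 1 - \frac{3 \sum_{i=1}^{n-1} \lvert r_{i+1} - r_i \rvert}{n^2 - 1}

% Population counterpart, with \mu the law of Y:
\xi(X, Y) = \frac{\int \operatorname{Var}\bigl(\mathbb{E}[\mathbf{1}\{Y \ge t\} \mid X]\bigr)\, d\mu(t)}
                 {\int \operatorname{Var}\bigl(\mathbf{1}\{Y \ge t\}\bigr)\, d\mu(t)}

% Schematic cut average (hypothetical notation): Y^{(c)} is the response unrolled
% into a line at cut c, with c uniform in circular rank space.
\xi^{\circ}(X, Y) = \int_0^1 \xi\bigl(X, Y^{(c)}\bigr)\, dc
```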

What carries the argument

The population construction that averages over response cuts in circular rank space, which reduces to a cyclic-rank statistic in finite samples.
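
A minimal sketch of what "averaging over sample cut gaps" could look like in practice, under stated assumptions: the response circle is cut in each of the n gaps between neighbouring response points, Chatterjee's statistic is computed for each cut, and the n values are averaged with equal weight. The predictor is unrolled at a fixed cut, which is an assumption of this sketch. The closed form in `xi_cut_average_closed_form` is derived here from that brute-force reading (a pair at cyclic rank distance d has linear rank difference d for n - d cuts and n - d for the remaining d cuts); it is not claimed to be the paper's expression. All function names are hypothetical.

```python
import math
import random

TWO_PI = 2.0 * math.pi

def cyclic_ranks(angles):
    """Cyclic ranks 0..n-1 of angles taken mod 2*pi (assumes no ties)."""
    order = sorted(range(len(angles)), key=lambda i: angles[i] % TWO_PI)
    ranks = [0] * len(angles)
    for rank, i in enumerate(order):
        ranks[i] = rank
    return ranks

def xi_from_linear_ranks(r):
    """Chatterjee's xi_n from response ranks listed in predictor order."""
    n = len(r)
    total = sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    return 1.0 - 3.0 * total / (n * n - 1)

def xi_cut_average_bruteforce(theta, phi):
    """Average xi_n over the n sample gaps where the response circle can be cut."""
    n = len(theta)
    # Assumption: the predictor is unrolled at the fixed cut 0.
    order = sorted(range(n), key=lambda i: theta[i] % TWO_PI)
    c = cyclic_ranks(phi)
    c_sorted = [c[i] for i in order]
    # Cutting in gap k relabels cyclic rank c as (c - k) mod n.
    values = [xi_from_linear_ranks([(ci - k) % n for ci in c_sorted])
              for k in range(n)]
    return sum(values) / n

def xi_cut_average_closed_form(theta, phi):
    """Same average, computed directly from cyclic rank distances."""
    n = len(theta)
    order = sorted(range(n), key=lambda i: theta[i] % TWO_PI)
    c = cyclic_ranks(phi)
    total = 0.0
    for i in range(n - 1):
        d = (c[order[i + 1]] - c[order[i]]) % n   # cyclic rank distance, 1..n-1
        total += 2.0 * d * (n - d) / n            # mean |rank difference| over the n cuts
    return 1.0 - 3.0 * total / (n * n - 1)

rng = random.Random(1)
n = 30
theta = [rng.uniform(0.0, TWO_PI) for _ in range(n)]
phi = [(2.0 * t + rng.gauss(0.0, 0.3)) % TWO_PI for t in theta]

brute = xi_cut_average_bruteforce(theta, phi)
closed = xi_cut_average_closed_form(theta, phi)
# Rotating the response only shifts every cyclic rank by a constant mod n.
rotated = xi_cut_average_closed_form(theta, [(p + 1.234) % TWO_PI for p in phi])
print(brute, closed, rotated)
```

In this sketch the averaged value depends on the response only through cyclic rank distances, so rotating the response leaves it unchanged.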

If this is right

  • The coefficient remains directed: it detects whether the response can be predicted from the predictor but not necessarily the reverse.
  • It is consistent, so the sample value converges to the population value as sample size increases.
  • Under independence the null distribution is free of the underlying marginal distributions.
  • It retains sensitivity to multi-winding circular relationships where the response completes two or four cycles per predictor cycle.
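
The bullets above can be probed numerically. The sketch below assumes an equal-weight average of Chatterjee's statistic over the n sample cuts of the response, with the predictor unrolled at a fixed cut; this is one reading of the construction, not necessarily the paper's statistic. It uses the standard rank argument that, under independence with non-atomic marginals, the response ranks in predictor order form a uniformly random permutation, so the null can be simulated by shuffling. The function names, the sample size, the noise level and the Monte Carlo p-value are all illustrative choices; the paper itself reports a normal approximation.

```python
import math
import random

TWO_PI = 2.0 * math.pi

def cyclic_ranks(angles):
    """Cyclic ranks 0..n-1 of angles taken mod 2*pi (assumes no ties)."""
    order = sorted(range(len(angles)), key=lambda i: angles[i] % TWO_PI)
    ranks = [0] * len(angles)
    for rank, i in enumerate(order):
        ranks[i] = rank
    return ranks

def stat_from_ranks(c):
    """Cut-averaged xi from response cyclic ranks listed in predictor order."""
    n = len(c)
    total = 0.0
    for i in range(n - 1):
        d = (c[i + 1] - c[i]) % n          # cyclic rank distance, 1..n-1
        total += 2.0 * d * (n - d) / n     # mean |rank difference| over the n cuts
    return 1.0 - 3.0 * total / (n * n - 1)

def cut_averaged_xi(predictor, response):
    order = sorted(range(len(predictor)), key=lambda i: predictor[i] % TWO_PI)
    c = cyclic_ranks(response)
    return stat_from_ranks([c[i] for i in order])

def null_draws(n, reps, rng):
    """Simulate the null: ranks in predictor order are a uniform random permutation."""
    perm = list(range(n))
    draws = []
    for _ in range(reps):
        rng.shuffle(perm)
        draws.append(stat_from_ranks(perm))
    return draws

def mc_p_value(observed, draws):
    """One-sided Monte Carlo p-value with the usual +1 correction."""
    return (1 + sum(v >= observed for v in draws)) / (1 + len(draws))

rng = random.Random(0)
n = 200
theta = [rng.uniform(0.0, TWO_PI) for _ in range(n)]
# Two-winding relationship with wrapped-normal noise (illustrative parameters).
phi = [(2.0 * t + rng.gauss(0.0, 0.2)) % TWO_PI for t in theta]

forward = cut_averaged_xi(theta, phi)   # can phi be predicted from theta?
reverse = cut_averaged_xi(phi, theta)   # can theta be predicted from phi?
draws = null_draws(n, 300, rng)
p_forward = mc_p_value(forward, draws)
print(forward, reverse, p_forward)
```

Because the response winds twice, each response value corresponds to two predictor values half a circle apart, so the reverse direction should score noticeably lower than the forward one. The simulated null never looks at the marginal distributions, which is the practical meaning of "distribution-free" here.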

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could be applied directly to directional data in meteorology or animal navigation without first choosing an arbitrary zero direction.
  • Because the statistic depends only on cyclic ranks, it may combine readily with existing rank-based tests for circular data.
  • The same averaging idea might extend to other rank-based functionals on the circle, such as measures of concordance.

Load-bearing premise

The averaging over response cuts in circular rank space preserves the exact zero-under-independence and one-under-functional-dependence properties without additional regularity conditions beyond non-atomic marginals.

What would settle it

A concrete counter-example consisting of two non-atomic circular distributions that are independent but yield a positive population coefficient, or a clear functional dependence that yields a population coefficient strictly below one, would falsify the central claims.
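
A simulation cannot settle a population-level claim, but it can flag an obvious counter-example. The sketch below evaluates a brute-force reading of the coefficient (an equal-weight average of Chatterjee's statistic over the n sample cuts of the response, predictor unrolled at a fixed cut) on an independent pair and on two noiseless winding maps. The function name, the sample size and the winding maps are assumptions made for illustration, not the paper's simulation design.

```python
import math
import random

TWO_PI = 2.0 * math.pi

def cut_averaged_xi(predictor, response):
    """Brute-force reading of the cut-averaged coefficient (assumes no ties)."""
    n = len(predictor)
    by_pred = sorted(range(n), key=lambda i: predictor[i] % TWO_PI)
    by_resp = sorted(range(n), key=lambda i: response[i] % TWO_PI)
    crank = [0] * n
    for rank, i in enumerate(by_resp):
        crank[i] = rank                     # cyclic rank of each response
    total = 0.0
    for a, b in zip(by_pred, by_pred[1:]):  # neighbours in predictor order
        d = (crank[b] - crank[a]) % n       # cyclic rank distance, 1..n-1
        total += 2.0 * d * (n - d) / n      # mean |rank difference| over the n cuts
    return 1.0 - 3.0 * total / (n * n - 1)

rng = random.Random(42)
n = 1000
theta = [rng.uniform(0.0, TWO_PI) for _ in range(n)]
independent = [rng.uniform(0.0, TWO_PI) for _ in range(n)]
two_winding = [(2.0 * t) % TWO_PI for t in theta]    # response winds twice
four_winding = [(4.0 * t) % TWO_PI for t in theta]   # response winds four times

xi_indep = cut_averaged_xi(theta, independent)
xi_two = cut_averaged_xi(theta, two_winding)
xi_four = cut_averaged_xi(theta, four_winding)
print(xi_indep, xi_two, xi_four)
```

If the central claims hold, the independent pair should give a value near zero and both winding maps should give values approaching one as n grows. A persistent departure from either pattern at large n would be a reason to look for an error in the sketch or in the claim.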

Figures

Figures reproduced from arXiv: 2605.22062 by Sourav Majumdar.

Figure 1: Mean coefficient values as a function of the wrapped-normal noise standard deviation
Figure 2: Sensitivity of the cut-based Borel construction to the choice of cut point, measured by …
Figure 3: Empirical rejection probability under independence for the normal approximation and the …
Figure 4: Empirical power for the proposed cyclic Chatterjee statistic at …
Original abstract

Chatterjee's rank correlation is a directed measure of association designed to detect whether one variable can be predicted as a function of another. While the original coefficient is naturally defined for real-valued data, circular data poses additional difficulty. Applying the usual construction requires cutting each circle at an arbitrary point and treating it as a line. Different choices of cut points can lead to different finite-sample values, even though the underlying circular relationship is unchanged. This paper proposes a circular version of Chatterjee's coefficient that removes this arbitrary choice. The population construction averages over response cuts in circular rank space, and the finite-sample construction averages over sample cut gaps and reduces to a simple statistic based only on cyclic ranks. The resulting coefficient is intrinsic to the circular ordering of the data, remains directed, and retains the key interpretation of Chatterjee's original coefficient. Under non-atomic circular marginals, it is zero exactly under independence and one exactly when the circular response is a measurable function of the circular predictor. We prove consistency and derive its distribution-free null behavior under independence. Simulations show that the proposed coefficient is especially useful for detecting multi-winding circular relationships, such as cases where the response goes around the circle twice or four times as the predictor goes around once, where standard circular correlations can be nearly blind.
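
The cut dependence described in the abstract is easy to reproduce with the original linear coefficient. The sketch below uses a deterministic toy sample in which the response is an exact rotation of the predictor, then unrolls the response circle at two different cut points. The helper names, the sample and the cut points are illustrative assumptions; only the formula for Chatterjee's statistic is standard.

```python
import math

TWO_PI = 2.0 * math.pi

def ranks(values):
    """Ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def chatterjee_xi(x, y):
    """Chatterjee's xi_n for tie-free data: 1 - 3 * sum|r_{i+1} - r_i| / (n^2 - 1)."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    r = ranks(y)
    total = sum(abs(r[order[i + 1]] - r[order[i]]) for i in range(n - 1))
    return 1.0 - 3.0 * total / (n * n - 1)

def unroll(angles, cut):
    """Cut the circle at `cut` and measure each angle from there, in [0, 2*pi)."""
    return [(a - cut) % TWO_PI for a in angles]

n = 20
theta = [TWO_PI * i / n for i in range(n)]      # predictor angles on a grid
phi = [(t + 1.0) % TWO_PI for t in theta]       # response: rotation by 1 radian

# Same circular relationship, two different response cuts.
xi_cut_at_zero = chatterjee_xi(theta, unroll(phi, 0.0))
xi_cut_aligned = chatterjee_xi(theta, unroll(phi, 0.9))
print(xi_cut_at_zero, xi_cut_aligned)
```

With the cut at 0.9 the unrolled response is monotone in the predictor and the rank differences are all one. With the cut at 0 the response wraps around once inside the sample, which adds one large rank jump and lowers the value even though the circular relationship is identical.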

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces a circular extension of Chatterjee's rank correlation coefficient for circular data. The population version averages the standard Chatterjee coefficient over all response cuts in circular rank space; the finite-sample version averages over sample cut gaps and reduces to a cyclic-rank statistic. Under non-atomic circular marginals the coefficient equals zero exactly under independence and one exactly when the circular response is a measurable function of the predictor. The author proves consistency and derives a distribution-free null distribution under independence. Simulations demonstrate improved detection of multi-winding circular relationships compared with standard circular correlations.

Significance. If the central claims hold, the work supplies a directed, cut-point-free association measure for circular data that retains Chatterjee's functional-dependence interpretation. This is useful in directional statistics, biology, and meteorology. The distribution-free null behavior and the ability to detect multi-winding patterns (where linear or standard circular coefficients often fail) are concrete strengths. The reduction to a simple cyclic-rank statistic also aids computational tractability.

major comments (2)
  1. [§3] §3 (population construction): the claim that averaging over cuts preserves exact zero under independence and exact one under measurable functional dependence is load-bearing for the main theorem. The manuscript should explicitly state the measure on the circle used for the average (uniform Lebesgue?) and supply a short lemma verifying that degeneracy of the conditional distribution is preserved for any measurable unfolding, including the multi-winding case highlighted in the simulations.
  2. [Theorem 4.1] Theorem 4.1 (consistency): the statement of consistency is central, yet the regularity conditions (e.g., continuity of the joint density or moment restrictions) are not listed in a single place. Adding an explicit list of assumptions immediately before the theorem would make the result easier to apply and verify.
minor comments (2)
  1. [Abstract] The abstract states that the finite-sample statistic 'reduces to a simple statistic based only on cyclic ranks'; a one-line display of the final expression would improve readability.
  2. [Simulation section] Simulation section: the caption of Figure 4 should report the number of Monte Carlo replications and the exact winding numbers used, so readers can reproduce the power comparison.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive evaluation and the constructive suggestions. The comments help clarify the presentation of the population construction and the consistency result. We address each major comment below and will incorporate the requested changes.

  1. Referee: [§3] §3 (population construction): the claim that averaging over cuts preserves exact zero under independence and exact one under measurable functional dependence is load-bearing for the main theorem. The manuscript should explicitly state the measure on the circle used for the average (uniform Lebesgue?) and supply a short lemma verifying that degeneracy of the conditional distribution is preserved for any measurable unfolding, including the multi-winding case highlighted in the simulations.

    Authors: We agree that the measure and the preservation property should be stated explicitly. The averaging is performed with respect to the uniform Lebesgue measure on the circle. We will add a short lemma immediately following the population definition in §3. The lemma will verify that the averaged coefficient equals zero under independence and equals one under measurable functional dependence for any measurable unfolding of the circle, including the multi-winding cases used in the simulations. This lemma will be self-contained and use only the non-atomic marginal assumption already present in the manuscript.

    revision: yes

  2. Referee: [Theorem 4.1] Theorem 4.1 (consistency): the statement of consistency is central, yet the regularity conditions (e.g., continuity of the joint density or moment restrictions) are not listed in a single place. Adding an explicit list of assumptions immediately before the theorem would make the result easier to apply and verify.

    Authors: We agree that a consolidated list of assumptions will improve clarity and ease of verification. We will insert a dedicated paragraph immediately preceding the statement of Theorem 4.1 that enumerates all regularity conditions used in the consistency proof, including any requirements on the joint distribution, marginals, and moments.

    revision: yes

Circularity Check

0 steps flagged

Derivation self-contained via explicit averaging construction


The paper constructs the circular coefficient by averaging the standard Chatterjee coefficient over all response cuts in circular rank space. Under non-atomic marginals this directly inherits zero under independence and one under measurable functional dependence from the linear case, without any fitted parameters, self-referential equations, or load-bearing self-citations. The finite-sample reduction to a cyclic-rank statistic follows immediately from the population definition. No step reduces to its own inputs by construction; the central claims are consequences of the averaging operation itself.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The proposal rests on standard properties of ranks and measurable functions on the circle; no free parameters, ad-hoc axioms, or new invented entities are introduced in the abstract.

axioms (1)
  • domain assumption: Non-atomic circular marginals.
    Invoked to guarantee the coefficient equals zero exactly under independence and one under functional dependence.

pith-pipeline@v0.9.0 · 5746 in / 1196 out tokens · 28078 ms · 2026-05-22T03:00:04.618556+00:00 · methodology
