A general nonparametric framework for testing hypotheses about function-valued parameters

Aaron Hudson; Albert Osom; Ali Shojaie

arxiv: 2604.20045 · v1 · submitted 2026-04-21 · 📊 stat.ME


Pith reviewed 2026-05-10 01:26 UTC · model grok-4.3


keywords: nonparametric testing, function-valued parameters, conditional distributions, treatment effect heterogeneity, limiting null distribution, statistical functionals, hypothesis testing, biomarker identification


The pith

A nonparametric test for whether function-valued parameters are constant across conditioning variables has a tractable limiting null distribution.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors introduce a nonparametric framework to test the hypothesis that a statistical parameter obtained from conditional distributions stays constant as the conditioning variable varies. This covers practical questions such as treatment effect heterogeneity, conditional associations, and mean dependence without parametric restrictions on the forms involved. The central technical result is that their test statistic, unlike many existing norm-based procedures, possesses an explicit and usable limiting distribution under the null. A reader would care because the result supplies a practical route to valid p-values and inference for these function-valued objects using only nonparametric estimation. The paper demonstrates the method on simulations and on breast cancer trial data for biomarker identification.
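
As a concrete reading of this hypothesis, the display below is a sketch in notation assumed for illustration: $V$ for the conditioning variable, $O$ for the observed data, $\Psi$ for the smooth functional, $\theta_0(v)$ for the function-valued parameter, and $(Y, A)$ for an outcome and a binary treatment in the example. The paper's own symbols may differ.

```latex
% Function-valued parameter: the functional Psi applied to the
% conditional law of O given V = v (notation assumed for illustration).
\[
  \theta_0(v) = \Psi\bigl(P_{0,\,O \mid V = v}\bigr),
  \qquad
  H_0 : \theta_0(v) = \theta_0 \ \text{for almost all } v
  \quad \text{versus} \quad
  H_1 : \theta_0(v) \ \text{varies with } v .
\]

% Example: treatment effect heterogeneity with O = (Y, A, V).
\[
  \tau_0(v) = E_0[Y \mid A = 1, V = v] - E_0[Y \mid A = 0, V = v],
  \qquad
  H_0 : \tau_0(v) \equiv \tau_0 .
\]
```

Under this reading, rejecting $H_0$ in the example would indicate that the conditional average treatment effect changes with $V$, which is the sense in which a biomarker could be called predictive.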

Core claim

What carries the argument

A test statistic constructed from a smooth statistical functional evaluated on nonparametric estimates of the conditional distributions. It is connected to, but distinct from, norm-based procedures, and it has an explicit limiting null distribution.

Load-bearing premise

The statistical functional must be smooth and the conditional distributions must admit consistent nonparametric estimation so that the limiting null distribution holds.

What would settle it

A simulation study under the null in which repeated realizations of the test statistic fail to converge in distribution to the claimed limiting law would refute the result.
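
The kind of check described here can be sketched in a few lines. The example below is not the paper's test: it substitutes a simple stand-in (a correlation-based z-test of whether Y depends on Z, whose statistic is approximately standard normal under independence) purely to illustrate the mechanics of simulating under the null, collecting p-values, and comparing them with the uniform distribution. The function names `stand_in_pvalue` and `null_calibration` and all settings are hypothetical.

```python
import math
import random
from statistics import NormalDist


def stand_in_pvalue(y, z):
    """Two-sided p-value for H0: corr(Y, Z) = 0, using sqrt(n) * r ~ N(0, 1).

    This is a stand-in statistic, NOT the statistic proposed in the paper.
    """
    n = len(y)
    mean_y, mean_z = sum(y) / n, sum(z) / n
    s_yz = sum((a - mean_y) * (b - mean_z) for a, b in zip(y, z))
    s_yy = sum((a - mean_y) ** 2 for a in y)
    s_zz = sum((b - mean_z) ** 2 for b in z)
    r = s_yz / math.sqrt(s_yy * s_zz)
    t = math.sqrt(n) * r
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def null_calibration(n_reps=2000, n=200, alpha=0.05, seed=0):
    """Simulate under the null and summarize how the p-values behave."""
    rng = random.Random(seed)
    pvals = []
    for _ in range(n_reps):
        z = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]  # Y independent of Z
        pvals.append(stand_in_pvalue(y, z))

    # Empirical type I error at level alpha.
    reject_rate = sum(p < alpha for p in pvals) / n_reps

    # Kolmogorov-Smirnov distance between the p-values and Uniform(0, 1).
    pvals.sort()
    ks = max(
        max((i + 1) / n_reps - p, p - i / n_reps)
        for i, p in enumerate(pvals)
    )
    return reject_rate, ks


reject_rate, ks = null_calibration()
print(f"empirical rejection rate at 0.05: {reject_rate:.3f}")
print(f"KS distance from Uniform(0, 1):   {ks:.3f}")
```

If the claimed null limit is correct for whatever statistic is plugged in, the rejection rate should sit near the nominal level and the Kolmogorov-Smirnov distance should shrink as the number of replications grows. A persistent gap at large sample sizes would be the kind of evidence this paragraph describes.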

Figures

Figures reproduced from arXiv: 2604.20045 by Aaron Hudson, Albert Osom, Ali Shojaie.

**Figure 2.** Empirical probability of rejection for hypotheses in Example 3 under different data generating …

**Figure 3.** Estimate of CATE using causal forest (Athey and Imbens, 2019) for each of the four genes (BAG1, CDC20, ERBB2, MYC). Following Roth and Simon (2018), we consider the nine genes identified by Prat et al. (2014) as demonstrating predictive power comparable to the full 50-gene panel used in the PAM50 test

Original abstract

We present a general nonparametric approach for testing whether a statistical parameter defined through conditional distributions is constant across the conditioning variables. Such hypotheses arise naturally in problems such as assessing treatment effect heterogeneity, conditional associational effects, and conditional mean dependence. Our framework studies function-valued parameters obtained by evaluating a smooth statistical functional on conditional probability distributions. We establish an explicit connection between our test and procedures based on studying the norm of the function-valued parameter. Unlike many existing norm-based tests, which exhibit poor asymptotic behavior under the null, the proposed test statistic admits a tractable limiting null distribution. We illustrate the applicability of the proposed test through several examples, assess its operating characteristics in simulation studies, and apply it to data from a breast cancer trial to identify predictive biomarkers for response to adjuvant chemotherapy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper gives a nonparametric test for constancy of function-valued parameters that gets a tractable limiting null distribution, unlike many norm-based competitors.


The main thing here is a general framework for testing whether a parameter defined on conditional distributions stays constant. The authors treat the parameter as a smooth functional of those distributions and build a test statistic whose null limit is easier to handle than the usual norm-based versions that often blow up or require heavy resampling. That connection to existing norm procedures is explicit and useful, and they show it covers common cases like treatment effect heterogeneity and conditional associations. They also run simulations and apply the method to breast cancer trial data to look for predictive biomarkers, which helps ground the work.

The approach looks honest about its assumptions: smoothness of the functional plus consistent nonparametric estimation of the conditionals. Those are standard but not automatic in finite samples or higher dimensions, so the operating characteristics will depend on how well the estimation step performs in practice. I would want to see the actual derivations for the limiting distribution and any rate conditions, since the abstract only states the result. Bandwidth or tuning sensitivity could be a soft spot worth checking in revisions, but nothing in the setup suggests circularity or hidden contradictions.

This is aimed at statisticians who already work on nonparametric tests for heterogeneity or conditional effects. A reader in that area would pick up a usable tool and some concrete examples. It is solid enough on its own terms to go to peer review rather than a desk reject; the central claim is clear and the examples show relevance.

Referee Report

0 major / 3 minor

Summary. The manuscript develops a general nonparametric framework for testing the hypothesis that a function-valued parameter—obtained by applying a smooth statistical functional to conditional distributions—is constant across the conditioning variable. Applications include treatment effect heterogeneity, conditional associational effects, and conditional mean dependence. The central claim is that the proposed test statistic, unlike many norm-based alternatives, admits a tractable limiting null distribution; an explicit connection is drawn between the two classes of procedures. The framework is illustrated through examples, evaluated via simulation studies, and applied to breast cancer trial data to detect predictive biomarkers.

Significance. If the asymptotic theory holds, the work provides a useful addition to the toolkit for testing constancy of function-valued parameters in nonparametric settings. The explicit link to norm-based tests and the emphasis on tractable null behavior address a known practical difficulty. Credit is due for the simulation studies assessing operating characteristics and the real-data application, which demonstrate applicability beyond theory.

minor comments (3)

[Abstract] The abstract states that the test statistic 'admits a tractable limiting null distribution' but does not indicate its form (e.g., Gaussian process, chi-squared). Adding one sentence on the nature of the limit would improve immediate accessibility.
[Simulations] In the simulation section, the choice of smoothing parameters or bandwidths for the nonparametric estimators of the conditional distributions is not detailed; explicit guidance or sensitivity checks would aid reproducibility.
[Methods] Notation for the smooth statistical functional and the conditioning variable could be introduced with a single consolidated display early in the methods section to reduce cross-referencing.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of the manuscript, recognition of its significance, and recommendation for minor revision. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity detected


The paper presents a nonparametric framework for testing constancy of function-valued parameters obtained from smooth statistical functionals applied to conditional distributions. The central claim of a tractable limiting null distribution follows directly from the stated smoothness of the functional and consistent nonparametric estimation of the conditional distributions, with an explicit link to norm-based procedures that avoids their poor null behavior. No derivation step reduces by construction to its inputs, no fitted parameter is relabeled as a prediction, and no load-bearing premise rests on self-citation chains or imported uniqueness theorems. The approach is self-contained against standard asymptotic theory for nonparametric estimators.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on standard nonparametric statistics assumptions including smoothness of the statistical functional and existence of limiting distributions under the null; no free parameters or invented entities are introduced in the abstract.

axioms (2)

domain assumption: The statistical functional is smooth.
Explicitly stated in the abstract as the basis for obtaining function-valued parameters from conditional probability distributions.
domain assumption: Conditional distributions admit consistent nonparametric estimation leading to tractable asymptotics.
Required for the claimed limiting null distribution to hold, as implied by the abstract's description of the test statistic.



Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages · 1 internal anchor

[1] Athey, S. and Imbens, G. W. (2019). Machine learning methods that economists should know about. Annual Review of Economics, 11(1):685–725. Bickel, P. J., Klaassen, C. A. J., Ritov, Y., and Wellner, J. A. (1993). Efficient and adaptive estimation for semiparametric models, volume
[2] Springer. Byar, D. P. (1985). Assessing apparent treatment-covariate interactions in randomized clinical trials. Statistics in Medicine, 4(3):255–263. Cai, L., Guo, X., and Zhong, W. (2024). Test and measure for partial mean dependence based on machine learning methods. Journal of the American Statistical Association, (just-accepted):1–32. Chernozhukov, V....
[3] Cambridge University Press. Dukes, O., Stensrud, M. J., Brioschi, R., and Hudson, A. (2024). Nonparametric tests of treatment effect homogeneity for policy-makers. arXiv:2410.00985. Fisher, A. and Kennedy, E. H. (2021). Visually communicating and teaching intuition for influence functions. The American Statistician, 75(2):162–172. Folland, G. B. (1999). Real...
[4] Hastie, T. and Tibshirani, R. (1990). Generalized Additive Models, volume
[5] CRC Press. Hernán, M. A. and Robins, J. M. (2020). Causal Inference: What If. Chapman & Hall/CRC, Boca Raton, FL. Hsu, Y.-C. (2016). Multiplier bootstrap for empirical processes. Technical report, Institute of Economics, Academia Sinica, Taipei, Taiwan. Hudson, A. (2023). Nonparametric inference on non-negative dissimilarity measures at the boundary of...
[6] Liu, Y. and Xie, J. (2020). Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures. Journal of the American Statistical Association, 115(529):393–402. Luedtke, A., Carone, M., and van der Laan, M. J. (2019). An omnibus non-parametric test of equality in distribution for unknown functions. Journal o...
[7] Cambridge University Press. van der Vaart, A. (2014). Higher order tangent spaces and influence functions. Statistical Science, pages 679–686. van der Vaart, A. W. (2000). Asymptotic Statistics, volume
[8] Cambridge University Press. van der Vaart, A. W. and Wellner, J. A. (1996). Weak convergence. In Weak Convergence and Empirical Processes: With Applications to Statistics, pages 16–28. Springer. Verdinelli, I. and Wasserman, L. (2024). Decorrelated variable importance. Journal of Machine Learning Research, 25(7):1–27. Wang, R., Zhao, Y.-Q., Dukes, O., an...
[9] for each of the five genes (ACTR3B, BLVRA, CCNE1, FGFRA, SFRP1). S3. Discussion of additional examples. In this section, we discuss two additional examples: testing the constancy of the conditional covariance and assessing treatment effect heterogeneity in survival settings. For each example, we show that the testing problem falls within our hypotheses class...
[10] can be used. However, for our scientific question of interest, $\theta_0$ is often not zero and hence the test by Shah and Peters (2020) generally does not apply. To implement our test, we need to derive the EIF, $D^{*}_{h,0}(o)$, of $\Omega_{P_0}(h)$. Let us denote the finite-dimensional parameter from the marginalization of $\Psi_{0,z}$ as $\tilde{\Psi}_0 = E[E[(Y - E[Y \mid Z])(X - E[X \mid Z]) \mid Z]]$; then $\tilde{\Psi}_0$ is pathw...
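[11] with natural cubic spline components, implemented in the R package mgcv.
• Setting 1: $(Y, X) \sim N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\right)$ and $Z \sim \mathrm{Unif}(-1, 1)$.
• Setting 2: $Z \sim \mathrm{Unif}(0, 1)$ and $(Y, X) \mid Z = z \sim N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho(z) \\ \rho(z) & 1 \end{pmatrix}\right)$, where $\rho(z) = \dfrac{e^{z^2} - 1}{e^{z^2} + 1}$.
• Setting 3: $Z \sim \mathrm{Unif}(0, 1)$, $X \sim N(0, 1)$, and $Y = 0.5\,X\,\mathbb{1}(Z > 0) + \varepsilon$, where $\varepsilon \sim N(0, 1)$.
The results are presented in Figure S.2. Under Setting 1, where the null hyp...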
[12] Differentiating at $t = 0$, we obtain
$$\frac{\partial}{\partial t} \tilde{\Psi}_t \Big|_{t=0} = \frac{\partial}{\partial t} \int \Psi_{t,v}(v)\, dP_{t,V}(v) \Big|_{t=0} = \int \frac{\partial}{\partial t} \Psi_{t,v}(v) \Big|_{t=0}\, dP_{0,V}(v) + \int \Psi_{0,v}(v)\, \frac{\partial}{\partial t}\, dP_{t,V}(v) \Big|_{t=0}.$$
By Condition C2, for each fixed $v$,
$$\frac{\partial}{\partial t} \Psi_{t,v}(v) \Big|_{t=0} = \int D^{v}_{P_0}(o)\, s_v(o)\, dP_{0,O \mid v}(o),$$
where $s_v(o) = s(o) - E_0[s(O) \mid V = v]$. It follows that
$$\frac{\partial}{\partial t} \Psi_{t,v}(v) \Big|_{t=0} = \int D^{v}_{P_0}(o)\, \{s(o) - E_0[s(O) \mid V = v]\}\, dP_{0,O \mid v}(o) = \int D^{v}_{P_0}(o)\, s(o)\, dP_{0,\ldots}$$
...
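
The three simulation settings quoted in entry [11] can be sketched as a small data generator. This is a reading of a garbled extract, so two points are assumptions: the correlation function is taken to be ρ(z) = (e^{z²} − 1)/(e^{z²} + 1), and the indicator in Setting 3 is implemented exactly as printed, 1(Z > 0). The function names `rho` and `draw_setting` are hypothetical, and the spline-based nuisance estimation with mgcv mentioned in the extract is not reproduced.

```python
import math
import random


def rho(z):
    """Assumed reading of the correlation function in Setting 2."""
    return (math.exp(z ** 2) - 1.0) / (math.exp(z ** 2) + 1.0)


def draw_setting(setting, n, seed=0):
    """Draw n triples (y, x, z) from Setting 1, 2, or 3 as read from entry [11]."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        if setting == 1:
            # (Y, X) standard bivariate normal, independent of Z ~ Unif(-1, 1).
            z = rng.uniform(-1.0, 1.0)
            x = rng.gauss(0.0, 1.0)
            y = rng.gauss(0.0, 1.0)
        elif setting == 2:
            # Unit variances; corr(Y, X | Z = z) = rho(z), with Z ~ Unif(0, 1).
            z = rng.uniform(0.0, 1.0)
            r = rho(z)
            x = rng.gauss(0.0, 1.0)
            y = r * x + math.sqrt(1.0 - r * r) * rng.gauss(0.0, 1.0)
        elif setting == 3:
            # Y = 0.5 * X * 1(Z > 0) + eps, indicator kept as printed.
            z = rng.uniform(0.0, 1.0)
            x = rng.gauss(0.0, 1.0)
            y = 0.5 * x * (1.0 if z > 0 else 0.0) + rng.gauss(0.0, 1.0)
        else:
            raise ValueError("setting must be 1, 2, or 3")
        rows.append((y, x, z))
    return rows


sample = draw_setting(2, 5)
for y, x, z in sample:
    print(f"y={y:+.3f}  x={x:+.3f}  z={z:.3f}  rho(z)={rho(z):.3f}")
```

In Setting 2 the second coordinate is built as `r * x + sqrt(1 - r^2) * noise`, a standard way to obtain a unit-variance pair with correlation `r` from two independent standard normals.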
