UD-DML: Uniform Design Subsampling for Double Machine Learning over Massive Data
Pith reviewed 2026-05-08 08:01 UTC · model grok-4.3
The pith
UD-DML selects a low-discrepancy matched subsample in PCA-rotated space to let double machine learning deliver valid inference on the average treatment effect with subsample size r much smaller than n.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The UD-DML procedure first constructs a low-discrepancy skeleton in the PCA-rotated covariate space under the mixture-discrepancy criterion and then assigns to each skeleton point the nearest treated and control units via KD-tree search. Cross-fitted double machine learning is applied to the resulting matched subsample. The paper establishes discrepancy-based guarantees for representativeness and balance and proves that the UD-DML estimator is √r-asymptotically normal under mild conditions, with subsample size r ≪ n.
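The pipeline just described can be sketched end to end. This is a hypothetical illustration, not the authors' code: scrambled Sobol points stand in for the mixture-discrepancy uniform design, and the cross-fitted step uses linear outcome models with a clipped arm-share propensity in place of flexible learners.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import qmc

def ud_dml_ate(X, D, Y, r=128, seed=0):
    """Hypothetical UD-DML sketch: Sobol skeleton + KD-tree matching + AIPW."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # 1) PCA rotation of the centred covariates.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt.T
    # Rescale each rotated coordinate to [0, 1] so it is comparable
    # with the unit-cube skeleton.
    U = (Z - Z.min(0)) / (Z.max(0) - Z.min(0) + 1e-12)
    # 2) Low-discrepancy skeleton on the unit cube (scrambled Sobol points
    # stand in for the paper's mixture-discrepancy uniform design).
    skel = qmc.Sobol(d=p, scramble=True, seed=seed).random(r)
    # 3) Nearest treated and nearest control unit for each skeleton point.
    idx_t = np.flatnonzero(D == 1)
    idx_c = np.flatnonzero(D == 0)
    t_match = idx_t[cKDTree(U[idx_t]).query(skel)[1]]
    c_match = idx_c[cKDTree(U[idx_c]).query(skel)[1]]
    sub = np.unique(np.concatenate([t_match, c_match]))
    Xs, Ds, Ys = X[sub], D[sub], Y[sub]
    # 4) Two-fold cross-fitted AIPW. Linear outcome models and a clipped
    # arm-share propensity keep the sketch dependency-free; flexible
    # learners would be cross-fitted here in practice.
    m = len(sub)
    folds = rng.permutation(m) % 2
    A = np.column_stack([np.ones(m), Xs])
    psi = np.empty(m)
    for k in (0, 1):
        tr, te = folds != k, folds == k
        def mu_hat(arm):
            rows = tr & (Ds == arm)
            beta, *_ = np.linalg.lstsq(A[rows], Ys[rows], rcond=None)
            return A[te] @ beta
        mu1, mu0 = mu_hat(1), mu_hat(0)
        e = np.clip(Ds[tr].mean(), 0.05, 0.95)
        psi[te] = (mu1 - mu0
                   + Ds[te] * (Ys[te] - mu1) / e
                   - (1 - Ds[te]) * (Ys[te] - mu0) / (1 - e))
    return psi.mean()
```

With randomized treatment and linear outcomes the estimate lands near the true effect on the full sample; the skeleton choice, the constant propensity, and `ud_dml_ate` itself are illustrative stand-ins rather than the paper's exact components.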
What carries the argument
Low-discrepancy skeleton in PCA-rotated covariate space under mixture-discrepancy, followed by nearest-neighbor matching to produce a representative and balanced subsample for cross-fitted DML.
If this is right
- Nuisance-fitting cost falls from order n to order r while asymptotic normality is retained.
- The estimator produces narrower confidence intervals and better coverage than uniform subsampling, with the largest gains when overlap is limited or models are misspecified.
- Discrepancy guarantees directly control both representativeness of the covariate distribution and balance between treatment arms.
- The asymptotic guarantees hold for any subsample size r chosen substantially smaller than the original sample size n.
Where Pith is reading between the lines
- The same skeleton-and-matching construction could be applied to other low-dimensional causal parameters beyond the average treatment effect.
- If the PCA step is replaced by a sparse rotation, the procedure might extend to settings where the covariate dimension grows with n.
- Large-scale tests on datasets with tens of millions of observations would quantify the exact wall-clock savings relative to full-data DML.
Load-bearing premise
The mild conditions required for √r asymptotic normality hold: in particular, sufficient overlap, and that the PCA rotation together with nearest-neighbor matching preserves the moments needed for the discrepancy bounds to translate into valid DML error bounds.
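The overlap part of this premise is the usual strict-overlap condition; in standard notation (paraphrased, not quoted from the paper):

```latex
% Strict overlap: the propensity score e(x) = P(D = 1 \mid X = x) is
% bounded away from 0 and 1 on the support of X, for some
% \epsilon \in (0, 1/2):
\epsilon \;\le\; e(x) \;\le\; 1 - \epsilon
\qquad \text{for all } x \in \operatorname{supp}(X).
% Matching can only pair a skeleton point with both a treated and a
% control unit when such units exist nearby; this condition makes that
% increasingly likely as n grows.
```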
What would settle it
A Monte Carlo experiment on data with deliberately reduced overlap in which the empirical coverage of the UD-DML confidence intervals falls materially below the nominal level for the chosen subsample size r.
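A stripped-down version of such an experiment can be set up directly. Everything here is a hypothetical design, not the paper's: the estimator is plain AIPW on a uniform subsample with oracle nuisances, and `overlap` is a knob that pushes propensities toward 0 and 1 as it shrinks.

```python
import numpy as np

def coverage(overlap=1.0, r=400, reps=300, tau=1.0, seed=0):
    """Empirical coverage of a nominal 95% AIPW interval on a uniform
    subsample of size r, using oracle nuisances (illustration only)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        n = 5000
        x = rng.normal(size=n)
        # Smaller `overlap` steepens the propensity, degrading overlap.
        e = 1.0 / (1.0 + np.exp(-x / overlap))
        d = rng.binomial(1, e)
        y = tau * d + x + rng.normal(size=n)
        sub = rng.choice(n, size=r, replace=False)
        xs, ds, ys, es = x[sub], d[sub], y[sub], e[sub]
        mu1, mu0 = tau + xs, xs  # true outcome regressions (oracle)
        psi = (mu1 - mu0
               + ds * (ys - mu1) / es
               - (1 - ds) * (ys - mu0) / (1 - es))
        est = psi.mean()
        se = psi.std(ddof=1) / np.sqrt(r)
        hits += abs(est - tau) <= 1.96 * se
    return hits / reps
```

Replacing the oracle nuisances with estimated ones, and the uniform subsample with the UD-DML subsample, turns this skeleton into the diagnostic described above; coverage falling materially below 0.95 as `overlap` shrinks would be the telling outcome.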
Original abstract
Double machine learning (DML) delivers valid inference on low-dimensional causal parameters while permitting flexible nuisance estimation, but its computational cost becomes prohibitive once cross-fitted learners must be trained on massive observational data. Applying DML to a uniformly drawn subsample alleviates this burden, yet such a reduction disregards the geometry of the covariate space and can exacerbate treated-control imbalance as well as overlap deficiency. We propose Uniform Design Double Machine Learning (UD-DML), a design-based subsampling strategy for average treatment effect (ATE) estimation. UD-DML first constructs a low-discrepancy skeleton in a PCA-rotated covariate space under the mixture-discrepancy criterion, and then assigns, to each skeleton point, the nearest treated and control units via KD-tree search. The resulting matched subsample is, by construction, both representative of the full covariate distribution and balanced across treatment arms; cross-fitted DML is subsequently applied to it. We establish discrepancy-based guarantees for representativeness and balance, and prove that the UD-DML estimator is $\sqrt{r}$-asymptotically normal under mild conditions, where the selected subsample size $r \ll n$. The dominant nuisance-fitting cost is thereby reduced from the $n$-scale to the $r$-scale. Monte Carlo experiments show that UD-DML attains lower RMSE, narrower confidence intervals and more reliable coverage than uniform subsampling, with the largest gains in low-overlap and misspecified regimes. An application to a large observational dataset further demonstrates its practical feasibility.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Uniform Design Double Machine Learning (UD-DML) for ATE estimation on massive data: it builds a low-discrepancy skeleton in PCA-rotated covariate space under the mixture-discrepancy criterion, assigns nearest treated and control units to each skeleton point via KD-tree nearest-neighbor matching, and then runs cross-fitted DML on the resulting balanced subsample of size r ≪ n. It claims discrepancy-based guarantees for representativeness and balance, proves that the UD-DML estimator is √r-asymptotically normal under mild conditions, and demonstrates via simulations and a real-data example that it reduces nuisance-fitting cost while improving RMSE, CI width, and coverage relative to uniform subsampling, especially under low overlap or misspecification.
Significance. If the central asymptotic claim holds after accounting for the data-dependent matching step, the work provides a principled, design-based route to scale DML to large observational datasets while preserving valid inference and gaining finite-sample robustness in difficult regimes. The explicit use of uniform-design discrepancy theory to control both representativeness and treatment balance is a concrete strength that could influence future subsampling methods in causal machine learning.
major comments (2)
- [§4, Theorem 3] The proof of √r-asymptotic normality relies on the matched subsample satisfying the standard DML nuisance-rate conditions (o_p(r^{-1/4}) or faster), yet the argument only bounds discrepancy for the ideal skeleton; it does not explicitly show that the post-PCA, post-nearest-neighbor empirical measure deviates from the target by at most o_p(r^{-1/2}) in the relevant norms, leaving open whether matching-induced bias in low-density or poor-overlap regions inflates the remainder term beyond what is claimed.
- [§3.2, conditions preceding Theorem 3] The paper assumes that KD-tree nearest-neighbor assignment after a data-dependent PCA rotation preserves the moment and overlap conditions needed for the DML expansion, but gives no quantitative bound on how the matching error scales with r or with local density; without one, the translation from skeleton discrepancy to the required nuisance-estimator rates is not fully load-bearing.
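For readers outside the DML literature, the bookkeeping behind these objections can be written out. The following is the standard cross-fitting decomposition in generic notation, not the paper's exact statement:

```latex
% With a Neyman-orthogonal score \psi, target \tau_0, true nuisances
% \eta_0, and subsample size r, the DML expansion reads
\sqrt{r}\,(\hat{\tau} - \tau_0)
  \;=\;
  \underbrace{\frac{1}{\sqrt{r}} \sum_{i=1}^{r}
    \tilde{\psi}(W_i;\tau_0,\eta_0)}_{\Rightarrow\; \mathcal{N}(0,\sigma^2)}
  \;+\; \sqrt{r}\, R_r .
% The CLT term needs the matched subsample to behave like a draw from
% the target law; the remainder needs
R_r = o_p\!\big(r^{-1/2}\big),
\quad \text{typically via} \quad
\lVert \hat{\eta} - \eta_0 \rVert_{L_2} = o_p\!\big(r^{-1/4}\big).
% The referee's point: discrepancy bounds for the ideal skeleton do not,
% by themselves, establish either property for the post-matching
% empirical measure.
```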
minor comments (2)
- [§2] The mixture-discrepancy definition and its relation to the PCA rotation should be stated explicitly in the main text (or a short appendix) rather than only referenced, to aid readers outside uniform-design theory.
- [§5] Simulation tables: report the exact ratio r/n used in each Monte Carlo setting and confirm that the reported coverage is for the √r-normalized intervals.
Simulated Author's Rebuttal
We thank the referee for the thorough review and for identifying key points in the asymptotic analysis that require clarification. We will revise the manuscript to strengthen the proof of Theorem 3 by making the bounds on the post-matching empirical measure explicit. Our responses to the major comments follow.
Point-by-point responses
-
Referee: [§4, Theorem 3] The proof of √r-asymptotic normality relies on the matched subsample satisfying the standard DML nuisance-rate conditions (o_p(r^{-1/4}) or faster), yet the argument only bounds discrepancy for the ideal skeleton; it does not explicitly show that the post-PCA, post-nearest-neighbor empirical measure deviates from the target by at most o_p(r^{-1/2}) in the relevant norms, leaving open whether matching-induced bias in low-density or poor-overlap regions inflates the remainder term beyond what is claimed.
Authors: We agree that the current write-up of the proof in §4 focuses primarily on the discrepancy bound for the ideal skeleton and invokes the regularity conditions to transfer the rates to the matched subsample. In the revision we will insert an intermediate lemma that bounds the total variation (or appropriate integral probability metric) distance between the post-PCA, post-NN empirical measure and the target measure. Under Assumptions 1–3 the additional discrepancy contributed by KD-tree matching is shown to be O_p(r^{-1/2} log r) in the relevant function class, which is absorbed into the o_p(r^{-1/2}) term required for the DML remainder. This step uses the fact that PCA rotation aligns the principal axes with the directions of highest density variation, thereby controlling the local matching error even in regions of moderate overlap. revision: yes
-
Referee: [§3.2, conditions preceding Theorem 3] The paper assumes that KD-tree nearest-neighbor assignment after a data-dependent PCA rotation preserves the moment and overlap conditions needed for the DML expansion, but gives no quantitative bound on how the matching error scales with r or with local density; without one, the translation from skeleton discrepancy to the required nuisance-estimator rates is not fully load-bearing.
Authors: We acknowledge that a quantitative scaling of the matching error with r and local density is not stated explicitly. The revised manuscript will add a supporting lemma (placed after the description of the KD-tree step in §3.2) that derives E[matching distance] = O(r^{-1/d_eff}) where d_eff is the effective dimension after PCA truncation, together with a high-probability bound on the deviation of the empirical moments and the propensity-score overlap measure. These bounds are obtained by combining the low-discrepancy property of the skeleton with standard covering-number arguments for nearest-neighbor search in Euclidean space. The resulting rates are sufficient to keep the nuisance estimators inside the o_p(r^{-1/4}) envelope required by Theorem 3. revision: yes
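The scaling invoked in this response can be made explicit. The following is a paraphrase of what the proposed lemma would need to deliver, with one consequence worth flagging:

```latex
% For r quasi-uniform skeleton points u_j in an effectively
% d_eff-dimensional region whose density is bounded below, the
% nearest-neighbor match distance scales as
\mathbb{E}\big[\, \lVert u_j - X_{\mathrm{NN}(j)} \rVert \,\big]
  \;=\; O\!\big(r^{-1/d_{\mathrm{eff}}}\big),
% so, for Lipschitz nuisances, matching perturbs empirical moments at
% the same rate. Since
r^{-1/d_{\mathrm{eff}}} = o\!\big(r^{-1/4}\big)
  \iff d_{\mathrm{eff}} < 4,
% keeping the perturbation inside the o_p(r^{-1/4}) envelope implicitly
% restricts the effective dimension (or requires a sharper argument).
```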
Circularity Check
No circularity: derivation combines external discrepancy theory with standard DML asymptotics
full rationale
The paper constructs a low-discrepancy skeleton via mixture-discrepancy minimization in PCA space, matches nearest neighbors, then invokes standard DML cross-fitting and asymptotic normality results on the resulting subsample of size r. The claimed √r-normality is obtained by showing that the discrepancy guarantees imply the required o_p(r^{-1/4}) nuisance rates under the listed mild conditions; this step does not redefine the target parameter in terms of itself, rename a fitted quantity as a prediction, or rest on a self-citation chain whose validity is internal to the present work. The central claim therefore remains independent of the procedure's own outputs.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: standard DML assumptions, namely unconfoundedness, overlap, and nuisance estimators that converge at appropriate rates.
- Standard math: low-discrepancy properties of the uniform-design skeleton are preserved after PCA rotation and nearest-neighbor assignment.