Locally Optimal Design for A/B Testing in the Presence of Covariates and Network Connection
Pith reviewed 2026-05-24 14:03 UTC · model grok-4.3
The pith
Incorporating network correlations into A/B test designs via a conditional autoregressive model produces treatment assignments with lower variance in the estimated effect.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that the optimal treatment allocation is the one that minimizes the variance of the estimated treatment effect under a conditional autoregressive (CAR) model. Because this criterion depends on an unknown network-correlation parameter, the design is locally optimal, that is, optimal for a fixed guess of that parameter. The paper further claims that the resulting designs outperform designs that ignore the network structure.
What carries the argument
A design criterion based on the variance of the treatment effect estimator in a conditional autoregressive model that incorporates the network adjacency matrix.
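A minimal sketch of such a criterion, under assumptions that are not taken from the paper: responses follow a linear model with an intercept, a ±1 treatment indicator `x`, and optional covariates `Z`, and the errors have the proper-CAR precision matrix `D - rho * W` (with `W` the adjacency matrix, `D` the diagonal degree matrix, and unit scale). The function names `car_precision` and `treatment_variance` are hypothetical. The criterion is taken here to be the treatment-effect entry of the inverse information matrix of the generalized least squares estimator.

```python
import numpy as np

def car_precision(W, rho):
    """Precision matrix D - rho * W of a proper CAR model with unit scale (assumed form)."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - rho * W

def treatment_variance(x, W, rho, Z=None):
    """Variance of the GLS treatment-effect estimate for an assignment x in {-1, +1}^n."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x]                    # intercept and treatment columns
    if Z is not None:
        cols.append(np.asarray(Z, dtype=float))    # covariates, shape (n, p)
    F = np.column_stack(cols)
    info = F.T @ car_precision(W, rho) @ F         # information matrix
    return np.linalg.inv(info)[1, 1]               # entry for the treatment effect

# Toy ring network on four users: 0-1-2-3-0
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])
var_alternating = treatment_variance([1, -1, 1, -1], W, rho=0.5)
var_blocked = treatment_variance([1, 1, -1, -1], W, rho=0.5)
print(var_alternating, var_blocked)
```

In this toy case the alternating assignment gives the smaller variance (1/12 versus 1/8), because with a positive correlation parameter every pair of connected users receives opposite treatments.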
Load-bearing premise
The correlation between responses of connected users follows a specific conditional autoregressive model whose strength parameter can be fixed in advance for design purposes.
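For orientation, one common textbook parameterization of a CAR model is written out below. It is an assumption here, and the paper's exact specification may differ. The symbols are illustrative: $w_{ij}$ are the entries of the adjacency matrix $W$, $d_i = \sum_j w_{ij}$ is the degree of user $i$, $D = \mathrm{diag}(d_1, \dots, d_n)$, $\mu_i$ collects the treatment and covariate terms for user $i$, and $\rho$ is the network-correlation parameter.

```latex
y_i \mid y_{-i} \sim \mathcal{N}\!\left( \mu_i + \frac{\rho}{d_i} \sum_{j \ne i} w_{ij}\,(y_j - \mu_j),\; \frac{\sigma^2}{d_i} \right)
\quad\Longrightarrow\quad
y \sim \mathcal{N}\!\left( \mu,\; \sigma^2 (D - \rho W)^{-1} \right), \qquad |\rho| < 1,\; d_i > 0 .
```

Under this form, fixing $\rho$ at a guessed value determines the covariance up to the scale $\sigma^2$, which is what allows a variance-based criterion to be evaluated before any responses are observed.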
What would settle it
Run A/B tests on the same network under the proposed design and under a design that ignores the network, then check whether the observed variance of the treatment-effect estimate is smaller under the proposed design.
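A simulated version of this check can be sketched as follows. It is an illustration only, not the paper's experiment: the responses are drawn from an assumed CAR covariance `(D - rho * W)^{-1}` on a 12-user ring network, the estimator is generalized least squares with the precision matrix treated as known, and the names `gls_treatment_estimate` and `empirical_variance` are hypothetical.

```python
import numpy as np

def gls_treatment_estimate(y, x, Q):
    """GLS estimate of the treatment coefficient with known precision matrix Q."""
    F = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.solve(F.T @ Q @ F, F.T @ Q @ y)
    return beta[1]

def empirical_variance(x, W, rho, tau=1.0, reps=4000, seed=0):
    """Sample variance of the treatment estimate over simulated CAR responses."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    Q = np.diag(W.sum(axis=1)) - rho * W            # assumed CAR precision, unit scale
    L = np.linalg.cholesky(np.linalg.inv(Q))        # factor of the covariance matrix
    noise = rng.standard_normal((reps, len(x))) @ L.T   # rows have covariance Q^{-1}
    estimates = [gls_treatment_estimate(tau * x + e, x, Q) for e in noise]
    return np.var(estimates, ddof=1)

# Ring network on n users
n = 12
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0

alternating = np.array([1.0, -1.0] * (n // 2))               # uses the network
blocked = np.array([1.0] * (n // 2) + [-1.0] * (n // 2))     # ignores the network
v_alt = empirical_variance(alternating, W, rho=0.5)
v_blk = empirical_variance(blocked, W, rho=0.5)
print(v_alt, v_blk)
```

Under these assumptions the theoretical variances are 1/36 for the alternating design and 1/16 for the blocked design, so the simulated values should land close to those figures. A real test would also have to contend with a misspecified model, which this sketch does not cover.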
Original abstract
A/B test, a simple type of controlled experiment, refers to the statistical procedure of experimenting to compare two treatments applied to test subjects. For example, many IT companies frequently conduct A/B tests on their users who are connected and form social networks. Often, the users' responses could be related to the network connection. In this paper, we assume that the users, or the test subjects of the experiments, are connected on an undirected network, and the responses of two connected users are correlated. We include the treatment assignment, covariate features, and network connection in a conditional autoregressive model. Based on this model, we propose a design criterion that measures the variance of the estimated treatment effect and allocate the treatment settings to the test subjects by minimizing the criterion. Since the design criterion depends on an unknown network correlation parameter, we adopt the locally optimal design method and develop a hybrid optimization approach to obtain the optimal design. Through synthetic and real social network examples, we demonstrate the value of including network dependence in designing A/B experiments and validate that the proposed locally optimal design is robust to the choices of parameters.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a locally optimal design for A/B testing on subjects connected by an undirected network. Responses are modeled via a conditional autoregressive (CAR) process that incorporates treatment assignments, covariates, and network edges. A design criterion equal to the variance of the treatment-effect estimator is minimized by a hybrid optimization procedure after fixing the unknown scalar network-correlation parameter; synthetic and real-network examples are used to illustrate efficiency gains from accounting for dependence and to claim robustness to the choice of that parameter.
Significance. Accounting for network dependence when designing experiments on social graphs is a relevant practical extension of classical optimal-design methods. If the variance criterion and the hybrid optimizer can be shown to deliver reliable gains under realistic departures from the CAR assumption, the approach could improve power in networked A/B tests; the examples already demonstrate that ignoring the network can inflate variance.
major comments (3)
- [Model and design criterion] The design criterion is a direct function of the unknown network-correlation parameter in the CAR model (abstract and model section). Local optimality is obtained only after fixing a value for this parameter; the manuscript therefore reports optimality conditional on a quantity that must be supplied or estimated outside the derivation itself.
- [Examples] Synthetic data are generated from the identical CAR specification used to derive the design criterion, and the real-network analysis likewise assumes the same undirected adjacency structure. Consequently the reported efficiency gains and robustness claims do not address model misspecification (directed edges, higher-order neighbors, or covariate-by-network interactions).
- [Examples] No error bars, standard errors, or explicit numerical comparison metrics (e.g., relative efficiency ratios with confidence intervals) are reported for the example results. This leaves the magnitude and stability of the claimed improvements unquantified.
minor comments (2)
- [Abstract] The abstract states that robustness is validated but does not summarize the concrete metrics or ranges of parameter values examined.
- [Throughout] Notation for the adjacency matrix and the correlation parameter should be introduced once and used consistently; occasional redefinition interrupts readability.
Simulated Author's Rebuttal
We thank the referee for the thoughtful comments on our manuscript. We address each major comment below, indicating planned revisions where appropriate.
- Referee: [Model and design criterion] The design criterion is a direct function of the unknown network-correlation parameter in the CAR model (abstract and model section). Local optimality is obtained only after fixing a value for this parameter; the manuscript therefore reports optimality conditional on a quantity that must be supplied or estimated outside the derivation itself.
  Authors: We agree that the design is locally optimal conditional on a fixed value of the network-correlation parameter ρ. This is standard in the theory of locally optimal designs whenever the criterion depends on unknown parameters (as occurs with covariance structures or nonlinear models). The manuscript develops a hybrid optimization procedure applicable once ρ is specified and demonstrates robustness of the resulting designs to the choice of ρ. We will revise the text to state more explicitly that optimality is local with respect to ρ and to recommend obtaining a preliminary estimate from pilot data when applying the method.
  Revision: partial
- Referee: [Examples] Synthetic data are generated from the identical CAR specification used to derive the design criterion, and the real-network analysis likewise assumes the same undirected adjacency structure. Consequently the reported efficiency gains and robustness claims do not address model misspecification (directed edges, higher-order neighbors, or covariate-by-network interactions).
  Authors: The referee correctly notes that the examples are generated under the assumed CAR model with undirected edges and therefore do not examine robustness to misspecifications such as directed edges, higher-order neighbors, or covariate-by-network interactions. The paper's scope is the derivation and illustration of the method under the stated model, with robustness checks confined to the scalar parameter ρ. We will add a paragraph in the discussion section clarifying this scope and noting the need for future work on model misspecification.
  Revision: partial
- Referee: [Examples] No error bars, standard errors, or explicit numerical comparison metrics (e.g., relative efficiency ratios with confidence intervals) are reported for the example results. This leaves the magnitude and stability of the claimed improvements unquantified.
  Authors: We accept that the example results would be strengthened by quantitative metrics. In the revision we will add tables reporting relative efficiency ratios of the proposed design versus random assignment and covariate-balanced designs. For the synthetic examples we will also report standard deviations of the efficiency measures across repeated simulation runs to quantify stability.
  Revision: yes
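The kind of metric promised in the last response could be computed along the following lines. This is a hedged sketch and not the authors' code: it assumes a CAR precision matrix of the form `D - rho * W`, a model with only an intercept and a ±1 treatment indicator, and a ring network as a stand-in for a real graph. The helper names `treatment_variance` and `relative_efficiency_summary` are hypothetical. The ratio reported is the variance under a random balanced assignment divided by the variance under the candidate design, so values above 1 favour the candidate.

```python
import numpy as np

def treatment_variance(x, W, rho):
    """Variance of the GLS treatment-effect estimate under the assumed CAR precision."""
    x = np.asarray(x, dtype=float)
    F = np.column_stack([np.ones_like(x), x])
    Q = np.diag(W.sum(axis=1)) - rho * W
    return np.linalg.inv(F.T @ Q @ F)[1, 1]

def relative_efficiency_summary(candidate, W, rho, draws=500, seed=0):
    """Mean and SD of var(random balanced design) / var(candidate) over random draws."""
    rng = np.random.default_rng(seed)
    base = treatment_variance(candidate, W, rho)
    ratios = np.array([
        # a random permutation keeps the same number of +1 and -1 assignments
        treatment_variance(rng.permutation(candidate), W, rho) / base
        for _ in range(draws)
    ])
    return ratios.mean(), ratios.std(ddof=1)

# Ring network on 12 users, with an alternating candidate design
n = 12
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0
alternating = np.array([1.0, -1.0] * (n // 2))

mean_ratio, sd_ratio = relative_efficiency_summary(alternating, W, rho=0.5)
print(f"relative efficiency: mean {mean_ratio:.3f}, SD {sd_ratio:.3f}")
```

Reporting the standard deviation alongside the mean shows how much the comparison depends on which random assignment happens to be drawn, which is the stability question the referee raises.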
Circularity Check
No significant circularity; local optimality is explicitly conditional on a fixed parameter value
The paper's central derivation defines a variance-based design criterion from the CAR model, notes its dependence on the unknown correlation parameter, and applies the standard locally optimal design procedure by substituting a fixed value before optimizing the assignment. This is not a reduction of the result to its inputs by construction; the optimality is declared conditional on the plugged-in value, and robustness is assessed separately via synthetic and real-network examples. No self-citation chains, self-definitional loops, fitted inputs renamed as predictions, or ansatzes smuggled via prior work appear in the load-bearing steps. The approach is self-contained against external benchmarks once the modeling assumptions are granted.
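The plug-in procedure described above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the CAR precision is taken to be `D - rho * W`, the model contains only an intercept and a ±1 treatment indicator, and a greedy pairwise-swap search stands in for the paper's hybrid optimization approach, which it does not reproduce. The names `treatment_variance` and `local_search_design` are hypothetical.

```python
import numpy as np

def treatment_variance(x, W, rho):
    """Variance of the GLS treatment-effect estimate under the assumed CAR precision."""
    F = np.column_stack([np.ones_like(x), x])
    Q = np.diag(W.sum(axis=1)) - rho * W
    return np.linalg.inv(F.T @ Q @ F)[1, 1]

def local_search_design(W, rho0, seed=0, max_sweeps=50):
    """Greedy pairwise-swap search for a balanced assignment at a fixed guess rho0."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    x = rng.permutation(np.array([1.0] * (n // 2) + [-1.0] * (n - n // 2)))
    best = treatment_variance(x, W, rho0)
    for _ in range(max_sweeps):
        improved = False
        for i in range(n):
            for j in range(i + 1, n):
                if x[i] == x[j]:
                    continue
                x[i], x[j] = x[j], x[i]           # swapping keeps the design balanced
                value = treatment_variance(x, W, rho0)
                if value < best - 1e-12:
                    best, improved = value, True
                else:
                    x[i], x[j] = x[j], x[i]       # undo a non-improving swap
        if not improved:
            break
    return x, best

# Ring on 10 users plus two chords, as a stand-in network
n = 10
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0
for a, b in [(0, 5), (2, 7)]:
    W[a, b] = W[b, a] = 1.0

x0, best0 = local_search_design(W, rho0=0.3)       # design built at the guessed value
for rho_true in (0.1, 0.5, 0.8):
    _, v_true = local_search_design(W, rho0=rho_true)
    efficiency = v_true / treatment_variance(x0, W, rho_true)
    print(rho_true, round(efficiency, 3))
```

An efficiency near 1 at a given `rho_true` would indicate that the design built at the guessed value loses little when the true parameter differs. Because the search is a heuristic, the ratio is not guaranteed to stay at or below 1.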
Axiom & Free-Parameter Ledger
free parameters (1)
- network correlation parameter
axioms (1)
- domain assumption: Responses of connected users follow a conditional autoregressive model whose dependence structure is fully captured by the given undirected network adjacency matrix.