arxiv: 2008.06476 · v3 · submitted 2020-08-14 · 📊 stat.ME

Locally Optimal Design for A/B Testing in the Presence of Covariates and Network Connection

Pith reviewed 2026-05-24 14:03 UTC · model grok-4.3

classification 📊 stat.ME
keywords A/B testing · optimal experimental design · network dependence · conditional autoregressive model · treatment effect estimation · covariates · social networks

The pith

Incorporating network correlations into A/B test designs via a conditional autoregressive model produces treatment assignments with lower variance in the estimated effect.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a method for designing A/B experiments when test subjects are linked in an undirected network and their responses are correlated through those links. It models the responses with a conditional autoregressive structure that accounts for treatment assignments, covariates, and network connections. The design criterion targets the variance of the treatment effect estimator, and because the correlation strength is unknown, the authors use locally optimal designs optimized via a hybrid approach. Synthetic and real data examples show that accounting for the network improves design efficiency and that the method is robust to parameter choices. A sympathetic reader would care because many real-world experiments, such as those on social platforms, involve interdependent users where standard independent designs waste statistical power.

Core claim

The paper claims that the optimal treatment allocation is found by minimizing a criterion equal to the variance of the estimated treatment effect under the CAR model, using local optimality to handle the unknown network correlation parameter, and this yields designs superior to those ignoring the network structure.
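For orientation, the claim can be written out under one common CAR parameterization. This is a sketch under stated assumptions: the proper-CAR precision proportional to D − ρW, the ±1 treatment coding, and the symbols F, γ, D, W and σ² are all introduced here for illustration. Only the criterion notation T(x, ρ) and the plug-in value ρ0 appear in the figure captions reproduced on this page.

```latex
% Assumed model: intercept, treatment x, covariates Z, CAR-correlated errors
y = F\gamma + \delta, \qquad F = [\,\mathbf{1},\; x,\; Z\,], \qquad
\delta \sim N\!\left(0,\; \sigma^{2}\,(D - \rho W)^{-1}\right)

% Criterion: variance of the generalized least squares estimate of the treatment effect
T(x, \rho) = \sigma^{2}\,\Big[\big(F^{\top}(D - \rho W)\,F\big)^{-1}\Big]_{22}

% Locally optimal design: plug in a fixed value \rho_0 and minimize over assignments
x^{*} = \arg\min_{x \in \{-1,\,1\}^{n}} T(x, \rho_{0})
```

Here W would be the adjacency matrix of the undirected network and D the diagonal matrix of node degrees, so the network enters the criterion only through D − ρW.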

What carries the argument

A design criterion based on the variance of the treatment effect estimator in a conditional autoregressive model that incorporates the network adjacency matrix.
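A minimal sketch of how such a criterion might be evaluated for a given assignment follows. The precision form D − ρW, the ±1 treatment coding, and the function name `treatment_variance` are assumptions made for illustration, not details confirmed by the text above.

```python
import numpy as np

def treatment_variance(x, Z, W, rho):
    """Variance of the GLS treatment-effect estimate, up to the factor sigma^2.

    Assumption (not taken from the paper): proper CAR errors with
    precision proportional to D - rho * W, where D holds node degrees.
    """
    n = len(x)
    F = np.column_stack([np.ones(n), x, Z])   # intercept, treatment, covariates
    Q = np.diag(W.sum(axis=1)) - rho * W      # assumed CAR precision
    info = F.T @ Q @ F                        # information matrix of the coefficients
    return np.linalg.inv(info)[1, 1]          # entry for the treatment coefficient

# Toy example: 4 users on a ring, one +/-1 covariate.
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
Z = np.array([[1.0], [-1.0], [-1.0], [1.0]])
alternating = np.array([1.0, -1.0, 1.0, -1.0])  # neighbours get opposite arms
clustered = np.array([1.0, 1.0, -1.0, -1.0])    # adjacent pairs share an arm

v_alt = treatment_variance(alternating, Z, W, rho=0.5)
v_clu = treatment_variance(clustered, Z, W, rho=0.5)
print(v_alt, v_clu)
```

On this toy ring the alternating assignment gives 1/12 and the clustered assignment 1/8 (in units of σ²), so two equally balanced designs can differ in variance purely because of where the treatments sit on the network.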

Load-bearing premise

The correlation between responses of connected users follows a specific conditional autoregressive model whose strength parameter can be fixed in advance for design purposes.
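One way to probe how much rides on the fixed parameter is to score the same random designs under two different values and see whether they rank similarly, in the spirit of the scatter plot described in Figure 1. The sketch below assumes a criterion of the form [(Fᵀ(D − ρW)F)⁻¹] at the treatment entry; the network, the covariates, and the values 0.5 and 0.9 are hypothetical choices made for illustration.

```python
import numpy as np

def criterion(x, Z, W, rho):
    # Assumed form: treatment entry of (F' (D - rho W) F)^{-1}, up to sigma^2.
    F = np.column_stack([np.ones(len(x)), x, Z])
    Q = np.diag(W.sum(axis=1)) - rho * W
    return np.linalg.inv(F.T @ Q @ F)[1, 1]

rng = np.random.default_rng(0)
n = 30

# Hypothetical network: a ring plus random chords, so no node is isolated.
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0
for i, j in rng.integers(0, n, size=(20, 2)):
    if i != j:
        W[i, j] = W[j, i] = 1.0

Z = rng.normal(size=(n, 2))                    # two synthetic covariates
base = np.repeat([1.0, -1.0], n // 2)          # balanced +/-1 assignment
designs = [rng.permutation(base) for _ in range(200)]

rho0, rho_alt = 0.5, 0.9                       # planning value vs. another candidate
t0 = np.array([criterion(x, Z, W, rho0) for x in designs])
t1 = np.array([criterion(x, Z, W, rho_alt) for x in designs])

# A correlation near 1 would mean designs rank similarly under both values.
corr = np.corrcoef(t0, t1)[0, 1]
print(corr)
```

If the two score vectors are strongly correlated, a design chosen at the planning value should remain competitive at the alternative value; a weak correlation would signal that the choice of the fixed parameter matters.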

What would settle it

Run A/B tests on the same network under the proposed design and under a design that ignores the network, then check whether the observed variance of the treatment effect estimate is smaller under the proposed design.

Figures

Figures reproduced from arXiv: 2008.06476 by Lulu Kang, Qiong Zhang.

Figure 1: Scatter plot of T(x, ρ0) and T(x, ρ) of each pair of (ρ0, ρ) with 1000 randomly generated designs.
Figure 2: Visualization of the optimal design allocation. Two treatments are denoted by different …
Figure 3: PIP(x) of the locally optimal designs with p = 10 and network density 0.08.
Figure 4: The differences of PIP between the locally optimal design with …
Figure 5: The PIP values of optimal designs with and without network with …
Figure 6: The PIPs of the locally optimal design with network and the optimal design without …
Figure 7: Boxplots of percentages of PIP values of two kinds of optimal designs for case study.
Figure 8: The percentiles of MSEs of the two optimal design approaches (i.e., with and without …
Original abstract

A/B test, a simple type of controlled experiment, refers to the statistical procedure of experimenting to compare two treatments applied to test subjects. For example, many IT companies frequently conduct A/B tests on their users who are connected and form social networks. Often, the users' responses could be related to the network connection. In this paper, we assume that the users, or the test subjects of the experiments, are connected on an undirected network, and the responses of two connected users are correlated. We include the treatment assignment, covariate features, and network connection in a conditional autoregressive model. Based on this model, we propose a design criterion that measures the variance of the estimated treatment effect and allocate the treatment settings to the test subjects by minimizing the criterion. Since the design criterion depends on an unknown network correlation parameter, we adopt the locally optimal design method and develop a hybrid optimization approach to obtain the optimal design. Through synthetic and real social network examples, we demonstrate the value of including network dependence in designing A/B experiments and validate that the proposed locally optimal design is robust to the choices of parameters.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes a locally optimal design for A/B testing on subjects connected by an undirected network. Responses are modeled via a conditional autoregressive (CAR) process that incorporates treatment assignments, covariates, and network edges. A design criterion equal to the variance of the treatment-effect estimator is minimized by a hybrid optimization procedure after fixing the unknown scalar network-correlation parameter; synthetic and real-network examples are used to illustrate efficiency gains from accounting for dependence and to claim robustness to the choice of that parameter.

Significance. Accounting for network dependence when designing experiments on social graphs is a relevant practical extension of classical optimal-design methods. If the variance criterion and the hybrid optimizer can be shown to deliver reliable gains under realistic departures from the CAR assumption, the approach could improve power in networked A/B tests; the examples already demonstrate that ignoring the network can inflate variance.

major comments (3)
  1. [Model and design criterion] The design criterion is a direct function of the unknown network-correlation parameter in the CAR model (abstract and model section). Local optimality is obtained only after fixing a value for this parameter; the manuscript therefore reports optimality conditional on a quantity that must be supplied or estimated outside the derivation itself.
  2. [Examples] Synthetic data are generated from the identical CAR specification used to derive the design criterion, and the real-network analysis likewise assumes the same undirected adjacency structure. Consequently the reported efficiency gains and robustness claims do not address model misspecification (directed edges, higher-order neighbors, or covariate-by-network interactions).
  3. [Examples] No error bars, standard errors, or explicit numerical comparison metrics (e.g., relative efficiency ratios with confidence intervals) are reported for the example results. This leaves the magnitude and stability of the claimed improvements unquantified.
minor comments (2)
  1. [Abstract] The abstract states that robustness is validated but does not summarize the concrete metrics or ranges of parameter values examined.
  2. [Throughout] Notation for the adjacency matrix and the correlation parameter should be introduced once and used consistently; occasional redefinition interrupts readability.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful comments on our manuscript. We address each major comment below, indicating planned revisions where appropriate.

point-by-point responses
  1. Referee: [Model and design criterion] The design criterion is a direct function of the unknown network-correlation parameter in the CAR model (abstract and model section). Local optimality is obtained only after fixing a value for this parameter; the manuscript therefore reports optimality conditional on a quantity that must be supplied or estimated outside the derivation itself.

    Authors: We agree that the design is locally optimal conditional on a fixed value of the network-correlation parameter ρ. This is standard in the theory of locally optimal designs whenever the criterion depends on unknown parameters (as occurs with covariance structures or nonlinear models). The manuscript develops a hybrid optimization procedure applicable once ρ is specified and demonstrates robustness of the resulting designs to the choice of ρ. We will revise the text to state more explicitly that optimality is local with respect to ρ and to recommend obtaining a preliminary estimate from pilot data when applying the method. revision: partial

  2. Referee: [Examples] Synthetic data are generated from the identical CAR specification used to derive the design criterion, and the real-network analysis likewise assumes the same undirected adjacency structure. Consequently the reported efficiency gains and robustness claims do not address model misspecification (directed edges, higher-order neighbors, or covariate-by-network interactions).

    Authors: The referee correctly notes that the examples are generated under the assumed CAR model with undirected edges and therefore do not examine robustness to misspecifications such as directed edges, higher-order neighbors, or covariate-by-network interactions. The paper's scope is the derivation and illustration of the method under the stated model, with robustness checks confined to the scalar parameter ρ. We will add a paragraph in the discussion section clarifying this scope and noting the need for future work on model misspecification. revision: partial

  3. Referee: [Examples] No error bars, standard errors, or explicit numerical comparison metrics (e.g., relative efficiency ratios with confidence intervals) are reported for the example results. This leaves the magnitude and stability of the claimed improvements unquantified.

    Authors: We accept that the example results would be strengthened by quantitative metrics. In the revision we will add tables reporting relative efficiency ratios of the proposed design versus random assignment and covariate-balanced designs. For the synthetic examples we will also report standard deviations of the efficiency measures across repeated simulation runs to quantify stability. revision: yes
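The relative efficiency ratios and run-to-run standard deviations mentioned in the last response could be tabulated along the following lines. This is only a sketch: the criterion form, the ring network, and the helper `one_run` are assumptions, and the "candidate" design is a placeholder (the best of the random draws), not the output of the authors' optimizer.

```python
import numpy as np

def criterion(x, Z, W, rho):
    # Assumed form: treatment entry of (F' (D - rho W) F)^{-1}, up to sigma^2.
    F = np.column_stack([np.ones(len(x)), x, Z])
    Q = np.diag(W.sum(axis=1)) - rho * W
    return np.linalg.inv(F.T @ Q @ F)[1, 1]

def one_run(seed, n=20, rho=0.5, n_random=50):
    """Relative efficiency of a candidate design versus random balanced assignment."""
    rng = np.random.default_rng(seed)
    W = np.zeros((n, n))
    for i in range(n):                          # hypothetical ring network
        W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0
    Z = rng.normal(size=(n, 2))                 # synthetic covariates
    base = np.repeat([1.0, -1.0], n // 2)
    vals = np.array([criterion(rng.permutation(base), Z, W, rho)
                     for _ in range(n_random)])
    candidate = vals.min()   # placeholder: best random draw stands in for an optimized design
    return vals.mean() / candidate              # above 1 means the candidate has lower variance

ratios = np.array([one_run(seed) for seed in range(20)])
print(f"relative efficiency: mean={ratios.mean():.3f}, sd={ratios.std(ddof=1):.3f}")
```

Swapping the placeholder for a genuinely optimized design, and the random baseline for a covariate-balanced one, would give the two comparisons named in the response.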

Circularity Check

0 steps flagged

No significant circularity; local optimality is explicitly conditional on a fixed parameter value

full rationale

The paper's central derivation defines a variance-based design criterion from the CAR model, notes its dependence on the unknown correlation parameter, and applies the standard locally optimal design procedure by substituting a fixed value before optimizing the assignment. This is not a reduction of the result to its inputs by construction; the optimality is declared conditional on the plugged-in value, and robustness is assessed separately via synthetic and real-network examples. No self-citation chains, self-definitional loops, fitted inputs renamed as predictions, or ansatzes smuggled via prior work appear in the load-bearing steps. The approach is self-contained against external benchmarks once the modeling assumptions are granted.
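The "substitute a fixed value, then optimize the assignment" step can be illustrated with a simple pairwise-swap descent. This is a stand-in heuristic chosen for brevity, not the hybrid optimization approach the paper describes, whose details are not given on this page. The criterion form and the names `swap_search` and `rho0` are assumptions.

```python
import numpy as np

def criterion(x, Z, W, rho):
    # Assumed form: treatment entry of (F' (D - rho W) F)^{-1}, up to sigma^2.
    F = np.column_stack([np.ones(len(x)), x, Z])
    Q = np.diag(W.sum(axis=1)) - rho * W
    return np.linalg.inv(F.T @ Q @ F)[1, 1]

def swap_search(x0, Z, W, rho0, n_sweeps=5):
    """Greedy pairwise-swap descent on the criterion at rho0; balance is preserved."""
    x = x0.copy()
    best = criterion(x, Z, W, rho0)
    n = len(x)
    for _ in range(n_sweeps):
        improved = False
        for i in range(n):
            for j in range(i + 1, n):
                if x[i] == x[j]:
                    continue                      # swapping equal arms changes nothing
                x[i], x[j] = x[j], x[i]           # trial swap
                val = criterion(x, Z, W, rho0)
                if val < best - 1e-12:
                    best, improved = val, True    # keep the improving swap
                else:
                    x[i], x[j] = x[j], x[i]       # undo
        if not improved:
            break                                 # local optimum under pairwise swaps
    return x, best

rng = np.random.default_rng(1)
n = 12
W = np.zeros((n, n))
for i in range(n):                                # hypothetical ring network
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0
Z = rng.normal(size=(n, 2))
x_start = rng.permutation(np.repeat([1.0, -1.0], n // 2))
start_val = criterion(x_start, Z, W, 0.5)
x_opt, opt_val = swap_search(x_start, Z, W, rho0=0.5)
print(start_val, opt_val)
```

The result is optimal only relative to the plugged-in value and to the swap neighbourhood, which is the same conditional sense of optimality described in the rationale above.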

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on the CAR model for network dependence and on the ability to treat the correlation parameter as fixed when constructing the design; no new entities are postulated.

free parameters (1)
  • network correlation parameter
    The variance criterion used for design optimization depends on this unknown scalar; the locally optimal assignment is computed after fixing a value for it.
axioms (1)
  • domain assumption: Responses of connected users follow a conditional autoregressive model whose dependence structure is fully captured by the given undirected network adjacency matrix.
    Invoked when the authors state that responses of two connected users are correlated and include network connection in the model.


