Inference from multivariate differential recruitment in respondent-driven sampling data

Danilo Alvares; Isabelle S. Beaudry; Jonathan Acosta; Vanesa Reinoso

arxiv: 2604.10018 · v1 · submitted 2026-04-11 · 📊 stat.ME

Pith reviewed 2026-05-10 16:25 UTC · model grok-4.3

classification 📊 stat.ME

keywords respondent-driven sampling · differential recruitment · multivariate covariates · Markov process · prevalence estimation · hidden populations · bootstrap variance · chain-referral sampling

The pith

Respondent-driven sampling inference can now adjust for multiple simultaneous covariates in recruitment behavior.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a framework called Multivariate Differential Recruitment that treats RDS recruitment as a Markov process whose transition probabilities are shaped by any number of observed covariates on nodes or ties. Standard prevalence estimators are then rewritten inside this model, and a modified neighborhood bootstrap supplies variance estimates. Simulations test the approach across varied network sizes, recruitment rates, and covariate types, while a real application to Venezuelan migrants in Chile shows how the adjustments change population estimates. A sympathetic reader would care because RDS is widely used for hidden populations in public health, and ignoring multivariate recruitment preferences has long introduced uncontrolled bias in prevalence figures.

Core claim

We model RDS as a Markov process with transition probabilities that depend on continuous or categorical variables associated with nodes or their ties. We then extend various prevalence estimators to this multivariate framework and implement a slightly modified neighborhood bootstrap for variance estimation.

What carries the argument

Multivariate Differential Recruitment (MDR) as a first-order Markov process whose transition probabilities are fully determined by the observed multivariate covariates.
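To make the idea concrete, here is a minimal Python sketch of covariate-dependent transition probabilities. The multinomial-logit form, the absolute-difference tie features, the function name `transition_probs`, and the coefficient vector `beta` are all illustrative assumptions; this summary does not state the paper's actual parameterization.

```python
import numpy as np

def transition_probs(recruiter_x, neighbor_x, beta):
    """Probability that a recruiter passes a coupon to each of its neighbors.

    Hypothetical form: a multinomial-logit score built from the absolute
    covariate difference between recruiter and neighbor (a tie-level feature).
    """
    recruiter_x = np.asarray(recruiter_x, dtype=float)
    neighbor_x = np.asarray(neighbor_x, dtype=float)
    # Tie features: one row per neighbor, one column per covariate.
    tie_features = np.abs(neighbor_x - recruiter_x)
    scores = tie_features @ np.asarray(beta, dtype=float)
    scores -= scores.max()  # stabilise the exponentials
    weights = np.exp(scores)
    return weights / weights.sum()

# Recruiter with (sex=1, age=30) and three neighbors (made-up values).
recruiter = [1.0, 30.0]
neighbors = [[1.0, 32.0], [0.0, 31.0], [1.0, 55.0]]
beta = [-1.0, -0.05]  # negative coefficients favour similar peers

p = transition_probs(recruiter, neighbors, beta)
print(p.round(3))

# With beta = 0 every neighbor is equally likely (random recruitment).
p0 = transition_probs(recruiter, neighbors, [0.0, 0.0])
print(p0.round(3))
```

Setting all coefficients to zero recovers the random-recruitment assumption of the early RDS literature, which is one way to see the multivariate model as a generalization of it.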

Load-bearing premise

The recruitment process is adequately captured by a first-order Markov model whose transition probabilities are fully determined by the observed multivariate covariates, without substantial unmeasured network structure or higher-order dependencies.

What would settle it

Generate RDS data from networks that include unmeasured homophily or second-order recruitment rules, apply the MDR estimators, and check whether the resulting prevalence estimates remain unbiased relative to the known true values.
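The check described above can be sketched in a few lines of Python. This is a toy version under stated assumptions: the network is a two-block random graph with homophily on an unmeasured trait `u`, recruitment is a with-replacement walk that prefers peers with `u == 1`, and the Volz-Heckathorn (RDS-II) estimator stands in for the MDR estimators, which are not reproduced here. All names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Population: hidden trait u drives both tie formation and the outcome y.
n = 400
u = rng.integers(0, 2, size=n)
y = rng.random(n) < np.where(u == 1, 0.6, 0.2)

# Undirected network with homophily on the unmeasured trait u.
p_tie = np.where(u[:, None] == u[None, :], 0.08, 0.01)
upper = np.triu(rng.random((n, n)) < p_tie, k=1)
adj = upper | upper.T
degree = adj.sum(axis=1)

def walk(start, steps, pref):
    """With-replacement recruitment walk; pref > 1 favours peers with u == 1."""
    path, node = [], start
    for _ in range(steps):
        nbrs = np.flatnonzero(adj[node])
        w = np.where(u[nbrs] == 1, pref, 1.0)
        node = rng.choice(nbrs, p=w / w.sum())
        path.append(node)
    return np.array(path)

def rds_ii(sample):
    """RDS-II prevalence estimate with inverse-degree weights."""
    w = 1.0 / degree[sample]
    return float(np.sum(w * y[sample]) / np.sum(w))

start = int(np.argmax(degree))  # seed the walk at a well-connected node
truth = float(y.mean())
results = {}
for pref in (1.0, 3.0):
    est = [rds_ii(walk(start, 300, pref)) for _ in range(50)]
    results[pref] = float(np.mean(est))
    print(f"pref={pref}: truth={truth:.3f}, mean estimate={results[pref]:.3f}")
```

Comparing the two rows shows how far an estimator that ignores the hidden preference drifts from the known prevalence. The same harness could be pointed at any estimator that takes a sample of node indices.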

Figures

Figures reproduced from arXiv: 2604.10018 by Danilo Alvares, Isabelle S. Beaudry, Jonathan Acosta, Vanesa Reinoso.

**Figure 2.** Estimation error for each estimator in different scenario configurations. Rows correspond to homophily levels …

**Figure 3.** 95% confidence interval coverage by estimator in different scenario configurations. Rows represent the …

**Figure 4.** Tree of the RDS sampling process; the non-male nodes are colored with dark gray and the males with light …

Original abstract

Respondent-Driven Sampling (RDS) is a chain-referral design used for collecting data from hidden or hard-to-reach populations through their social networks. In RDS, respondents recruit their peers from the population of interest. As such, inference with RDS data commonly relies on estimated sampling probabilities derived from specific recruitment assumptions. Early literature assumes random recruitment, which is often unrealistic because individuals may recruit based on their personal preferences. This behavior is known as Differential Recruitment (DR). Recent works have incorporated univariate categorical DR in the estimation procedures. The main objective of this paper is to introduce Multivariate Differential Recruitment (MDR), a framework that incorporates multiple simultaneous covariates, both categorical and continuous, into the sampling representation. We model RDS as a Markov process with transition probabilities that depend on continuous or categorical variables associated with nodes or their ties. We then extend various prevalence estimators to this multivariate framework and implement a slightly modified neighborhood bootstrap for variance estimation. The proposed methodology is assessed through simulation studies for a range of network and sampling features. It is applied to an RDS study conducted among the adult Venezuelan population living in the Metropolitan Region of Santiago, Chile.
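The link between recruitment assumptions and sampling probabilities can be illustrated with a small Python sketch. It treats recruitment as a Markov chain on a toy four-node network and computes the chain's stationary distribution, whose inverse is a common choice of sampling weight in the RDS literature. The network, the covariate `x`, and the threefold preference are made-up values, and the sketch is not the paper's estimator.

```python
import numpy as np

def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Stationary distribution of a row-stochastic matrix by power iteration."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(max_iter):
        nxt = pi @ P
        if np.abs(nxt - pi).max() < tol:
            return nxt
        pi = nxt
    return pi

# Toy undirected network on 4 nodes (symmetric adjacency matrix).
adj = np.array([[0, 1, 1, 1],
                [1, 0, 1, 0],
                [1, 1, 0, 0],
                [1, 0, 0, 0]], dtype=float)
degree = adj.sum(axis=1)

# Random recruitment: each neighbor is equally likely.
P_random = adj / degree[:, None]
pi_random = stationary_distribution(P_random)

# Differential recruitment (assumed form): ties to nodes with x == 1 are
# three times as attractive as ties to nodes with x == 0.
x = np.array([1, 0, 1, 0])
attract = adj * np.where(x == 1, 3.0, 1.0)[None, :]
P_dr = attract / attract.sum(axis=1, keepdims=True)
pi_dr = stationary_distribution(P_dr)

print("random recruitment:      ", pi_random.round(3))
print("degree share:            ", (degree / degree.sum()).round(3))
print("differential recruitment:", pi_dr.round(3))
```

Under random recruitment the stationary distribution is proportional to degree, which is why inverse-degree weights appear in RDS-II-type estimators. Once recruitment depends on covariates, the stationary distribution shifts and degree alone no longer gives the right weights.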

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

Extends differential recruitment in RDS to multiple covariates including continuous ones, but the whole thing rests on a first-order Markov model that may miss unmeasured network effects.

The paper introduces multivariate differential recruitment for respondent-driven sampling by treating the process as a Markov chain whose transition probabilities depend on several covariates at once, both categorical and continuous. The authors then adjust the usual prevalence estimators and add a modified neighborhood bootstrap for variance. Simulations check performance across network and sampling setups, and the method is applied to an RDS study of Venezuelan adults in Santiago, Chile. This moves past the univariate categorical versions in earlier work and gives a usable framework for more realistic recruitment modeling. The real-data example is a plus for showing how the pieces fit together in practice.

The central assumption is that observed covariates fully determine the recruitment transitions under a first-order Markov structure. If unmeasured attributes or longer-range dependencies matter, the transition estimates will be wrong and the adjusted prevalence numbers will stay biased. The simulations are said to cover a range of features, but they are unlikely to catch this kind of misspecification if the data are generated from the same model. Without the full tables it is hard to judge how large the practical gains are.

This is for survey statisticians and epidemiologists who already work with RDS data from hidden populations and need to handle richer covariate information. A methods-focused reading group could discuss the Markov step and the bootstrap tweak. It deserves a serious referee because the extension addresses a known practical limitation with a clear implementation path, even if the robustness checks need strengthening.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces a multivariate differential recruitment (MDR) framework for respondent-driven sampling (RDS) data. It models the recruitment process as a first-order Markov chain with transition probabilities that are functions of multiple covariates (categorical or continuous) associated with nodes or ties. Standard prevalence estimators are extended to this setting, and a modified neighborhood bootstrap is proposed for variance estimation. The method is evaluated in simulation studies covering various network and sampling features and demonstrated on an RDS survey of Venezuelan adults in Santiago, Chile.

Significance. If the first-order Markov assumption holds and the observed covariates sufficiently capture recruitment preferences without substantial residual network effects, this framework offers a meaningful extension beyond univariate categorical differential recruitment methods by accommodating simultaneous multivariate influences. The simulation studies across network features and the real-data application to the Chilean Venezuelan population provide practical validation, while the modified neighborhood bootstrap addresses a key implementation need for variance estimation in the extended model.

major comments (2)

[Simulation studies] Simulation studies section: the reported simulations generate data from the assumed first-order Markov model with covariate-dependent transitions; this setup cannot detect bias arising from unmeasured network structure or higher-order dependencies, which directly undermines the central claim that the extended prevalence estimators remain valid under realistic MDR.
[Methods] Methods, Markov process modeling: the transition probabilities are stated to depend on the observed multivariate covariates, but no diagnostic or sensitivity analysis is provided for residual dependence after conditioning on these covariates; this assumption is load-bearing for the subsequent derivation of adjusted sampling weights and the extensions of RDS-I/RDS-II-type estimators.

minor comments (2)

[Abstract] Abstract: the phrase 'extend various prevalence estimators' should explicitly name the estimators (e.g., RDS-I, RDS-II, or others) being generalized to the MDR setting.
[Methods] Notation: the manuscript should clarify whether the transition probability parameters are estimated jointly with the prevalence parameters or in a two-step procedure, as this affects the bootstrap implementation.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive review and for recognizing the potential of the multivariate differential recruitment framework. We address each major comment below and outline the revisions we will make to the manuscript.

Referee: [Simulation studies] Simulation studies section: the reported simulations generate data from the assumed first-order Markov model with covariate-dependent transitions; this setup cannot detect bias arising from unmeasured network structure or higher-order dependencies, which directly undermines the central claim that the extended prevalence estimators remain valid under realistic MDR.

Authors: We agree that the simulations evaluate estimator performance under the first-order Markov data-generating process with covariate-dependent transitions. This verifies the derivations when the modeling assumptions hold, which is the primary scope of the proposed MDR framework. We acknowledge that the current design does not probe robustness to higher-order dependencies or unmeasured network structure. In the revised manuscript we will expand the simulation section with additional experiments that generate recruitment chains from networks exhibiting residual dependence or higher-order Markov structure not captured by the observed covariates. We will also add explicit discussion clarifying that the validity of the extended prevalence estimators is conditional on the first-order MDR assumption and note the need for future robustness checks under more complex network processes. revision: yes
Referee: [Methods] Methods, Markov process modeling: the transition probabilities are stated to depend on the observed multivariate covariates, but no diagnostic or sensitivity analysis is provided for residual dependence after conditioning on these covariates; this assumption is load-bearing for the subsequent derivation of adjusted sampling weights and the extensions of RDS-I/RDS-II-type estimators.

Authors: The referee is correct that the assumption of no residual dependence after conditioning on the observed covariates is central to the transition model and to the subsequent weight derivations. The current manuscript does not include formal diagnostics or sensitivity analyses for this assumption. We will add a dedicated subsection on model assessment that proposes practical checks for residual dependence (for example, examining autocorrelation patterns in the recruitment chains after covariate adjustment) and outlines sensitivity analyses obtained by successively omitting or adding covariates. These additions will allow users to evaluate the assumption in applied settings and will be accompanied by guidance on interpreting results when the assumption may be only approximately satisfied. revision: yes
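One way to read the proposed autocorrelation check is as a lag-1 residual correlation between recruiters and their recruits. The Python sketch below is a hypothetical illustration of that reading, not the authors' diagnostic: it adjusts the outcome for observed covariates with ordinary least squares and then correlates each recruit's residual with the recruiter's. The function name, the linear adjustment, and the simulated chain are all assumptions.

```python
import numpy as np

def lag1_residual_correlation(y, X, recruiter):
    """Correlation between recruiter and recruit residuals after adjusting y for X.

    recruiter[i] is the row index of the person who recruited i, or -1 for seeds.
    """
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    recruiter = np.asarray(recruiter)
    recruits = np.flatnonzero(recruiter >= 0)  # drop seeds, which have no recruiter
    return float(np.corrcoef(resid[recruiter[recruits]], resid[recruits])[0, 1])

# Toy single chain 0 -> 1 -> 2 -> ... with one observed covariate.
rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
hidden = 0.3 * np.cumsum(rng.normal(size=n))  # unmeasured trait drifting along the chain
y = 0.5 * x + hidden + rng.normal(scale=0.5, size=n)
recruiter = np.arange(-1, n - 1)  # person i was recruited by person i - 1

r = lag1_residual_correlation(y, x, recruiter)
print(f"lag-1 residual correlation: {r:.2f}")
```

A correlation near zero is consistent with the observed covariates absorbing the dependence between recruiter and recruit. A clearly positive value, as in this simulated chain with a drifting hidden trait, suggests residual structure that the first-order covariate model does not capture.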

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper models RDS recruitment as a first-order Markov process whose transition probabilities are functions of observed multivariate covariates (categorical or continuous), then extends standard prevalence estimators (RDS-I, RDS-II and variants) to this setting and applies a modified neighborhood bootstrap. These steps are presented as direct extensions of existing RDS literature and Markov assumptions rather than reductions of any claimed result to quantities defined solely by the paper's own fitted parameters or self-citations. No equations are shown to be equivalent by construction, no fitted input is relabeled as an independent prediction, and no load-bearing uniqueness theorem or ansatz is imported from the authors' prior work. Simulations and the Chile application serve as external checks rather than the derivation itself. The framework therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on modeling RDS recruitment as a covariate-dependent Markov process and extending existing estimators; this introduces parameters for the transition probabilities that must be estimated from data.

free parameters (1)

covariate-dependent transition probability parameters
Parameters governing how multiple covariates influence recruitment probabilities; these are estimated within the model.

axioms (1)

domain assumption: RDS data can be represented as a Markov process on the network with transitions depending on node and tie covariates
Core modeling choice stated in the abstract; standard in RDS but extended here to multivariate case.

[1] [1]

Arayasirikul, S., Cai, X., and Wilson, E. C. (2015). A qualitative examination of respondent-driven sampling (RDS): Peer referral challenges among young transwomen in the San Francisco bay area.JMIR Public Health and Surveillance, 1(2):e9

work page 2015

[2] [2]

D., Morris, M

Assaf, R. D., Morris, M. D., Straus, E. R., Martinez, P., Philbin, M. M., and Kushel, M. (2025). Illicit substance use and treatment access among adults experiencing homelessness.JAMA, 333(14):1222–1231

work page 2025

[3] [3]

and Rotondi, M

Avery, L. and Rotondi, M. (2023). Evaluation of respondent-driven sampling prevalence estimators using real-world reported network degree.Sociological Methodology, 53(2):269–287

work page 2023

[4] [4]

R., Muleia, R., Nuvunga, S., Boothe, M., and Baltazar, C

Banze, A. R., Muleia, R., Nuvunga, S., Boothe, M., and Baltazar, C. S. (2024). Trends in HIV prevalence and risk factors among men who have sex with men in Mozambique: Implications for targeted interventions and public health strategies.BMC Public Health, 24(1):1185

work page 2024

[5] [5]

Barash, V ., Cameron, C., and Heckathorn, D. (2016). Respondent-driven sampling: Testing assumptions.Journal of Official Statistics, 32(1):29–73

work page 2016

[6] [6]

Beaudry, I. S. and Gile, K. J. (2020). Correcting for differential recruitment in respondent-driven sampling data using ego-network information.Electronic Journal of Statistics, 14(2):2678–2713

work page 2020

[7] [7]

O., and Pin, P

Currarini, S., Jackson, M. O., and Pin, P. (2009). An economic model of friendship: Homophily, minorities, and segregation.Econometrica, 77(4):1003–1045

work page 2009

[8] [8]

and Forsé, M

Degenne, A. and Forsé, M. (1999).Introducing social networks. Sage Publications, London

work page 1999

[9] [9]

Fellows, I. E. (2019). Respondent-driven sampling and the homophily configuration graph.Statistics in Medicine, 38(1):131–150

work page 2019

[10] [10]

Fellows, I. E. (2022). On the robustness of respondent-driven sampling estimators to measurement error.Journal of Survey Statistics and Methodology, 10(2):377–396. Fonseca de Barros, B., Fynn, I., Nocetto, L., Beaudry, I., Luna, J. P., Piñeiro, R., and Rosenblatt Rodríguez, F. (2024). How parties take advantage of immigrant waves. Political incorporation ...

work page 2022

[11] [11]

and Strauss, D

Frank, O. and Strauss, D. (1986). Markov graphs.Journal of the American Statistical Association, 81(395):832–842

work page 1986

[12] [12]

Gile, K., Beaudry, I., Handcock, M., and Ott, M. (2018). Methods for inference from respondent-driven sampling data. Annual Review of Statistics and Its Application, 5:65–93

work page 2018

[13] [13]

Gile, K. J. and Handcock, M. S. (2010). Respondent-driven sampling: An assessment of current methodology. Sociological Methodology, 40(1):285–327

work page 2010

[14] [14]

J., Johnston, L

Gile, K. J., Johnston, L. G., and Salganik, M. J. (2015). Diagnostics for respondent-driven sampling.Journal of the Royal Statistical Society: Series A (Statistics in Society), 178(1):241–269

work page 2015

[15] [15]

and Salganik, M

Goel, S. and Salganik, M. J. (2009). Respondent-driven sampling as Markov chain Monte Carlo.Statistics in Medicine, 28(17):2202–2229

work page 2009

[16] [16]

Hansen, M. H. and Hurwitz, W. N. (1943). On the theory of sampling from finite populations.The Annals of Mathematical Statistics, 14(4):333–362

work page 1943

[17] [17]

Heckathorn, D. (1997). Respondent-driven sampling: A new approach to the study of hidden populations.Social Problems, 44(2):174–199

work page 1997

[18] [18]

Heckathorn, D. D. (2002). Respondent-driven sampling II: Deriving valid population estimates from chain-referral samples of hidden populations.Social Problems, 49(1):11–34

work page 2002

[19] [19]

Heckathorn, D. D. (2007). Extensions of respondent-driven sampling: Analyzing continuous variables and controlling for differential recruitment.Sociological Methodology, 37(1):151–208

work page 2007

[20] [20]

Heckathorn, D. D. (2011). Snowball versus respondent-driven sampling.Sociological Methodology, 41(1):355–366

work page 2011

[21] [21]

D., Semaan, S., Broadhead, R

Heckathorn, D. D., Semaan, S., Broadhead, R. S., and Hughes, J. J. (2002). Extensions of respondent-driven sampling: A new approach to the study of injection drug users aged 18-25.AIDS and Behavior, 6(1):55–67. 20

work page 2002

[22] [22]

Hunter, D. R. and Handcock, M. S. (2006). Inference in curved exponential family models for networks.Journal of Computational and Graphical Statistics, 15(3):565–583

work page 2006

[23] [23]

D., Ouedraogo, R., Kakesa, J., and Fetters, T

Jayaweera, R., Odhoch, L., Nabunje, J., Oduor, C., Zuniga, C., Powell, B., Barasa, W., Aber, F., Nyalwal, B., Wado, Y . D., Ouedraogo, R., Kakesa, J., and Fetters, T. (2025). Incidence and safety of abortion in two humanitarian settings in Uganda and Kenya: A respondent-driven sampling study.eClinicalMedicine, 83:103200

work page 2025

[24] [24]

G., Malekinejad, M., Kendall, C., Iuppa, I

Johnston, L. G., Malekinejad, M., Kendall, C., Iuppa, I. M., and Rutherford, G. W. (2008). Implementation challenges to using respondent-driven sampling methodology for HIV biological and behavioral surveillance: Field experiences in international settings.AIDS and Behavior, 12(4):S131–S141

work page 2008

[25] [25]

Johnston, L. G. and Sabin, K. (2010). Sampling hard-to-reach populations with respondent driven sampling.Method- ological Innovations Online, 5(2):38–48

work page 2010

[26] [26]

A., Wejnert, C., Hall, D

Lansky, A., Abdul-Quader, L. A., Wejnert, C., Hall, D. R., Finlayson, D. M., Garfein, L. A., and Sullivan, P. S. (2007). Developing an HIV behavioral surveillance system for injecting drug users: The national HIV behavioral surveillance system.Public Health Reports, 122(Suppl 1):48–55

work page 2007

[27] [27]

C., Carvalho, T

Leal, M. C., Carvalho, T. D. G., Santos, Y . R. P., Queiroz, R. S. B., Fonseca, P. A. M., Silva, A. A. M., Szwarcwald, C. L., and Riggirozzi, P. (2025). Determinants of self-rated health among Venezuelan migrant women in Brazil: A cross-sectional study.The Lancet Regional Health - Americas, 45:101077

Li, J., Valente, T. W., Shin, H.-S., Weeks, M., Zelenev, A., Moothi, G., Mosher, H., Heimer, R., Robles, E., Palmer, G., and Obidoa, C. (2018). Overlooked threats to respondent driven sampling estimators: Peer recruitment reality, degree measures, and random selection assumption. AIDS and Behavior, 22(7):2340–2359.

Liu, H., Li, J., Ha, T., and Li, J. (2012). Assessment of random recruitment assumption in respondent-driven sampling in egocentric network data. Social Networking, 1(2):13–21.

Lu, X. (2013). Linked ego networks: Improving estimate reliability and validity with respondent-driven sampling. Social Networks, 35:669–685.

Lusher, D., Koskinen, J., and Robins, G., editors (2013). Exponential random graph models for social networks: Theory, methods and applications. Cambridge University Press, Cambridge, UK.

Magnani, R., Sabin, K., Saidel, T., and Heckathorn, D. D. (2005). Review of sampling hard-to-reach and hidden populations for HIV surveillance. AIDS, 19(Suppl. 2):S67–S72.

McCreesh, N., Frost, S. D. W., Seeley, J., Katongole, J., Tarsh, M. N., Ndunguse, R., Jichi, F., Lunel, N. L., and Maher, D. (2012). Evaluation of respondent-driven sampling. Epidemiology, 23(1):138–147.

McPherson, M., Smith-Lovin, L., and Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27:415–444.

R Core Team (2025). R: A language and environment for statistical computing. R Foundation for Statistical Computing,

Roch, S. and Rohe, K. (2018). Generalized least squares can overcome the critical threshold in respondent-driven sampling. Proceedings of the National Academy of Sciences, 115(41):10299–10304.

Rudolph, A. E., Nance, R. M., Bobashev, G., Brook, D., Akhtar, W., Cook, R., Cooper, H. L., Friedmann, P. D., Frost, S. D. W., Go, V. F., Jenkins, W. D., Korthuis, P. T., Miller, W. C., Pho, M. T., Ruderman, S. A., Seal, D. W., Stopka, T. J., Westergaard, R. P., Young, A. M., Zule, W. A., and Tsui, J. I. (2024). Evaluation of respondent-driven sampling i...

Salganik, M. (2006). Variance estimation, design effects, and sample size calculations for respondent-driven sampling. Journal of Urban Health, 83(7):98–112.

Salganik, M. and Heckathorn, D. (2004). Sampling and estimation in hidden populations using respondent-driven sampling. Sociological Methodology, 34(1):193–240.

Shi, Y., Cameron, C., and Heckathorn, D. (2019). Model-based and design-based inference: Reducing bias due to differential recruitment in respondent-driven sampling. Sociological Methods & Research, 48(1):3–33.

Takahashi, Y., Song, J., and Iida, T. (2025). Transnational political participation of undocumented Mexican immigrants in the US: Respondent-driven sampling with the hard-to-reach population. The Journal of Race, Ethnicity, and Politics, pages 1–26.

Tomas, A. and Gile, K. J. (2011). The effect of differential recruitment, non-response and non-recruitment on estimators for respondent-driven sampling. Electronic Journal of Statistics, 5:899–934.

Tourangeau, R., Edwards, B., and Johnson, T. (2014). Understanding respondent-driven sampling from a total survey error perspective. Survey Practice, 7(2):1–6.

Verdery, A. M., Merli, M. G., Moody, J., Smith, J. A., and Fisher, J. C. (2015). Brief report: Respondent-driven sampling estimators under real and theoretical recruitment conditions of female sex workers in China. Epidemiology, 26(5):661–665.

Volz, E. and Heckathorn, D. (2008). Probability based estimation theory for respondent driven sampling. Journal of...

Wang, P., Wei, C., McFarland, W., and Raymond, H. F. (2024). The development and the assessment of sampling methods for hard-to-reach populations in HIV surveillance. Journal of Urban Health, 101(4):856–866.

Wirtz, A. L., Iyer, J., Brooks, D., Hailey-Fair, K., Galai, N., Beyrer, C., Celentano, D., and Arrington-Sanders, R. (2021). An evaluation of assumptions underlying respondent-driven sampling and the social contexts of sexual and gender minority youth participating in HIV clinical trials in the United States. Journal of the International AIDS Society, 24(5):e25694.

Yamanis, T. J., Merli, M. G., Neely, W. W., Tian, F. F., Moody, J., Tu, X., and Gao, E. (2013). An empirical analysis of the impact of recruitment patterns on RDS estimates among a socially ordered population of female sex workers in China. Sociological Methods & Research, 42(3):392–425.

Yauck, M., Moodie, E. E. M., Apelian, H., Fourmigue, A., Grace, D., Hart, T. A., Lambert, G., and Cox, J. (2022). Neighborhood bootstrap for respondent-driven sampling. Journal of Survey Statistics and Methodology, 10(2):419–438.

A Appendix

Table 8: Standard deviation (SD) across estimators and scenarios.

Scenario =(τ, ϕ MDR) Estimator ˆµII V H ˆµII DR ˆµI...
