Scalable Online Survey Framework: from Sampling to Analysis

Qian Wang; Rogier Verhulst; Weitao Duan; Ya Xu

arxiv: 1906.10082 · v1 · pith:7KYH7SBFnew · submitted 2019-06-24 · 📊 stat.AP

Scalable Online Survey Framework: from Sampling to Analysis

Weitao Duan , Qian Wang , Rogier Verhulst , Ya Xu This is my paper

Pith reviewed 2026-05-25 16:43 UTC · model grok-4.3

classification 📊 stat.AP

keywords online surveyssamplingrepresentativenessuser experienceA/B testingin-product surveysemail surveysmonitoring tools

0 comments

The pith

Coordinated sampling across multiple surveys collects enough representative responses for each analysis without overloading users.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that growing demand for surveys requires moving beyond simply increasing the number of surveys per user, because that harms experience and still fails to guarantee enough samples per analysis. It presents methods for coordinated sampling in email surveys and demonstrates their application to in-product surveys via a study on two mobile apps. The work further shows that survey responses can function as monitoring instruments when linked to A/B tests. A sympathetic reader would care because behavioral event data alone is too noisy and incomplete for measuring intention or satisfaction, so an efficient survey layer directly improves the quality of product and marketing decisions.

Core claim

The central claim is that a scalable survey framework, built on coordinated sampling rather than independent per-survey recruitment, can deliver sufficient and representative data for every analysis while preserving user experience. The framework is illustrated first through email survey handling under load constraints, then through an in-product survey deployment across mobile apps, and finally through the integration of survey data as a monitoring complement to A/B testing.

What carries the argument

The coordinated sampling mechanism that allocates users across simultaneous surveys to balance response volume and demographic or behavioral representativeness for each separate analysis.

If this is right

Multiple surveys can run in parallel while each still receives enough targeted responses.
In-product surveys can be deployed across separate apps without duplicating user burden.
Survey responses become usable as continuous monitoring signals when aligned with experimental tests.
Analysis pipelines can treat survey data as a reliable complement to behavioral logs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same coordination logic could be extended to dynamically reallocate sampling rates when early response patterns indicate under-coverage in a particular subgroup.
Combining the sampling layer with existing A/B infrastructure might allow survey questions to be inserted into live experiments without separate recruitment.
Over time the framework could reduce overall survey fatigue by spreading participation more evenly across the user base.

Load-bearing premise

Coordinated sampling across surveys can produce a representative sample for every individual analysis without introducing selection bias or coverage gaps.

What would settle it

A direct comparison showing that the user distribution obtained under coordinated sampling for a given survey differs materially in key covariates from the distribution obtained by running that survey independently on the full eligible population.

Figures

Figures reproduced from arXiv: 1906.10082 by Qian Wang, Rogier Verhulst, Weitao Duan, Ya Xu.

**Figure 3.** Figure 3: Sampling probability with respect to member’s [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

read the original abstract

With the advancement in technology, raw event data generated by the digital world have grown tremendously. However, such data tend to be insufficient and noisy when it comes to measuring user intention or satisfaction. One effective way to measure user experience directly is through surveys. In particular, with the popularity of online surveys, extensive work has been put in to study this field. Surveys at LinkedIn play a major role in influencing product and marketing decisions and supporting our sales efforts. We run an increasing number of surveys that help us understand shifts in awareness and perceptions with regards to our own products and also to peer companies. As the need to survey grows, both sampling and analysis of surveys have become more challenging. Instead of simply multiplying the number of surveys each user takes, we need a scalable approach to collect enough and representative samples for each survey analysis while maintaining good user experience. In this paper, we start with discussions on how we handle multiple email surveys under such constraints. We then shift our discussions to challenges of in-product surveys and how we address them at LinkedIn through a survey study conducted across two mobile apps. Finally, we share how in-product surveys can be utilized as monitoring tools and connect surveys with A/B testing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This is a descriptive case study of LinkedIn's survey operations with no new methods, data, or testable claims.

read the letter

This paper is a case study of how LinkedIn runs its internal survey program at scale. It lays out the practical constraints of running many email and in-product surveys without overloading users, then sketches high-level steps they take for sampling and analysis. The sections cover email survey coordination, a mobile app study for in-product surveys, and linking surveys to A/B testing for monitoring. That is the full extent of what is new: real-world workflow notes from one company. The practical descriptions may be useful to other teams facing similar volume and user-experience trade-offs. The paper does not claim or show any reduction in bias, improvement in response rates, or novel sampling algorithm. No equations, code, datasets, or quantitative results appear. The central motivation—that coordinated sampling can keep each analysis representative—is stated but never measured or compared against alternatives. The mobile-app study is mentioned without sample sizes, error rates, or validation details. Because the text supplies no evidence on whether their approach actually works, the claims stay at the level of internal practice rather than research findings. This is for industry practitioners who want to see one large company's survey setup. It does not advance survey methodology or provide material that would support a methods paper. I would not send it for peer review in a statistics or applied-methods venue; the lack of data and analysis makes it unsuitable for that process.

Referee Report

2 major / 1 minor

Summary. The manuscript describes LinkedIn's operational practices for scaling multiple online surveys, covering strategies for sampling across email surveys to balance volume and representativeness, a case study of in-product surveys on two mobile apps, and the integration of in-product surveys as monitoring tools with A/B testing.

Significance. If the described practices are effective, the work could supply useful industry context for applied survey operations at scale. However, as a purely descriptive case study with no quantitative results, formal methods, or validation, its contribution to the statistical literature remains limited.

major comments (2)

[Abstract] Abstract: the central claim that coordinated sampling can collect 'enough and representative samples for each survey analysis' while preserving user experience is presented without any supporting data, sample sizes, bias metrics, or validation results, leaving the claim untested.
[in-product surveys discussion] The section describing the in-product survey study across two mobile apps: no methodological details, response rates, representativeness checks, or outcome metrics are supplied, so it is impossible to assess whether the stated challenges were actually addressed.

minor comments (1)

The transition between email-survey and in-product-survey sections is abrupt; explicit section headings and a short methods overview would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their review and comments on our manuscript describing LinkedIn's scalable online survey framework. We address the major comments below.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim that coordinated sampling can collect 'enough and representative samples for each survey analysis' while preserving user experience is presented without any supporting data, sample sizes, bias metrics, or validation results, leaving the claim untested.

Authors: The paper presents a descriptive case study of operational practices rather than a quantitative empirical study. The central claim is grounded in our implementation experience at LinkedIn, where the coordinated sampling approach has been deployed to manage multiple surveys. We do not provide supporting data, sample sizes, or validation metrics because the focus is on the framework and strategies, and specific internal metrics are confidential. This is consistent with the manuscript's aim to share industry practices for the statistical community. revision: no
Referee: [in-product surveys discussion] The section describing the in-product survey study across two mobile apps: no methodological details, response rates, representativeness checks, or outcome metrics are supplied, so it is impossible to assess whether the stated challenges were actually addressed.

Authors: Similar to the above, the in-product surveys section provides a high-level overview of the challenges and solutions implemented across two mobile apps. Detailed methodological information, response rates, and metrics are not included to protect proprietary information. The section serves to illustrate real-world application and integration with A/B testing, rather than to offer a validated case study with outcome metrics. revision: no

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper is a descriptive industry case study detailing LinkedIn's practical approaches to multi-survey sampling, in-product surveys, and integration with A/B testing. It contains no equations, formal models, fitted parameters, derivations, or falsifiable predictions. All content consists of operational descriptions and case examples without any load-bearing steps that could reduce to self-definition, fitted inputs, or self-citation chains. The derivation chain is empty by construction, making the paper self-contained with no circularity burden.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical model, free parameters, axioms, or invented entities are described in the abstract.

pith-pipeline@v0.9.0 · 5745 in / 908 out tokens · 20628 ms · 2026-05-25T16:43:51.149863+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

3 extracted references · 3 canonical work pages

[1]

In the rNPS pool, determine rNPS eligible members. 3. A fixed percentage of members are then randomly drawn from the rNPS eligible group. This guarantees no interference between rNPS and other email surveys. If we do not enforce the 90-day same survey cool-off constraints, every member has an equal chance of being selected for rNPS. However, with the cool...

work page
[2]

Voyager”) as the successor for the old app (codename “Titan

Assign the first bucket to rNPS surveys. Set aside its cool-off group. Similarly, the remaining buckets are for MoT surveys, followed by MoT cool-off group. The cool-off group spans no less than 30 buckets according to the cool-off constraints. Refer to the green circle in Figure 1 Left for the four groups. 3. On Day 1, send rNPS surveys to all eligible m...

work page 2015
[3]

Researching Internet-Based Populations: Advantages and Disadvantages of Online Survey Research, Online Questionnaire Authoring Software Packages, and Web Survey Services

Resnick, R. M., Comparison of postal and online surveys: Cost, speed, response rates and reliability. 2012 Education Market Research and MCH Strategic Data, 2012. [15] K. Wright, "Researching Internet-Based Populations: Advantages and Disadvantages of Online Survey Research, Online Questionnaire Authoring Software Packages, and Web Survey Services", Journ...

work page 2012

[1] [1]

In the rNPS pool, determine rNPS eligible members. 3. A fixed percentage of members are then randomly drawn from the rNPS eligible group. This guarantees no interference between rNPS and other email surveys. If we do not enforce the 90-day same survey cool-off constraints, every member has an equal chance of being selected for rNPS. However, with the cool...

work page

[2] [2]

Voyager”) as the successor for the old app (codename “Titan

Assign the first bucket to rNPS surveys. Set aside its cool-off group. Similarly, the remaining buckets are for MoT surveys, followed by MoT cool-off group. The cool-off group spans no less than 30 buckets according to the cool-off constraints. Refer to the green circle in Figure 1 Left for the four groups. 3. On Day 1, send rNPS surveys to all eligible m...

work page 2015

[3] [3]

Researching Internet-Based Populations: Advantages and Disadvantages of Online Survey Research, Online Questionnaire Authoring Software Packages, and Web Survey Services

Resnick, R. M., Comparison of postal and online surveys: Cost, speed, response rates and reliability. 2012 Education Market Research and MCH Strategic Data, 2012. [15] K. Wright, "Researching Internet-Based Populations: Advantages and Disadvantages of Online Survey Research, Online Questionnaire Authoring Software Packages, and Web Survey Services", Journ...

work page 2012