Better Measurement or Larger Samples? Data Collection for Policy Learning with Unobserved Heterogeneity

Giacomo Opocher

arxiv: 2604.07181 · v2 · submitted 2026-04-08 · 💰 econ.EM

Pith reviewed 2026-05-10 18:08 UTC · model grok-4.3

classification 💰 econ.EM

keywords: policy learning, unobserved heterogeneity, minimax regret, data collection, latent traits, targeted policies, welfare maximization

The pith

The precision of latent trait estimates governs the worst-case regret of targeted policies, and optimal data collection balances measurement quality with sample size.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes how the accuracy with which a policymaker can estimate unobserved individual differences shapes the reliability of treatment assignment rules. It provides rate-sharp bounds on the worst-case regret of policies that use those estimates and identifies data collection strategies that are minimax optimal. The practical payoff is guidance on whether to invest in better proxies or in more participants: in the paper's cash transfer application, adding a proxy for business skills to the targeting rule raises welfare by 5% and halves the probability of welfare losses.

Core claim

The precision of latent trait estimates governs policy performance, as quantified by rate-sharp regret bounds on assignment rules. A collection plan is minimax optimal if it satisfies a sufficient condition that trades off precision against sample size in a way that depends on the complexity of the policy space.
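
For readers new to the terminology, the block below spells out the standard objects behind this claim. The notation is an assumption, chosen to match the appendix fragments quoted in the reference graph ($W$ for welfare, $\hat G$ for the estimated rule, $\mathcal{P}(\sigma_0)$ for the class of distributions); $G^{*}$ denotes the benchmark rule and $r_n$ a generic rate, and the paper's own definitions may differ in detail.

```latex
% Regret of an estimated rule \hat G relative to a benchmark rule G^*
R(\hat G) = W(G^{*}) - \mathbb{E}_{P}\big[ W(\hat G) \big]

% Minimax risk: best achievable worst-case regret over the class \mathcal{P}(\sigma_0)
R_n := \inf_{\hat G} \; \sup_{P \in \mathcal{P}(\sigma_0)} R(\hat G)

% "Rate-sharp": upper and lower bounds on R_n share the same rate r_n
c \, r_n \;\le\; R_n \;\le\; C \, r_n, \qquad 0 < c \le C < \infty
```

Read this way, a collection plan is minimax optimal when the rule learned from its data attains $R_n$ up to constants.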

What carries the argument

The rate-sharp regret bounds for assignment rules including or excluding latent trait estimates, which quantify the impact of estimate precision and enable the derivation of minimax optimal data collection conditions.

If this is right

Targeted policies using estimated latent traits have lower worst-case regret than observable-only policies when precision is sufficient.
Minimax optimal data collection may prioritize repeated measurements to improve precision over expanding sample size for complex policy spaces.
In the development economics application, incorporating the skill proxy boosts welfare by 5% and halves the risk of welfare losses.
The optimal resource allocation between measurement precision and sample size can be estimated from the regret bounds.
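
The last implication can be illustrated with a minimal sketch. The paper's actual regret bound is not reproduced on this page, so the code assumes a stylized bound of the form `c_est * sqrt(v / n) + c_meas * (b0 + m0 / sqrt(t))`, where `n` is the sample size, `t` the number of repeated measurements per unit, and `v` a complexity measure of the policy class. The cost model (a fixed cost per unit plus a cost per measurement), the function names, and all constants are hypothetical.

```python
import math


def stylized_bound(n, t, v=5.0, c_est=1.0, c_meas=1.0, b0=0.0, m0=1.0):
    """Stylized worst-case regret bound (an assumption, not the paper's formula)."""
    estimation = c_est * math.sqrt(v / n)            # shrinks with sample size
    measurement = c_meas * (b0 + m0 / math.sqrt(t))  # shrinks with repeated measurements
    return estimation + measurement


def best_allocation(budget, unit_cost=1.0, meas_cost=0.2, max_t=50, **kw):
    """Grid-search t; spend the remaining budget on as many units n as it allows."""
    best = None
    for t in range(1, max_t + 1):
        n = int(budget // (unit_cost + meas_cost * t))
        if n < 1:
            break  # budget cannot fund even one unit at this t
        value = stylized_bound(n, t, **kw)
        if best is None or value < best[2]:
            best = (n, t, value)
    return best


if __name__ == "__main__":
    for v in (1.0, 5.0, 25.0):
        n, t, value = best_allocation(budget=1000.0, v=v)
        print(f"v={v:>5}: n={n}, t={t}, bound={value:.4f}")
```

Varying `v` shows how the preferred split between `n` and `t` moves under this stylized form. The direction of that movement depends on the assumed bound and need not match the paper's result.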

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could extend to policy learning in education or labor markets where similar unobserved heterogeneity affects treatment responses.
A testable prediction is that, in new empirical settings, welfare gains scale with precision improvements at the rate implied by the derived bounds.
Policymakers in other domains might use the sufficient condition to decide on survey design or administrative data collection strategies.

Load-bearing premise

The policymaker can control the precision of latent trait estimates through repeated measurements or proxies and can choose data collection plans to achieve the minimax optimality condition.
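
As a minimal sketch of why repeated measurements can buy precision, the simulation below assumes classical measurement error: each measurement equals the true trait plus independent, mean-zero Gaussian noise, and the estimate is the average of `t` measurements. Under that assumption, the error standard deviation shrinks like `sigma / sqrt(t)`, and a sign-based assignment rule disagrees with the oracle less often. The function name, the sign rule, and all parameter values are illustrative and not taken from the paper.

```python
import random
import statistics


def simulate(t, n=20000, sigma=1.0, seed=0):
    """Return (error_sd, misclassification_rate) when averaging t noisy measurements."""
    rng = random.Random(seed)
    errors = []
    wrong = 0
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)  # true latent trait (hypothetical standard normal)
        noise = sum(rng.gauss(0.0, sigma) for _ in range(t)) / t
        a_hat = a + noise        # average of t classical-error measurements
        errors.append(a_hat - a)
        wrong += (a_hat >= 0) != (a >= 0)  # sign rule disagrees with the oracle
    return statistics.pstdev(errors), wrong / n


if __name__ == "__main__":
    for t in (1, 2, 4, 8):
        sd, miss = simulate(t)
        print(f"t={t}: error sd={sd:.3f}, misclassification rate={miss:.3f}")
```

If the noise were correlated across measurements or with the trait itself, averaging would not deliver the `1 / sqrt(t)` gain, which is the kind of violation this premise rules out.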

What would settle it

Evidence against the practical applicability of the bounds would come from a replication in which including the proxy in cash transfer targeting does not raise welfare by 5% or does not halve the probability of losses, or from finding that the estimated optimal allocation between repeated measurements and sample size fails the minimax condition.

Figures

Figures reproduced from arXiv: 2604.07181 by Giacomo Opocher.

Original abstract

Empirical research shows that individuals' responses to treatments vary along latent characteristics, such as innate ability or motivation. Therefore, a policymaker seeking to maximize welfare may consider designing policies based on observed characteristics and estimated latent traits. I characterize how the estimates' precision affects the worst-case performance of policies deriving rate-sharp regret bounds for assignment rules that include or exclude them, highlighting new trade-offs with the policy space complexity. I then study how a policymaker can solve such trade-offs by designing tailored data collections and derive a sufficient condition for a collection plan to be minimax optimal. In an empirical application in development economics, I show that including a proxy for entrepreneurs' business skills in targeting cash transfers increases welfare by 5%, and halves the probability of generating welfare losses. Moreover, I estimate the optimal allocation of resources between improving the precision of the proxy via repeated measurements, and increasing sample size.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper derives rate-sharp regret bounds that tie the precision of latent trait estimates directly to worst-case policy performance and gives a sufficient condition for when a data collection plan is minimax optimal.

The main thing here is the explicit link between how precisely you estimate unobserved heterogeneity and the regret of the resulting assignment rule. By making that connection rate-sharp and showing how policy space complexity interacts with it, the work extends standard policy learning results in a way that matches a real design choice researchers face. The sufficient condition for minimax optimality then tells you when to favor repeated measurements over more observations, which is the practical payoff.

The empirical section applies this to cash transfer targeting in development data and reports a 5% welfare lift plus a halved chance of losses when the skill proxy is included, along with an estimated optimal split between measurement quality and sample size. That part illustrates the framework without overclaiming generality. The derivations appear to rest on standard primitives rather than circular fitting, and the bounds are presented as new relative to the cited policy learning literature.

One limitation is that the whole setup treats the precision of the latent trait estimates as something the designer can set independently through the collection plan. If measurement error has non-classical structure or if repeats do not deliver separable precision gains once sample size is fixed, the regret expressions and optimality condition will not describe the actual worst-case performance. The paper should make clear how sensitive the results are to that separability.

Overall this is for economists doing policy learning or targeting experiments who already work with proxies for heterogeneity. It is worth sending to referees because the theoretical extension is distinct and the empirical exercise shows how the bounds can be used, even if the application is one case.

Referee Report

2 major / 2 minor

Summary. The manuscript develops a theoretical framework for policy learning with unobserved heterogeneity, deriving rate-sharp regret bounds for assignment rules that include or exclude estimated latent traits and highlighting trade-offs with policy space complexity. It provides a sufficient condition for a data collection plan (balancing repeated measurements for precision against sample size) to be minimax optimal. In an empirical application to targeting cash transfers in development economics, including a proxy for entrepreneurs' business skills increases welfare by 5% and halves the probability of welfare losses; the paper also estimates the optimal allocation of resources between measurement precision and sample size.

Significance. If the results hold, the paper contributes to policy learning and targeting literatures by modeling the data collection trade-off explicitly, with rate-sharp bounds and a minimax optimality condition that could inform practical design. The empirical illustration of a 5% welfare gain is a concrete strength, though its policy relevance hinges on the separability assumption between precision gains and sample size.

major comments (2)

[Theoretical framework and regret bounds derivation] Theoretical derivation of regret bounds and minimax condition: The rate-sharp regret bounds and sufficient condition for minimax optimality assume that latent trait estimation precision is controllable independently via repeated measurements, separable from sample size effects. This separability may fail under non-classical measurement error or correlated errors (common in development data), which would invalidate the worst-case performance characterization and the derived trade-offs. The empirical application estimates the optimal allocation under this assumption but reports no robustness checks, making the 5% welfare gain and optimality conclusion load-bearing on an unverified premise.
[Empirical application section] Empirical application: The reported 5% welfare increase and halved welfare loss probability rely on the proxy's precision being adjustable separately from sample size. Without checks for non-separability or non-classical errors, these estimates do not fully support the claim that the collection plan is minimax optimal in practice.

minor comments (2)

[Abstract] Abstract: The phrasing of 'new trade-offs with the policy space complexity' is vague; a brief parenthetical on the specific complexity measure would improve clarity.
[Introduction and notation] Notation consistency: Terms like 'rate-sharp' and 'minimax optimal' should be defined at first use with reference to the relevant equations for reader accessibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major point below, clarifying the role of the separability assumption in the theoretical results and its implications for the empirical illustration. We will incorporate additional discussion and caveats in the revised version.

Referee: [Theoretical framework and regret bounds derivation] Theoretical derivation of regret bounds and minimax condition: The rate-sharp regret bounds and sufficient condition for minimax optimality assume that latent trait estimation precision is controllable independently via repeated measurements, separable from sample size effects. This separability may fail under non-classical measurement error or correlated errors (common in development data), which would invalidate the worst-case performance characterization and the derived trade-offs. The empirical application estimates the optimal allocation under this assumption but reports no robustness checks, making the 5% welfare gain and optimality conclusion load-bearing on an unverified premise.

Authors: Our theoretical framework explicitly models data collection plans in which precision of latent trait estimates is improved via repeated measurements, with this precision treated as separable from sample size to isolate the relevant trade-off and derive rate-sharp regret bounds. The sufficient condition for minimax optimality is stated with respect to this class of plans. We agree that non-classical measurement error or correlated errors could violate separability and alter the worst-case characterization; the bounds and optimality condition are derived under the maintained model rather than claimed to be robust to all possible error structures. In the revision we will add a limitations subsection discussing this assumption, its plausibility in the policy-learning setting, and directions for relaxing it. The empirical application computes the welfare gain and optimal allocation under the same maintained assumptions; we will add explicit language noting that these quantities are conditional on separability and that the minimax claim applies within the modeled environment.

revision: partial

Referee: [Empirical application section] Empirical application: The reported 5% welfare increase and halved welfare loss probability rely on the proxy's precision being adjustable separately from sample size. Without checks for non-separability or non-classical errors, these estimates do not fully support the claim that the collection plan is minimax optimal in practice.

Authors: The empirical section illustrates the theoretical framework by applying it to cash-transfer targeting and reporting the welfare gain and halved loss probability that arise when the proxy is included under the model's data-collection assumptions. We acknowledge that the current version contains no explicit robustness checks against non-separability or non-classical errors. In the revision we will (i) restate that the 5% figure and optimality conclusion are obtained under the separability maintained throughout the paper and (ii) add a short sensitivity discussion that explores how the welfare numbers change if the effective precision gain is attenuated. The core contribution of the application remains the demonstration that the theoretical objects can be computed with existing data; we do not claim the numbers are robust to every conceivable violation of the modeling assumptions.

revision: partial

Circularity Check

0 steps flagged

No circularity: derivation follows from model primitives

The paper's core contributions—rate-sharp regret bounds for policies using estimated latent traits, trade-offs with policy space complexity, and a sufficient condition for minimax-optimal data collection plans—are presented as derived from the underlying model of unobserved heterogeneity and controllable measurement precision. No equations or steps reduce by construction to fitted parameters renamed as predictions, self-definitional loops, or load-bearing self-citations. The empirical application applies these derived objects to estimate welfare gains and optimal resource allocation between measurement precision and sample size, without evidence that the bounds or optimality condition are tautological with the inputs. The analysis remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on standard domain assumptions about the existence and estimability of latent traits plus the ability to control precision via repeated measurements; no free parameters or invented entities are identifiable from the abstract.

axioms (2)

domain assumption: Latent traits exist and influence treatment responses in a manner that can be partially recovered through proxies or repeated measurements.
This underpins the entire analysis of precision effects and data collection trade-offs.
domain assumption: The policymaker can design data collection plans that vary measurement precision and sample size to achieve minimax optimality.
This is required for the sufficient condition on optimal collection plans.

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages · 1 internal anchor

[1] Emily Breza, Arun G. Chandrasekhar, and Davide Viviano. Generalizability with ignorance in mind: learning what we do (not) know for archetypes discovery,

[2] Davide Viviano and Jelena Bradic. Fair Policy Targeting. Journal of the American Statistical Association, 119(545):730–743, January

[3] Xiaohong Chen, Han Hong, and Denis Nekipelov. Nonlinear models of measurement errors. Journal of Economic Literature, 49(4):901–37, December

[4] Susanne M. Schennach. Recent advances in the measurement error literature. Annual Review of Economics, 8:341–377, 2016.

[5] Jeff Dominitz and Charles F. Manski. More data or better data? A statistical decision problem. The Review of Economic Studies, 84(4):1583–1605, 02

[6] Wassily Hoeffding. Probability Inequalities for Sums of Bounded Random Variables. Journal of the American Statistical Association, 58(301):13–30,

[7] Stefanie Stantcheva. How to run surveys: A guide to creating your own identifying variation and revealing the invisible. Annual Review of Economics, 15:205–234, 2023.
[8] Appendix A, Additional Figures. Figure A1: Original Measure vs t-Measurements. Panels: 1-Measurements vs. Original, 2-Measurements vs. Original, 3-Measurements vs. Original, 4-Measurements vs. ... Each panel plots the data with a linear fit; axes are labeled Original and t-Measurements, with ticks from 1 to 5.
[9] Proof of Theorem 2. Define the minimax risk

$R_n := \inf_{\{\hat G_\theta(X_i)\}} \sup_{P \in \mathcal{P}(\sigma_0)} R(\hat G_\theta(X_i))$. (B28)

We establish two separate lower bounds on $R_n$: one arising from approximation error and one from estimation error, and then combine them. Step 1: Approximation-error lower bound. Fix $\sigma_0 > 0$ and consider the following data-generating process $P_\sigma$. Let $X_i \sim N(\mu_x, \sigma_x^2)$, i.i...

[10] Consider the event $E := \{|\hat A_i| < \rho\}$. Under this rule, the realized outcome equals $+M/2$, hence

$W(G^*_\theta(X_i, A_i)) = \frac{M}{2}$. (B78)

For any policy $G(Z_i)$,

$Y_i(1) G(Z_i) + Y_i(0)(1 - G(Z_i)) = \frac{M}{2}\,\mathrm{sign}(A_i)\, G(Z_i) - \frac{M}{2}\,\mathrm{sign}(A_i)\,(1 - G(Z_i))$ (B79)
$= \frac{M}{2}\,\mathrm{sign}(A_i)\,(2 G(Z_i) - 1)$. (B80)

Therefore, for any policy $G_\theta(X_i, \hat A_i)$,

$W(G_\theta(X_i, \hat A_i)) = E_P\left[ \frac{M}{2}\,\mathrm{sign}(A_i)\,(2 G_\theta(X_i, \hat A_i) - 1) \right]$ (B81)

by developing the expectation: (B82) $= \frac{M}{2} ( ...$

[11] Hence, in the repeated-measurement case, Assumption 4 is satisfied with the envelope

$h(t) = b_0 + \frac{m_0}{\sqrt{t}}$. (B122)

C.2 External Data-Dependent Proxy. Assumption 2B (External data-dependent $\hat A_i$).
1. Estimate Representation: Let $\hat A_i$ be written as $\hat A_i = \hat f_m(X_i)$.
2. External Estimator: $\hat f_m : \mathcal{X} \to \hat{\mathcal{A}}$ is learned on an auxiliary sample $S_m := \{(Y_i(0), X_i)\}_{i=1}^{m} \perp S_n$ and then tr...

[12] The oracle rule that observes $A_i$ treats iff $A_i \ge 0$ and attains

$W(G^*_\theta(X_i, A_i)) = \frac{M}{2}$. (B161)

Given only $(X_i, \hat A_i)$, the Bayes-optimal feasible rule is

$G^*_\theta(X_i, \hat A_i) = 1\{\hat A_i \ge 0\} = 1\{m(X_i) \ge 0\}$. (B162)

We compute its misclassification probability. Conditional on $m := m(X_i)$,

$P(A_i < 0 \mid m) = P(m + U_i < 0 \mid m) = \begin{cases} \frac{r - m}{2r}, & 0 \le m < r, \\ 0, & m \ge r \end{cases}$ (B163)

and symmetrically for $m < 0$. Hence P...
