Mind the Gap: Optimal and Equitable Encouragement Policies
Pith reviewed 2026-05-24 06:56 UTC · model grok-4.3
The pith
Encouragement policies should optimize and equalize induced treatment take-up rather than recommendation rates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under the covariate-conditional no-direct-effect model, the value of an encouragement policy depends on two distinct quantities: responsiveness to the recommendation and the efficacy of the treatment once it is taken up. The gain from encouraging a given individual is the product of the two. This decomposition identifies induced treatment take-up as the natural fairness target and yields tractable characterizations of optimal policies under budget or access constraints. In deterministic recommendation settings, the same model confines overlap-robustness concerns to the recommendation-response model rather than the outcome model.
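The decomposition can be written out as a short worked equation. The notation here is assumed for illustration and does not appear in the summary: X denotes covariates, R the binary recommendation, T the binary treatment, π(x) the probability of recommending at covariate value x, p_{1|r}(x) = P(T = 1 | R = r, X = x) the take-up probability, and μ_t(x) = E[Y(t) | X = x] the mean outcome under treatment status t. The sketch further assumes that take-up is unconfounded given covariates and recommendation.

```latex
\begin{align*}
V(\pi)
  &= \mathbb{E}\Big[\pi(X)\big(\mu_0(X) + p_{1\mid 1}(X)\,(\mu_1(X)-\mu_0(X))\big)
     + (1-\pi(X))\big(\mu_0(X) + p_{1\mid 0}(X)\,(\mu_1(X)-\mu_0(X))\big)\Big] \\
  &= \underbrace{\mathbb{E}\big[\mu_0(X) + p_{1\mid 0}(X)\,(\mu_1(X)-\mu_0(X))\big]}_{\text{value of never recommending}}
     + \mathbb{E}\Big[\pi(X)\,
       \underbrace{\big(p_{1\mid 1}(X)-p_{1\mid 0}(X)\big)}_{\text{responsiveness}}\,
       \underbrace{\big(\mu_1(X)-\mu_0(X)\big)}_{\text{efficacy}}\Big].
\end{align*}
```

Under these assumptions the policy enters the value only through the second term, so recommending helps where responsiveness and efficacy have the same sign.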
What carries the argument
The covariate-conditional no-direct-effect model of encouragement, which states that recommendations affect outcomes only by changing treatment adoption and carry no direct effect once covariates are held fixed.
If this is right
- Induced treatment take-up, not recommendation frequency, becomes the fairness criterion that must be equalized across groups.
- Optimal policies admit tractable characterizations when total encouragement volume or access is constrained by budgets.
- Overlap robustness in deterministic recommendation regimes localizes entirely to the recommendation-response model.
- Policy design must separately estimate responsiveness and treatment efficacy rather than treating them as a single combined quantity.
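A minimal sketch of how the two estimated quantities might be combined under a budget on the share of people who can be encouraged. It assumes responsiveness and efficacy have already been estimated per individual. The function name `budgeted_encouragement`, its arguments, and the toy numbers are hypothetical, and the top-k rule is the standard solution to a unit-cost budget problem rather than the paper's own characterization.

```python
import numpy as np

def budgeted_encouragement(responsiveness, efficacy, budget_frac):
    """Return a 0/1 recommendation vector (hypothetical helper).

    responsiveness: estimated change in take-up probability when recommended.
    efficacy:       estimated treatment effect on the outcome.
    budget_frac:    maximum share of the population that may be recommended.
    """
    gain = np.asarray(responsiveness, float) * np.asarray(efficacy, float)
    n = gain.size
    k = int(np.floor(budget_frac * n))          # number of recommendations allowed
    order = np.argsort(-gain, kind="stable")    # largest per-unit gain first
    chosen = order[:k]
    chosen = chosen[gain[chosen] > 0]           # never spend budget on non-positive gain
    policy = np.zeros(n, dtype=int)
    policy[chosen] = 1
    return policy

# Toy inputs, invented for illustration only.
resp = [0.8, 0.1, 0.3, 0.0]
eff = [1.0, 2.0, -1.0, 3.0]
policy_half = budgeted_encouragement(resp, eff, budget_frac=0.5)
policy_three_quarters = budgeted_encouragement(resp, eff, budget_frac=0.75)
```

In the toy data the fourth individual has the largest treatment effect but zero responsiveness, so the rule does not recommend to them even when budget remains. That is the practical reason to keep the two quantities separate.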
Where Pith is reading between the lines
- Data collection efforts would need to track actual take-up rates in addition to recommendation logs to apply the fairness criterion.
- The separation of responsiveness and efficacy could be tested in other recommendation settings such as health or education interventions where adherence is voluntary.
- If the model holds, existing fairness audits focused only on recommendation parity would systematically miss disparities in realized treatment.
- Budget-constrained characterizations might extend to sequential decision settings where recommendations can be adjusted over time.
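The contrast between recommendation parity and take-up parity can be made concrete with a small audit sketch. It assumes a log that records, for each individual, a group label, whether a recommendation was issued, and whether treatment was actually taken. The function `parity_gaps` and the toy log are hypothetical, and realized take-up rates stand in for induced take-up here.

```python
import numpy as np

def parity_gaps(group, recommended, took_treatment):
    """Compare the largest between-group gap in recommendation rates
    with the largest between-group gap in realized take-up rates."""
    group = np.asarray(group)
    rec = np.asarray(recommended, float)
    took = np.asarray(took_treatment, float)
    labels = np.unique(group)
    rec_rates = [rec[group == g].mean() for g in labels]
    take_rates = [took[group == g].mean() for g in labels]
    return {
        "recommendation_gap": float(max(rec_rates) - min(rec_rates)),
        "takeup_gap": float(max(take_rates) - min(take_rates)),
    }

# Toy log, invented for illustration: both groups are recommended at the
# same rate, but group "b" converts fewer recommendations into take-up.
group = ["a", "a", "a", "a", "b", "b", "b", "b"]
recommended = [1, 1, 0, 0, 1, 1, 0, 0]
took = [1, 1, 0, 0, 1, 0, 0, 0]
gaps = parity_gaps(group, recommended, took)
```

An audit that looked only at `recommendation_gap` would report parity on this log, while `takeup_gap` shows a disparity in who actually receives treatment.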
Load-bearing premise
Encouragement influences outcomes solely through its effect on whether individuals actually take the treatment, with no remaining direct effect once covariates are controlled.
What would settle it
Observing that a recommendation still changes outcomes for individuals who do not change their treatment status, after conditioning on the same covariates, would falsify the central modeling assumption.
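One way such a check might look in practice is a stratified contrast: within each covariate cell and each realized treatment status, compare mean outcomes between recommended and non-recommended individuals. This is a sketch under a loudly stated extra assumption, namely that take-up is unconfounded given covariates and recommendation. Without it, conditioning on realized treatment can itself create differences. The function `direct_effect_contrasts` and the toy data are hypothetical.

```python
import numpy as np

def direct_effect_contrasts(x_cell, rec, treat, outcome):
    """Mean outcome difference (recommended minus not recommended) within
    each (covariate cell, treatment status) stratum. Values far from zero
    would be evidence against the no-direct-effect assumption, given the
    additional unconfoundedness assumption stated above."""
    x_cell, rec, treat, outcome = map(np.asarray, (x_cell, rec, treat, outcome))
    contrasts = {}
    for c in np.unique(x_cell):
        for t in (0, 1):
            stratum = (x_cell == c) & (treat == t)
            y_rec = outcome[stratum & (rec == 1)]
            y_norec = outcome[stratum & (rec == 0)]
            if y_rec.size and y_norec.size:      # skip strata missing an arm
                contrasts[(c.item(), t)] = float(y_rec.mean() - y_norec.mean())
    return contrasts

# Toy data, invented for illustration: one covariate cell, eight individuals.
x_cell = [0, 0, 0, 0, 0, 0, 0, 0]
rec = [1, 1, 0, 0, 1, 1, 0, 0]
treat = [1, 1, 1, 1, 0, 0, 0, 0]
outcome = [3, 3, 3, 3, 2, 1, 1, 1]
contrasts = direct_effect_contrasts(x_cell, rec, treat, outcome)
```

A real analysis would add uncertainty quantification. The sketch only shows which contrasts the falsification argument points at.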
Original abstract
In consequential domains, it is often impossible to compel individuals to take treatment, so that optimal policy rules are merely suggestions in the presence of human non-adherence to treatment recommendations. We study personalized decision problems in which the planner controls recommendations into treatment rather than treatment itself. Under a covariate-conditional no-direct-effect model of encouragement, policy value depends on two distinct objects: responsiveness to encouragement and treatment efficacy. This modeling distinction makes induced treatment take-up, rather than recommendation rates alone, the natural fairness target and yields tractable policy characterizations under budget and access constraints. In settings with deterministic algorithmic recommendations, the same model localizes overlap-robustness to the recommendation-response model rather than the downstream outcome model. We illustrate the methods in case studies based on data from reminders of SNAP benefits recertification, and from pretrial supervised release with electronic monitoring. While the specific remedy to inequities in algorithmic allocation is context-specific, it requires studying both take-up of decisions and downstream outcomes of them.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a framework for optimal and equitable encouragement policies when treatment cannot be compelled. Under a covariate-conditional no-direct-effect model of encouragement, policy value is shown to depend separately on responsiveness to encouragement and treatment efficacy; this distinction implies that fairness should target induced treatment take-up rather than recommendation rates alone. The paper derives tractable policy characterizations under budget and access constraints, localizes overlap-robustness to the recommendation-response model in deterministic settings, and illustrates the approach with case studies on SNAP benefit recertification reminders and pretrial supervised release with electronic monitoring.
Significance. Conditional on the maintained no-direct-effect assumption, the separation of responsiveness and efficacy parameters supplies a principled basis for fairness considerations in encouragement settings and yields explicit characterizations that could inform policy design. The localization of robustness and the emphasis on take-up as the fairness target are potentially useful for causal policy learning applications.
Minor comments (2)
- [Abstract and main theoretical sections] The abstract states that the model 'yields tractable policy characterizations' and 'localizes overlap-robustness'; the main text should include explicit theorem or proposition numbers that state these characterizations so readers can verify the claimed tractability directly.
- [Case studies section] The case studies are described as illustrations rather than tests; the text should clarify whether any sensitivity analysis to the no-direct-effect assumption (e.g., via alternative outcome models) was conducted or is left for future work.
Simulated Author's Rebuttal
We thank the referee for the positive summary of our work and the recommendation of minor revision. No major comments were listed in the report, so there are no specific points requiring point-by-point response. We remain available to address any minor issues that may be identified.
Circularity Check
No significant circularity; derivation conditional on stated model
The paper explicitly conditions all results on the covariate-conditional no-direct-effect model of encouragement. Under this modeling assumption the separation of policy value into responsiveness and efficacy parameters follows directly by definition, and subsequent policy characterizations under budget/access constraints are standard optimization steps. No equations or claims reduce by construction to fitted parameters, self-citations, or renamed known results. The provided abstract and reader summary contain no load-bearing self-referential steps; the work is self-contained given its explicit assumption.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: covariate-conditional no-direct-effect model of encouragement.