arxiv: 2606.25668 · v1 · pith:YRJX5VMRnew · submitted 2026-06-24 · 💻 cs.CY

Bridging Predictions and Interventions: An Integrated Framework for Automated Decision-Systems

Pith reviewed 2026-06-25 19:05 UTC · model grok-4.3

classification 💻 cs.CY
keywords automated decision systems · predictions · interventions · organizational workflows · decision-making processes · criminal pretrial release · clinical triage · societal consequences

The pith

Automated decision systems require shifting from prediction accuracy to an intervention-oriented framework because predictions change how organizations assess and decide.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the common focus on improving predictive accuracy in automated decision systems overlooks how those predictions alter organizational workflows, assessments, and processes. Real-world examples from criminal pretrial release and clinical triage show that accuracy gains do not reliably produce better downstream outcomes. The authors therefore propose an integrated framework that treats ADS as interventions within social systems rather than isolated prediction tools. This matters because it reframes design, evaluation, and deployment priorities toward anticipating real organizational and societal effects.

Core claim

Introducing individual predictions into decision-making modifies organizational workflows, assessment, and decision-making processes in ways that require a complete re-consideration of our approach to the design, evaluation, and deployment of ADS, shifting priorities from a purely prediction-based paradigm toward an intervention-oriented view that accounts for real-world conditions.

What carries the argument

An integrated framework that bridges predictions and interventions by treating automated decision systems as elements embedded in organizational and social processes rather than standalone predictors.

If this is right

  • Evaluation of ADS must include effects on organizational decision processes, not only predictive metrics.
  • Design priorities shift toward anticipating how predictions will be used as interventions.
  • Deployment strategies require modeling downstream societal and organizational consequences before rollout.
  • Research focus moves from accuracy benchmarks to integrated study of prediction-plus-intervention effects.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The framework could be tested by comparing two otherwise identical ADS deployments that differ only in whether they incorporate workflow redesign.
  • This view connects to questions in public administration about how new information technologies reshape bureaucratic routines.
  • If the claim holds, regulators might require evidence of intervention effects rather than accuracy alone when approving high-stakes systems.

Load-bearing premise

Real-world case studies demonstrate that improved predictive accuracy is far from the main factor needed for better downstream outcomes.

What would settle it

A controlled deployment in one of the studied settings where raising predictive accuracy produces measurably better organizational or individual outcomes without any accompanying changes to workflows or assessment procedures.

Figures

Figures reproduced from arXiv: 2606.25668 by Amanda Coston, Angela Zhou, Angelina Wang, Ashia Wilson, Avi Feller, Ben Laufer, Berk Ustun, Bryan Wilder, Daniel Ho, Daniel Malinsky, Eli Ben-Michael, Ezinne Nwankwo, Hammaad Adam, Inioluwa Deborah Raji, Jessica Hullman, Joshua Loftus, Juan Carlos Perdomo, Kosuke Imai, Lily Hu, Luke Guerdan, Lydia T. Liu, Mark Sendak, Matthew Salganik, Michael Zanger-Tishler, Razieh Nabi, Sayash Kapoor, Shion Guha, Simone Zhang, Suresh Venkatasubramanian, Talia Gillis.

Figure 1: Example of predictive ADS and the decisions they inform.
Figure 2: From prediction to intervention in automated decision systems. Predictive risk scores R, generated from individual data covariates X, do not affect outcomes Y directly. They operate through assessments Ŷ, which lead to decisions D under an institutional policy change Z, such as the introduction of an ADS into an existing bureaucratic workflow. • (Z, D): Observational Albright [2019], Angelova et al. [2…
Figure 3: ADS lifecycle before and after our integrated view.
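
The chain described in the Figure 2 caption (X to R to Ŷ to D to Y, under a policy change Z) can be made concrete with a toy simulation. This is a sketch for illustration only, not the authors' model: the function name `simulate`, the Gaussian noise model, the `reliance` weight (a hypothetical stand-in for how the workflow under Z blends the score into the human assessment), and the outcome definition are all assumptions introduced here.

```python
import random


def simulate(noise_sd, reliance, n=20000, seed=0):
    """Return the mean outcome Y under the chain X -> R -> Y_hat -> D -> Y.

    noise_sd: noise in the risk score R (smaller means a more accurate score).
    reliance: weight in [0, 1] that the assessment places on R. This is a
              hypothetical knob standing in for the workflow under policy Z;
              0 means the score is ignored, 1 means it is followed fully.
    """
    rng = random.Random(seed)
    matched = 0
    for _ in range(n):
        x = rng.gauss(0, 1)              # covariate X, also the true need
        r = x + rng.gauss(0, noise_sd)   # risk score R computed from X
        own = x + rng.gauss(0, 1)        # assessor's own noisy judgment
        # Assessment Y_hat blends the score with the assessor's judgment.
        y_hat = reliance * r + (1 - reliance) * own
        d = y_hat > 0                    # decision D: intervene or not
        # Outcome Y (assumed): 1 when the decision matches the true need.
        matched += int(d == (x > 0))
    return matched / n


# Example comparison (values depend on the assumptions above):
# simulate(noise_sd=1.0, reliance=0.0) versus simulate(noise_sd=0.2, reliance=0.0)
# simulate(noise_sd=1.0, reliance=1.0) versus simulate(noise_sd=0.2, reliance=1.0)
```

Under these assumptions, lowering `noise_sd` leaves the mean outcome unchanged when `reliance` is 0, because R never reaches the assessment. The same accuracy gain only shows up in Y once the workflow actually routes the score into decisions, which is the dependence on Ŷ and D that the caption describes.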
Original abstract

Automated decision systems (ADS) leverage predictions about individual future outcomes to inform consequential decision-making in organizational settings. Across various settings - including criminal pretrial release, clinical triage, student support, and more - it is often assumed that improved predictive accuracy is the priority consideration in determining better downstream outcomes upon the deployment of ADS. In practice, real-world case studies reveal that this is far from the case: introducing individual predictions into decision-making modifies organizational workflows, assessment, and decision-making processes in ways that require a complete re-consideration of our approach to the design, evaluation, and deployment of ADS. As a result, this Perspective develops an integrated framework for studying ADS in social systems, shifting current priorities from a purely prediction-based paradigm towards an intervention-oriented view that accounts for real-world conditions. Our aim is to improve our understanding of ADS and more meaningfully anticipate its downstream societal and organizational consequences.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript is a Perspective arguing that automated decision systems (ADS) in domains such as criminal pretrial release, clinical triage, and student support do not primarily benefit from improved predictive accuracy. Instead, the introduction of individual-level predictions alters organizational workflows, assessment processes, and decision-making in ways that require a fundamental re-design of ADS. The paper develops an integrated framework that shifts emphasis from a prediction-centric paradigm to an intervention-oriented view accounting for real-world conditions, with the goal of better anticipating downstream societal and organizational consequences.

Significance. If the interpretive synthesis holds, the framework offers a useful conceptual lens for the cs.CY community by highlighting how prediction deployment interacts with organizational structures. This could encourage more context-sensitive evaluation practices beyond isolated accuracy metrics, though the absence of concrete examples limits immediate applicability.

major comments (1)
  1. [Abstract] The central assertion that 'real-world case studies reveal that this is far from the case' (i.e., that prediction accuracy is not the priority) is load-bearing for the entire argument yet remains unsupported by any specific case descriptions, data points, or citations within the manuscript. Without these details, the shift to an intervention-oriented framework rests on an unevaluated premise.
minor comments (1)
  1. The transition from observed case studies to the proposed framework would be clearer if the manuscript explicitly mapped specific workflow changes (e.g., in assessment or decision processes) to elements of the new framework.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for their detailed review and constructive feedback on our Perspective manuscript. We address the single major comment below.

Point-by-point responses
  1. Referee: [Abstract] The central assertion that 'real-world case studies reveal that this is far from the case' (i.e., that prediction accuracy is not the priority) is load-bearing for the entire argument yet remains unsupported by any specific case descriptions, data points, or citations within the manuscript. Without these details, the shift to an intervention-oriented framework rests on an unevaluated premise.

    Authors: We agree that the abstract's claim would be more robust with explicit supporting references. Although the manuscript synthesizes known patterns across the cited domains (pretrial release, clinical triage, student support), the current text does not include the specific case descriptions or citations needed to ground the assertion. In the revised version we will add targeted citations to established studies documenting workflow changes and add one or two brief illustrative examples in the abstract and introduction to substantiate the premise without altering the Perspective's conceptual focus.

    revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

Full rationale

The paper is a Perspective that presents an interpretive synthesis of real-world ADS case studies to argue that prediction accuracy is not the dominant factor for downstream outcomes and that workflows require redesign around interventions. It contains no equations, fitted parameters, mathematical derivations, or load-bearing self-citations. The central claim is advanced as an observational and conceptual reframing rather than a deductive reduction; no step reduces by construction to its own inputs or to a self-referential chain. The argument is self-contained against external benchmarks of case-study illustration.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The framework rests on the domain assumption that case studies show prediction accuracy is insufficient, with no free parameters or invented entities introduced in the abstract.

axioms (1)
  • domain assumption: Real-world case studies reveal that improved predictive accuracy is far from the priority consideration for better downstream outcomes in ADS deployment.
    Invoked directly in the abstract as the motivation for shifting to an intervention-oriented view.

pith-pipeline@v0.9.1-grok · 5808 in / 1120 out tokens · 36938 ms · 2026-06-25T19:05:44.992119+00:00 · methodology


Reference graph

Works this paper leans on

26 extracted references · 5 canonical work pages

  1. M. J. Azizi, P. Vayanos, B. Wilder, E. Rice, and M. Tambe. Designing fair, efficient, and interpretable policies for prioritizing homeless youth for housing resources. In Integration of Constraint Programming, Artificial Intelligence, and Operations Research: 15th International Conference, CPAIOR 2018, Delft, The Netherlands, June 26–29, 2018, Proceeding...
  2. M. Bao, A. Zhou, S. Zottola, B. Brubach, S. Desmarais, A. Horowitz, K. Lum, and S. Venkatasubramanian. It’s compaslicated: The messy relationship between rai datasets and algorithmic fairness benchmarks. arXiv preprint arXiv:2106.05498,
  3. http://www.fairmlbook.org. E. Ben-Michael, D. J. Greiner, M. Huang, K. Imai, Z. Jiang, and S. Shin. Does AI help humans make better decisions? A statistical evaluation framework for experimental and observational studies. Proceedings of the National Academy of Sciences, 122(38):e2505106122, 2025a. E. Ben-Michael, D. J. Greiner, K. Imai, and Z. Jiang. Safe ...
  4. A. Coston, A. Mishler, E. H. Kennedy, and A. Chouldechova. Counterfactual risk assessments, evaluation, and fairness. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 582–593,
  5. URL https://themarkup.org/machine-learning/2021/03/02/major-universities-are-using-race-as-a-high-impact-predictor-of-student-success. Accessed: 2025-02-11. T. Feathers. Takeaways from our investigation into Wisconsin’s racially inequitable dropout algorithm. The Markup, April 27
  6. URL https://themarkup.org/the-breakdown/2023/04/27/takeaways-from-our-investigation-into-wisconsins-racially-inequitable-dropout-algorithm. U. Fischer-Abaigar, C. Kern, and J. C. Perdomo. The value of prediction in identifying the worst-off. International Conference on Machine Learning,
  7. U. Fischer-Abaigar, E. Aiken, C. Kern, and J. C. Perdomo. On the meta-design of allocation problems. arXiv preprint arXiv:2602.08786,
  8. B. Green. The false promise of risk assessments: Epistemic reform and the limits of fairness. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (FAT* 2020), pages 594–606. Association for Computing Machinery,
  9. doi: 10.1145/3351095.3372869. URL https://dl.acm.org/doi/10.1145/3351095.3372869. B. Green and Y. Chen. Disparate interactions: An algorithm-in-the-loop analysis of fairness in risk assessments. In Proceedings of the conference on fairness, accountability, and transparency, pages 90–99,
  10. L. Guerdan, A. Coston, Z. S. Wu, and K. Holstein. Ground (less) truth: A causal framework for proxy labels in human-algorithm decision-making. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, pages 688–704,
  11. M. Hardt, N. Megiddo, C. Papadimitriou, and M. Wootters. Strategic classification. In Proceedings of the 2016 ACM conference on innovations in theoretical computer science, pages 111–122,
  12. L. Hu and Y. Chen. Fair classification and social welfare. In Proceedings of the 2020 conference on fairness, accountability, and transparency, pages 535–545,
  13. J. Y. Kim, W. Boag, F. Gulamali, A. Hasan, H. D. J. Hogg, M. Lifson, D. Mulligan, M. Patel, I. D. Raji, A. Sehgal, et al. Organizational governance of emerging technologies: AI adoption in healthcare. In Proceedings of the 2023 ACM conference on fairness, accountability, and transparency, pages 1396–1417,
  14. [Accessed 19-10-2025]. G. A. Klein. Sources of power: How people make decisions. MIT press,
  15. V. Lai, C. Chen, Q. V. Liao, A. Smith-Renner, and C. Tan. Towards a science of human-AI decision making: a survey of empirical studies. arXiv preprint arXiv:2112.11471,
  16. URL https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm. Accessed: 2026-05-21. B. Laufer, J. Kleinberg, K. Levy, and H. Nissenbaum. Strategic evaluation. In Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization, pages 1–12,
  17. doi: 10.1609/aaai.v38i20.30229. URL https://ojs.aaai.org/index.php/AAAI/article/view/30229. L. T. Liu, I. D. Raji, A. Zhou, L. Guerdan, J. Hullman, D. Malinsky, B. Wilder, S. Zhang, H. Adam, A. Coston, B. Laufer, E. Nwankwo, M. Zanger-Tishler, E. Ben-Michael, S. Barocas, A. Feller, M. Gerchick, T. Gillis, S. Guha, D. Ho, L. Hu, K. Imai, S. Kapoor, J. Lo...
  18. MDRC. Evaluation of Pretrial Justice System Reforms That Use the Public Safety Assessment - MDRC - mdrc.org. https://www.mdrc.org/work/publications/evaluation-pretrial-justice-system-reforms-use-public-safety-assessment-0. [Accessed 20-05-2026]. J. Perdomo, T. Zrnic, C. Mendler-Dünner, and M. Hardt. Performative prediction. In International Conference on...
  19. J. C. Perdomo, T. Britton, M. Hardt, and R. Abebe. Difficult lessons on social prediction from Wisconsin public schools. arXiv preprint arXiv:2304.06205,
  20. A. Rahmattalabi, P. Vayanos, K. Dullerud, and E. Rice. Learning resource allocation policies from observational data with an application to homeless services delivery. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, pages 1240–1256,
  21. I. D. Raji and L. Liu. Evaluating prediction-based interventions with human decision makers in mind. arXiv preprint arXiv:2503.05704,
  22. doi: 10.1145/3476089. URL https://doi.org/10.1145/3476089. D. Saxena, E. S.-Y. Moon, A. Chaurasia, Y. Guan, and S. Guha. Rethinking “risk” in algorithmic systems through a computational narrative analysis of casenotes in child-welfare. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, CHI ’23, New York, NY, USA,
  23. Association for Computing Machinery. ISBN 9781450394215. doi: 10.1145/3544548.3581308. URL https://doi.org/10.1145/3544548.3581308. J. Schuman. Supervised release is not parole. Loy. LAL Rev., 53:587,
  24. A. Xiang and I. D. Raji. On the legal compatibility of fairness definitions. arXiv preprint arXiv:1912.00761,
  25. M. Yin, J. Wortman Vaughan, and H. Wallach. Understanding the effect of accuracy on trust in machine learning models. In Proceedings of the 2019 CHI conference on human factors in computing systems, pages 1–12,
  26. doi: 10.1126/sciadv.adi8411. URL https://www.science.org/doi/abs/10.1126/sciadv.adi8411. Y. Zhang, E. Ben-Michael, and K. Imai. Safe policy learning under regression discontinuity designs. arXiv preprint arXiv:2208.13323,