Recognition: 2 theorem links
Bounded by Risk, Not Capability: Quantifying AI Occupational Substitution Rates via a Tech-Risk Dual-Factor Model
Pith reviewed 2026-05-10 20:06 UTC · model grok-4.3
The pith
AI occupational substitution is bounded by institutional risk and liability rather than technical capability alone.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By deconstructing 923 occupations into 2,087 detailed work activities and scoring each for technical feasibility and business risk via a multi-agent LLM ensemble with variance-based expert validation, we compute Relative Occupational Automation Indices (OAI) for the U.S. labor market. Non-routine cognitive roles highly dependent on symbolic manipulation face OAI around 0.70, while unstructured physical trades and high-stakes caretaking roles exhibit absolute resilience. This quantifies a cognitive risk asymmetry and motivates a hypothesized compliance premium in which wages increasingly reward risk-absorption capacity.
What carries the argument
Tech-Risk Dual-Factor Model: a framework that separately scores technical feasibility and institutional business risk for each detailed work activity before aggregating into Relative Occupational Automation Indices.
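A minimal sketch of how such a dual-factor pipeline could produce an OAI, assuming a hypothetical piecewise map, risk-veto threshold, and invented scores (the paper's exact mapping matrix and task weights are not reproduced here):

```python
def activity_index(t, r):
    """Hypothetical f(T, R): snap technical feasibility down to the
    paper's 0/0.3/0.5/0.7/1.0 tiers, with an assumed liability veto
    for activities whose business risk exceeds 0.7."""
    if r > 0.7:  # assumed risk veto: too risky to deploy at all
        return 0.0
    return max(tier for tier in (0.0, 0.3, 0.5, 0.7, 1.0) if tier <= t)

def task_index(dwa_indices):
    # bottleneck aggregation: a task is only as automatable as its
    # least automatable detailed work activity
    return min(dwa_indices)

def occupation_oai(task_indices, weights):
    # importance-weighted average over the occupation's tasks
    total = sum(weights)
    return sum(w / total * a for w, a in zip(weights, task_indices))

# Two stylised tasks: one symbolic (high T, low R), one high-stakes.
symbolic = task_index([activity_index(0.9, 0.2), activity_index(0.8, 0.1)])
physical = task_index([activity_index(0.6, 0.9), activity_index(0.95, 0.3)])
oai = occupation_oai([symbolic, physical], weights=[2, 1])
```

With these invented numbers the risk veto on a single activity zeroes out the high-stakes task, while the symbolic task lands at 0.7, echoing the asymmetry the paper reports.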
If this is right
- Non-routine cognitive roles face substantially higher substitution exposure than previously estimated under capability-only models.
- Unstructured physical trades and high-stakes caretaking roles remain resilient due to risk factors.
- The traditional routine-biased technological change hypothesis does not fully explain observed patterns.
- Wage resilience will increasingly correlate with an occupation's capacity to absorb compliance and liability risks.
- The indices provide a cross-sectional diagnostic usable for subsequent dynamic modeling of labor reallocation.
Where Pith is reading between the lines
- Sectors with heavy regulation may see slower AI adoption regardless of technical readiness.
- Retraining programs could prioritize risk-management and compliance skills alongside technical ones.
- The model implies that liability insurance markets may become a key bottleneck or enabler for AI deployment.
Load-bearing premise
The multi-agent LLM ensemble plus expert panel validation can reliably quantify both technical feasibility and the institutional premium of professional liability across work activities without systematic scoring bias.
What would settle it
Longitudinal employment, wage, and hiring data for data scientists versus construction workers or nurses over the next several years. If the former group shows no greater AI-driven displacement, the predicted asymmetry fails; if it does, the cognitive risk asymmetry is corroborated.
original abstract
The deployment of Large Language Models (LLMs) has ignited concerns about technological unemployment. Existing task-based evaluations predominantly measure theoretical "exposure" to AI capabilities, ignoring critical frictions of real-world commercial adoption: liability, compliance, and physical safety. We argue occupations are not eradicated instantaneously, but gradually encroached upon via atomic actions. We introduce a Tech-Risk Dual-Factor Model to re-evaluate this. By deconstructing 923 occupations into 2,087 Detailed Work Activities (DWAs), we utilize a multi-agent LLM ensemble to score both technical feasibility and business risk. Through variance-based Human-in-the-Loop (HITL) validation with an expert panel, we demonstrate a profound cognitive gap: isolated algorithmic probabilities fail to encapsulate the "institutional premium" imposed by experts bounded by professional liability. Applying a strictly algorithmic baseline via mathematical bottleneck aggregation, we calculate Relative Occupational Automation Indices ($OAI$) for the U.S. labor market. Our findings challenge the traditional Routine-Biased Technological Change (RBTC) hypothesis. Non-routine cognitive roles highly dependent on symbolic manipulation (e.g., Data Scientists) face unprecedented exposure ($OAI \approx 0.70$). Conversely, unstructured physical trades and high-stakes caretaking roles exhibit absolute resilience, quantifying a profound "Cognitive Risk Asymmetry." We hypothesize the emergent necessity of a "Compliance Premium," indicating wage resilience increasingly tied to risk-absorption capacity. We frame these findings as a cross-sectional diagnostic of systemic vulnerability, establishing a foundation for subsequent Computable General Equilibrium (CGE) econometric modeling involving dynamic wage elasticity and structural labor reallocation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the Tech-Risk Dual-Factor Model, which decomposes 923 U.S. occupations into 2,087 Detailed Work Activities (DWAs). It employs a multi-agent LLM ensemble to score technical feasibility and business risk for each DWA, followed by variance-based Human-in-the-Loop (HITL) validation with an expert panel. Using mathematical bottleneck aggregation, it derives Relative Occupational Automation Indices (OAI) that purportedly reveal a 'Cognitive Risk Asymmetry,' with high OAI values (≈0.70) for non-routine cognitive roles like Data Scientists and near-zero values for physical trades and high-stakes caretaking, challenging the Routine-Biased Technological Change (RBTC) hypothesis and positing a 'Compliance Premium' in wages.
Significance. If the OAI values prove robust after validation, this work would supply a useful cross-sectional diagnostic of AI labor-market exposure that incorporates institutional frictions (liability, compliance) absent from prior capability-only metrics. The atomic DWA decomposition and dual-factor scoring could serve as input for subsequent CGE models of wage elasticity and structural reallocation, while the hypothesized Compliance Premium offers a testable link between risk-absorption capacity and occupational resilience.
major comments (4)
- [Abstract] Abstract: The reported OAI values (e.g., ≈0.70 for Data Scientists) are presented without accompanying quantitative validation metrics, error bars, inter-rater agreement statistics, or sensitivity analyses for the LLM ensemble scores and variance-based expert adjustments. This absence leaves the central claim of Cognitive Risk Asymmetry unsupported by visible evidence.
- [Methods] Methods (Tech-Risk Dual-Factor Model and HITL validation): The variance-based HITL validation is asserted to isolate the 'institutional premium' of professional liability, yet no agreement metrics (Fleiss' kappa, ICC) or calibration against external observables (adoption rates, insurance premia, regulatory barriers) are reported. Without these, systematic LLM bias in weighting symbolic versus physical tasks cannot be ruled out.
- [Results] Results (OAI calculation via bottleneck aggregation): The aggregation is described as strictly algorithmic and baseline, but the upstream LLM-generated ratings and expert adjustments introduce unquantified dependence; the paper provides no robustness checks to prompt variations, ensemble composition, or alternative aggregation rules that would confirm the reported asymmetry is not an artifact of the scoring pipeline.
- [Discussion] Discussion (challenge to RBTC): The claim that findings overturn Routine-Biased Technological Change rests on the OAI differential between cognitive and physical roles; however, absent falsification tests or correlation with real-world substitution data, the asymmetry remains an unvalidated modeling output rather than an empirical refutation.
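The robustness concern in the third comment can be probed concretely even before re-running the pipeline: on invented DWA-level indices, swapping the min bottleneck for mean pooling shows how much of the cognitive/physical gap the aggregation rule itself contributes (all numbers hypothetical):

```python
def bottleneck(indices):
    # the paper's rule: a task's index is the min over its DWA indices
    return min(indices)

def mean_pool(indices):
    # alternative aggregation rule the referee asks to be checked
    return sum(indices) / len(indices)

# Invented DWA-level automation indices for two stylised occupations:
data_scientist = [0.7, 0.9, 0.7, 1.0]   # uniformly high symbolic DWAs
electrician = [0.9, 0.1, 0.0, 0.3]      # one hard physical bottleneck

gap_min = bottleneck(data_scientist) - bottleneck(electrician)   # 0.7
gap_mean = mean_pool(data_scientist) - mean_pool(electrician)    # ≈ 0.5
```

On these toy numbers the asymmetry survives under both rules, but its magnitude moves, which is exactly why the sensitivity analysis matters.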
minor comments (2)
- [Abstract] Abstract: The mathematical definition of the bottleneck aggregation used to obtain OAI is not supplied, nor is the precise formula for combining the two risk factors; a compact equation should appear on first use.
- [Notation] Notation: Ensure all acronyms (OAI, DWA, HITL, RBTC, CGE) are defined at first appearance and used consistently; the term 'Compliance Premium' is introduced as a hypothesis but lacks an operational definition.
Simulated Author's Rebuttal
We thank the referee for these constructive comments, which help strengthen the transparency of the Tech-Risk Dual-Factor Model. We address each major point below and indicate revisions to the manuscript.
point-by-point responses
- Referee: [Abstract] Abstract: The reported OAI values (e.g., ≈0.70 for Data Scientists) are presented without accompanying quantitative validation metrics, error bars, inter-rater agreement statistics, or sensitivity analyses for the LLM ensemble scores and variance-based expert adjustments. This absence leaves the central claim of Cognitive Risk Asymmetry unsupported by visible evidence.
Authors: We agree the abstract omits these supporting statistics. The Methods section details the variance-based HITL process, but the revised manuscript will add a concise summary of key metrics (e.g., average inter-rater agreement and ensemble sensitivity ranges) directly into the abstract to better substantiate the OAI values and Cognitive Risk Asymmetry. revision: yes
- Referee: [Methods] Methods (Tech-Risk Dual-Factor Model and HITL validation): The variance-based HITL validation is asserted to isolate the 'institutional premium' of professional liability, yet no agreement metrics (Fleiss' kappa, ICC) or calibration against external observables (adoption rates, insurance premia, regulatory barriers) are reported. Without these, systematic LLM bias in weighting symbolic versus physical tasks cannot be ruled out.
Authors: The variance-based HITL is designed to surface institutional factors beyond LLM scores. In revision we will report Fleiss' kappa and ICC for the expert panel. Full calibration to external observables such as insurance premia is not feasible with currently available public data and will be noted as a limitation; we will also discuss steps to test for LLM bias in future extensions. revision: partial
- Referee: [Results] Results (OAI calculation via bottleneck aggregation): The aggregation is described as strictly algorithmic and baseline, but the upstream LLM-generated ratings and expert adjustments introduce unquantified dependence; the paper provides no robustness checks to prompt variations, ensemble composition, or alternative aggregation rules that would confirm the reported asymmetry is not an artifact of the scoring pipeline.
Authors: We will add explicit robustness checks to the Results section, including re-runs with varied prompts, altered ensemble sizes, and alternative aggregation rules (e.g., mean pooling). These will show that the reported Cognitive Risk Asymmetry remains stable, confirming it is not an artifact of the pipeline. revision: yes
- Referee: [Discussion] Discussion (challenge to RBTC): The claim that findings overturn Routine-Biased Technological Change rests on the OAI differential between cognitive and physical roles; however, absent falsification tests or correlation with real-world substitution data, the asymmetry remains an unvalidated modeling output rather than an empirical refutation.
Authors: The manuscript frames OAI as a cross-sectional diagnostic rather than a completed empirical refutation of RBTC. The differential is produced by the dual-factor DWA scoring. We will expand the Discussion with explicit caveats on the modeling basis and the requirement for future empirical tests against substitution data, while retaining the contrast with capability-only metrics. revision: partial
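Both sides agree that inter-rater agreement should be reported. Fleiss' kappa, the statistic the rebuttal promises, is simple to compute from a subjects-by-categories count matrix; a self-contained sketch with invented expert ratings:

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for N subjects rated into k categories by a
    fixed number of raters n (each row of `counts` sums to n)."""
    N = len(counts)
    n = sum(counts[0])           # raters per subject, assumed constant
    k = len(counts[0])
    # marginal category proportions p_j
    p = [sum(row[j] for row in counts) / (N * n) for j in range(k)]
    # per-subject observed agreement P_i
    P = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    p_bar = sum(P) / N           # mean observed agreement
    p_e = sum(x * x for x in p)  # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)

# Three hypothetical experts rating four DWAs into low/medium/high risk:
ratings = [[3, 0, 0], [0, 3, 0], [2, 1, 0], [0, 1, 2]]
kappa = fleiss_kappa(ratings)
```

Perfect agreement yields kappa = 1; the invented ratings above land near 0.47, the moderate range where variance-based escalation to human experts is most informative.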
Circularity Check
No significant circularity; derivation is self-contained
full rationale
The paper defines OAI via deconstruction of occupations into DWAs, LLM-ensemble scoring of feasibility and risk factors, variance-based expert validation, and subsequent bottleneck aggregation. This sequence constructs a new index from scored inputs rather than reducing the reported Cognitive Risk Asymmetry or OAI values back to those inputs by definition or self-citation. No equations are shown that equate the final index to its scoring step tautologically, no parameters are fitted to a data subset and relabeled as predictions, and no load-bearing uniqueness theorems or ansatzes are imported from the authors' prior work. The central claims rest on the empirical distribution of the derived scores across occupations, which remains independent of the aggregation formula itself.
Axiom & Free-Parameter Ledger
axioms (3)
- domain assumption: Occupations can be decomposed into 2,087 representative Detailed Work Activities without loss of critical context
- domain assumption: Multi-agent LLM ensembles produce unbiased scores for both technical feasibility and business risk
- ad hoc to paper: Variance-based HITL validation with an expert panel captures the institutional premium of professional liability
invented entities (3)
- Tech-Risk Dual-Factor Model (no independent evidence)
- Relative Occupational Automation Indices (OAI) (no independent evidence)
- Compliance Premium (no independent evidence)
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean : washburn_uniqueness_aczel (unclear)
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Mapped passage: Tech-Risk Dual-Factor Mapping Matrix... AI(DWA_i) = f(T_i, R_i), piecewise with thresholds 0/0.3/0.5/0.7/1.0; bottleneck AI(t_j) = min_d AI(d); OAI(o_k) = Σ_t w_t · AI(t)
- IndisputableMonolith/Foundation/RealityFromDistinction.lean : reality_from_one_distinction (unclear)
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Mapped passage: Cognitive Risk Asymmetry... Institutional Premium from liability and loss aversion
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.