How Exposed Are UK Jobs to Generative AI? Developing and Applying a Novel Task-Based Index
Pith reviewed 2026-05-19 02:52 UTC · model grok-4.3
The pith
A new index finds that 94% of UK jobs have some exposure to large language models but only 13% are heavily exposed.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The Generative AI Susceptibility Index measures the share of a job's tasks that large language models can complete at least 25% faster than existing tools. When applied to British Skills and Employment Survey data, the index shows 94% of UK jobs now carry some exposure while only 13% exceed a heavy-exposure threshold of 0.5, with the highest values concentrated in scientific and technical professions. Overall exposure rose by about 16% of a standard deviation between 2017 and 2023/24, driven by occupational shifts rather than within-job task changes, and the wage premium attached to exposed tasks declined 12% over the same period.
What carries the argument
The Generative AI Susceptibility Index (GAISI), which scores each job as the share of its tasks that LLMs rate as completable at least 25% faster than existing tools, constructed by linking probabilistic LLM ratings to worker-reported task data from the British Skills and Employment Surveys.
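To make the construction concrete, here is a minimal sketch of how a job-level score of this kind could be computed. It is not the authors' code. The function names, the rule that a task counts as exposed when its rated probability is at least `task_cutoff`, and the reading of "some exposure" as any score above zero are all assumptions made for illustration; the source only states that the index is the share of tasks meeting the 25% criterion and that heavy exposure means GAISI > 0.5.

```python
from typing import Sequence


def gaisi(task_probs: Sequence[float], task_cutoff: float = 0.5) -> float:
    """Share of a job's tasks counted as exposed.

    task_probs: one LLM-rated exposure probability per task the worker reports.
    task_cutoff: hypothetical rule for turning a probability into a yes/no flag.
    """
    if not task_probs:
        raise ValueError("a job needs at least one rated task")
    exposed = [p >= task_cutoff for p in task_probs]
    return sum(exposed) / len(exposed)


def exposure_band(score: float, heavy_cutoff: float = 0.5) -> str:
    """Label a job; 'some' meaning any score above zero is an assumption."""
    if score == 0:
        return "none"
    return "heavy" if score > heavy_cutoff else "some"


# Toy job with four rated tasks (made-up probabilities).
example_job = [0.9, 0.8, 0.7, 0.1]
score = gaisi(example_job)
band = exposure_band(score)
```

Because the heavy-exposure test is a strict inequality, a job with exactly half of its tasks exposed falls in the "some" band under this sketch.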
If this is right
- Scientific and technical professions carry the highest share of heavily exposed jobs.
- Aggregate exposure increased mainly through shifts of workers into different occupations rather than changes in tasks performed inside occupations.
- Wage premiums for tasks rated as AI-exposed fell by 12% between 2017 and 2023/24.
- After the release of ChatGPT, job postings contracted in more exposed occupations relative to less exposed ones.
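The between-versus-within claim in the list above can be illustrated with a standard shift-share decomposition. The sketch below is an assumption about how such a split might be done, not the paper's documented procedure: it writes aggregate exposure as the employment-share-weighted mean of occupation-level scores and splits the change into a part due to changing shares (between) and a part due to changing scores (within). All names and numbers are hypothetical.

```python
from typing import Dict, Tuple


def shift_share(
    shares_0: Dict[str, float],
    shares_1: Dict[str, float],
    score_0: Dict[str, float],
    score_1: Dict[str, float],
) -> Tuple[float, float]:
    """Split the change in share-weighted mean exposure into two parts.

    between: change in employment shares, weighted by the average score.
    within:  change in scores, weighted by the average employment share.
    The two parts sum exactly to the total change.
    Assumes every occupation appears in all four dictionaries.
    """
    between = 0.0
    within = 0.0
    for occ in shares_0:
        d_share = shares_1[occ] - shares_0[occ]
        d_score = score_1[occ] - score_0[occ]
        between += d_share * (score_0[occ] + score_1[occ]) / 2
        within += d_score * (shares_0[occ] + shares_1[occ]) / 2
    return between, within


# Toy economy: workers move toward occupation A, task content is unchanged.
shares_2017 = {"A": 0.5, "B": 0.5}
shares_2023 = {"A": 0.6, "B": 0.4}
scores_2017 = {"A": 0.6, "B": 0.2}
scores_2023 = {"A": 0.6, "B": 0.2}

between, within = shift_share(shares_2017, shares_2023, scores_2017, scores_2023)
```

In this toy case the whole change comes from the between term, which is the pattern the paper reports for 2017 to 2023/24. Dividing the total change by the standard deviation of the score would express it in the "share of a standard deviation" units used above.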
Where Pith is reading between the lines
- The index could serve as a monitoring tool for policymakers tracking how exposure evolves with successive improvements in model capabilities.
- If exposure remains concentrated in a narrow set of occupations, targeted training programs might focus on those roles rather than broad workforce retraining.
- Longer-term productivity effects would likely appear first in the scientific and technical sectors where GAISI scores are highest.
Load-bearing premise
That LLM judgments about whether a task can be sped up by 25% provide a valid stand-in for real labor-market exposure once matched to survey responses.
What would settle it
Direct measurement of time savings or output gains from LLM use in high-GAISI versus low-GAISI jobs, or continued divergence in hiring and wage trends between high- and low-exposure occupations.
Original abstract
Building on the task-based approach to labour markets, we develop the Generative AI Susceptibility Index (GAISI), a job-level measure of UK exposure to large language models (LLMs). Drawing on Eloundou et al. (2024), we use LLMs as probabilistic raters to classify task exposure, linking ratings to worker-reported task data from the British Skills and Employment Surveys. GAISI measures the share of job activities where LLMs can reduce task completion time by at least 25% beyond existing tools. Systematic validations demonstrate high reliability, strong validity, and predictive power over existing exposure measures. By 2023/24, nearly all UK jobs (94%) exhibited some LLM exposure, yet only 13% were heavily exposed (GAISI > 0.5), with the highest concentration in scientific and technical professions. Aggregate exposure rose 16% of one standard deviation since 2017, driven by occupational shifts rather than within-occupation task changes. The wage premium for AI-exposed tasks declined 12% between 2017 and 2023/24, and the period since ChatGPT's release has coincided with a relative contraction of job postings in more AI-exposed occupations. These findings are consistent with generative AI beginning to affect hiring and pay in exposed occupations, though causal attribution requires further research. GAISI offers policymakers and researchers a validated, replicable tool for monitoring AI exposure at the job level as this technology diffuses.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops the Generative AI Susceptibility Index (GAISI), a job-level measure of UK exposure to large language models. It uses LLMs as probabilistic raters to score tasks from the British Skills and Employment Surveys on whether they can be completed at least 25% faster than with existing tools, then aggregates these into an index of the share of job activities meeting the criterion. The paper reports that by 2023/24 nearly all UK jobs (94%) have some exposure while only 13% are heavily exposed (GAISI > 0.5), with highest concentration in scientific and technical professions. Aggregate exposure rose 16% of one standard deviation since 2017 due to occupational shifts, the wage premium for AI-exposed tasks fell 12%, and job postings contracted relatively in more exposed occupations.
Significance. If the LLM ratings validly proxy real exposure, the work supplies a replicable, UK-specific monitoring tool that extends task-based approaches to a new technology. The reported internal reliability, validity, and predictive-power checks are strengths, as is the linkage to worker-reported task data rather than purely occupational aggregates. The descriptive patterns (widespread but shallow exposure, recent shifts, and early labor-market signals) could usefully inform policy discussion, though the authors correctly note that causal attribution requires further research.
major comments (2)
- [Methods (LLM rating and aggregation)] Methods section on LLM rating procedure and aggregation: The central exposure percentages (94% some exposure, 13% GAISI > 0.5) are direct aggregates of the LLM probabilistic ratings. The manuscript should supply fuller detail on the exact prompting template, number of independent LLM evaluations per task, how probabilities are converted to binary exposure indicators, and any sensitivity tests around the 25% time-reduction threshold and the GAISI > 0.5 heavy-exposure cutoff. These choices are load-bearing for the headline numbers and the 16% SD rise since 2017.
- [Validation and robustness checks] Validation section: The paper states that systematic validations demonstrate high reliability, strong validity, and predictive power. However, these appear to be internal consistency and correlation checks. Because the index is interpreted as a measure of actual labor-market exposure, external validation against observed productivity gains, measured time savings, or human-expert benchmarks on a subset of tasks would materially strengthen the claim that LLM ratings serve as a valid proxy, especially given possible LLM biases in assessing tacit knowledge.
minor comments (2)
- [Abstract] Abstract: Define 'some LLM exposure' more explicitly in relation to the GAISI scale or any minimum threshold applied.
- [Results (wage premium and postings)] Results on wage premium and job postings: Provide the precise econometric specification (e.g., controls, fixed effects, sample) used to estimate the 12% decline and the relative contraction in postings.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments, which help clarify the presentation of our methods and strengthen the interpretation of our validation exercises. We address each major comment below and indicate the revisions we will make to the manuscript.
Point-by-point responses
- Referee: Methods section on LLM rating procedure and aggregation: The central exposure percentages (94% some exposure, 13% GAISI > 0.5) are direct aggregates of the LLM probabilistic ratings. The manuscript should supply fuller detail on the exact prompting template, number of independent LLM evaluations per task, how probabilities are converted to binary exposure indicators, and any sensitivity tests around the 25% time-reduction threshold and the GAISI > 0.5 heavy-exposure cutoff. These choices are load-bearing for the headline numbers and the 16% SD rise since 2017.
Authors: We agree that greater detail on the LLM rating and aggregation procedure is warranted to support replicability. In the revised manuscript we will expand the Methods section to include the precise prompting template, the number of independent LLM evaluations performed per task, the exact procedure for converting probabilistic ratings into binary exposure indicators, and sensitivity checks that vary both the 25% time-reduction threshold and the GAISI > 0.5 heavy-exposure cutoff. These additions will directly address the load-bearing nature of these choices for the reported exposure shares and the 2017–2023/24 change. revision: yes
- Referee: Validation section: The paper states that systematic validations demonstrate high reliability, strong validity, and predictive power. However, these appear to be internal consistency and correlation checks. Because the index is interpreted as a measure of actual labor-market exposure, external validation against observed productivity gains, measured time savings, or human-expert benchmarks on a subset of tasks would materially strengthen the claim that LLM ratings serve as a valid proxy, especially given possible LLM biases in assessing tacit knowledge.
Authors: Our validation exercises consist of internal consistency (inter-LLM reliability), convergent validity with prior exposure measures, and predictive tests against observed labor-market outcomes such as wages and job postings. We will revise the Validation section to describe these checks more explicitly and to acknowledge that they remain internal or correlational. We agree that direct external benchmarks—productivity gains, time-use diaries, or expert human ratings—would be desirable; however, such data are not available in the British Skills and Employment Surveys and would require new primary data collection outside the scope of the present study. In the revision we will add an explicit limitations paragraph discussing potential LLM biases in tacit-knowledge tasks and outline directions for future external validation. revision: partial
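The inter-LLM reliability check mentioned in the response could take several forms; the source does not say which statistic the authors use. As one hedged illustration, the sketch below computes Cohen's kappa between two raters' binary exposure flags for the same tasks. The choice of kappa, the function name, and the toy labels are assumptions.

```python
from typing import Sequence


def cohens_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Chance-corrected agreement between two raters on 0/1 labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of tasks")
    n = len(rater_a)
    # Observed agreement: share of tasks where both raters give the same flag.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if the raters labelled independently at their own rates.
    share_a = sum(rater_a) / n
    share_b = sum(rater_b) / n
    expected = share_a * share_b + (1 - share_a) * (1 - share_b)
    if expected == 1.0:
        return 1.0  # both raters gave every task the same single label
    return (observed - expected) / (1 - expected)


# Two hypothetical LLM raters flagging six tasks as exposed (1) or not (0).
llm_a = [1, 1, 0, 0, 1, 0]
llm_b = [1, 1, 0, 0, 0, 0]
kappa = cohens_kappa(llm_a, llm_b)
```

A statistic like this measures only whether raters agree with each other. It says nothing about whether they agree with measured time savings, which is the gap the referee's second comment points to.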
Circularity Check
No significant circularity in GAISI construction or exposure claims
The paper constructs the Generative AI Susceptibility Index (GAISI) by applying LLM probabilistic ratings—drawn from the method in Eloundou et al. (2024), an independent prior study—to task-level data from the British Skills and Employment Surveys. The headline exposure statistics (94% of jobs with some exposure, 13% with GAISI > 0.5) and trends (16% SD rise since 2017) are direct aggregates of these ratings rather than outputs of any fitted model, self-referential equation, or parameter estimated from the target exposure figures themselves. No step renames a known result, imports uniqueness via self-citation, or treats a fitted input as a prediction. Reported validations address reliability and predictive power as separate checks without reducing the core claims to the same inputs by construction. The derivation remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (2)
- 25% time reduction threshold
- GAISI > 0.5 cutoff for heavy exposure
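A sensitivity check on these two choices might look like the sketch below, which recomputes the heavy-exposure share over a small grid of cutoffs. It is an illustration under stated assumptions: the per-task probabilities are made up, and `task_cutoff` is a hypothetical rule for binarising a rating. Varying the 25% time-reduction threshold itself would require re-running the LLM ratings with a different prompt, which this sketch does not attempt.

```python
from typing import Dict, List, Sequence, Tuple


def job_score(task_probs: Sequence[float], task_cutoff: float) -> float:
    """Share of a job's tasks whose rated probability reaches task_cutoff."""
    return sum(p >= task_cutoff for p in task_probs) / len(task_probs)


def heavy_share(
    jobs: Sequence[Sequence[float]], task_cutoff: float, heavy_cutoff: float
) -> float:
    """Share of jobs whose score strictly exceeds heavy_cutoff."""
    scores = [job_score(tasks, task_cutoff) for tasks in jobs]
    return sum(s > heavy_cutoff for s in scores) / len(scores)


def sensitivity_grid(
    jobs: Sequence[Sequence[float]],
    task_cutoffs: Sequence[float],
    heavy_cutoffs: Sequence[float],
) -> Dict[Tuple[float, float], float]:
    """Heavy-exposure share for every (task_cutoff, heavy_cutoff) pair."""
    return {
        (tc, hc): heavy_share(jobs, tc, hc)
        for tc in task_cutoffs
        for hc in heavy_cutoffs
    }


# Four toy jobs, each a list of made-up per-task exposure probabilities.
toy_jobs: List[List[float]] = [
    [0.9, 0.8, 0.3],
    [0.6, 0.4, 0.2],
    [0.1, 0.1, 0.7],
    [0.55, 0.65, 0.9],
]
grid = sensitivity_grid(toy_jobs, task_cutoffs=[0.5, 0.7], heavy_cutoffs=[0.4, 0.5])
```

If the heavy-exposure share moves sharply across neighbouring cells of such a grid, the headline 13% figure would depend heavily on the chosen cutoffs.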
axioms (1)
- domain assumption: LLMs can serve as reliable probabilistic raters of task-level time savings beyond existing tools
invented entities (1)
- Generative AI Susceptibility Index (GAISI): no independent evidence
Forward citations
Cited by 2 Pith papers
- What Jobs Can AI Learn? Measuring Exposure by Reinforcement Learning
  A new RL Feasibility Index based on task learnability via reinforcement learning diverges from prior AI exposure measures, rating operational jobs like power plant operators as highly feasible while rating creative an...
- From Exposure to Adoption: Generative AI in European Workplaces
  Generative AI adoption in Europe ranges from under 3% to 25%, is steeper for skilled workers in abstract-task jobs and in digitally advanced countries with training, shows a gender gap in exposed roles, and has produc...
Reference graph
Works this paper leans on
- [1] Can Large Language Models Transform Computational Social Science? https://doi.org/10.1162/coli_a_00502

Appendix A: Classification Prompt (excerpt)
<System prompt> You are an AI job expert tasked with analyzing job tasks for their exposure to AI, specifically Large Language Models (LLMs). Your goal is to determine the probability of different levels of AI exposure for each task, considering the incremental impact of ...
- Analyze each task in the context of the occupation details provided
- Consider how an average worker in this occupation would typically perform each task using existing tools
- Assess how LLMs could potentially assist with each task, focusing on new capabilities beyond existing tools
- For EACH TASK, calculate the probability of it falling into each of the following AI exposure levels:
  - E0: No Exposure (LLMs do not meaningfully reduce time by 25% or more)
  - E1: Direct Exposure (LLMs alone can reduce time by at least 25%)
  - E2: Exposure via Imaginable LLM-Powered Applications (LLMs + additional software could reduce time by at least ...
- Provide a brief justification for each probability distribution. In your analysis for each task, include:
  a. Task summary in the context of the occupation
  b. Existing tools typically used for this task
  c. Potential LLM capabilities that could assist with the task
  d. Arguments for and against each exposure level
  First provide a high-level analysis of the e...