Who Benefits from AI? Self-Selection, Skill Gap, and the Hidden Costs of AI Feedback

Christoph Riedl; Eric Bogert

arXiv: 2409.18660 · v2 · submitted 2024-09-27 · econ.GN · cs.AI · cs.HC · q-fin.EC

Pith reviewed 2026-05-23 20:26 UTC · model grok-4.3

keywords: AI feedback · self-selection · skill gap · intellectual diversity · natural experiments · chess platform · learning outcomes · motivation

The pith

Motivated and higher-skilled users self-select into AI feedback, creating an illusion of effectiveness while widening skill gaps and reducing intellectual diversity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Users choose when to seek AI feedback, and those already motivated and higher-skilled seek it more often and apply it more productively. This selection creates an illusion that AI boosts learning, but the gains disappear once motivation is accounted for in the data. The same mechanism means AI access helps skilled users disproportionately, widening gaps between high- and low-skilled groups. Exposure to the same centralized AI source also causes users to converge on similar ideas, lowering overall intellectual diversity. The paper establishes the diversity drop as causal through 42 platform-level natural experiments on a chess site with over 52,000 users observed across five years.

Core claim

Motivated and higher-skilled individuals self-select into AI feedback use and use it more productively. This self-selection creates an illusion of AI effectiveness because apparent learning gains disappear once endogenous motivation is accounted for. The same selection mechanism widens the skill gap because higher-skilled users benefit disproportionately. Exposure to centralized AI feedback also causes intellectual diversity to decline, and this reduction is shown to be causal by leveraging 42 platform-level natural experiments.
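The selection argument can be illustrated with a toy simulation. The sketch below is not the authors' model or data: it assumes a binary, fully observed motivation flag, made-up usage probabilities (0.8 versus 0.2), and a true AI effect of exactly zero. All function names are hypothetical. Under those assumptions, a raw comparison of users and non-users shows an apparent gain, and the gain vanishes once the comparison is made within motivation levels.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Generate (motivated, uses_ai, gain) rows with NO true AI effect."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        motivated = rng.random() < 0.5
        # Assumption: motivated users seek AI feedback far more often.
        uses_ai = rng.random() < (0.8 if motivated else 0.2)
        # Learning gain depends on motivation only, never on AI use.
        gain = (1.0 if motivated else 0.0) + rng.gauss(0.0, 1.0)
        rows.append((motivated, uses_ai, gain))
    return rows


def naive_gap(rows):
    """Mean gain of AI users minus mean gain of non-users, no controls."""
    users = [g for _, u, g in rows if u]
    others = [g for _, u, g in rows if not u]
    return mean(users) - mean(others)


def adjusted_gap(rows):
    """Same comparison made within each motivation level, then averaged
    with weights proportional to stratum size."""
    weighted_sum = 0.0
    for level in (False, True):
        stratum = [(u, g) for m, u, g in rows if m == level]
        users = [g for u, g in stratum if u]
        others = [g for u, g in stratum if not u]
        weighted_sum += (mean(users) - mean(others)) * len(stratum)
    return weighted_sum / len(rows)


if __name__ == "__main__":
    data = simulate()
    print("naive gap:   ", round(naive_gap(data), 3))
    print("adjusted gap:", round(adjusted_gap(data), 3))
```

This review does not specify how the paper accounts for motivation; stratification is used here only because it is the simplest adjustment that makes the point visible.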

What carries the argument

Endogenous self-selection into AI feedback use, which correlates with pre-existing motivation and skill, together with platform natural experiments that identify convergence on centralized AI input.

If this is right

AI access widens the skill gap because motivated and higher-skilled individuals benefit disproportionately.
Individuals exposed to centralized AI feedback converge on common input, causing intellectual diversity to decline.
The diversity reduction is causal, as shown by the 42 platform-level natural experiments.
Self-selection connects individual learning dynamics to collective outcomes such as organizational learning and human capital development.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Designers of optional AI tools may need to address selection effects to prevent unintended widening of skill differences.
The pattern could appear in other optional AI settings such as education or workplace assistance where users decide whether to engage.
Replicating the natural experiment approach in non-chess domains would test whether the diversity convergence holds more broadly.

Load-bearing premise

The 42 platform-level natural experiments isolate the causal impact of exposure to centralized AI feedback on intellectual diversity without confounding from other simultaneous platform changes or unmeasured shifts in user behavior.

What would settle it

Observing that apparent learning gains from AI feedback remain after controlling for individual motivation levels in the chess platform data, or finding no decline in intellectual diversity after the 42 natural experiment exposures.
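The second test, a decline in diversity after an exposure event, can be sketched as a generic regression discontinuity in time: fit one linear trend to the days before the event and another to the days after, then compare the two fitted values at the event day. This is a minimal illustration under stated assumptions, not the paper's specification, which is truncated in this page's extract. The function names, the bandwidth parameter, and the synthetic series in the test are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rdit_jump(days, values, event_day, bandwidth):
    """Estimated discontinuity in `values` at `event_day`.

    Days are re-centred on the event, so each intercept is the fitted
    value at the cutoff. A negative result means the series dropped.
    """
    before = [(d - event_day, v) for d, v in zip(days, values)
              if -bandwidth <= d - event_day < 0]
    after = [(d - event_day, v) for d, v in zip(days, values)
             if 0 <= d - event_day <= bandwidth]
    if len(before) < 2 or len(after) < 2:
        raise ValueError("need at least two observations on each side")
    intercept_before, _ = fit_line(*zip(*before))
    intercept_after, _ = fit_line(*zip(*after))
    return intercept_after - intercept_before
```

A jump indistinguishable from zero across the events would count against the causal claim. A real analysis would also need standard errors and controls for concurrent platform changes, which this sketch omits.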

Figures

Figures reproduced from arXiv: 2409.18660 by Christoph Riedl, Eric Bogert.

**Figure 2.** Platform-level intellectual diversity decreases with more AI analysis experience. Analysis Approach. To establish that this population-level decrease in intellectual diversity (strategy use) is causally driven by AI feedback, we combine Regression Discontinuity in Time (RDiT; Hausman and Rapson, 2018) and natural experiments. First, we aggregate all chess openings played on the entire platform on a given da…

**Figure 3.** Analysis of natural experiments on platform-level intellectual diversity (coefficients from Table A4).
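The Figure 2 caption describes collapsing one day's openings into a single diversity number. The paper's exact formula is cut off in this extract, so the sketch below assumes one common reading: diversity as one minus the probability that two games drawn at random, with replacement, share the same opening (the Gini-Simpson index). The function name and the opening labels are hypothetical.

```python
from collections import Counter


def opening_diversity(openings):
    """Return 1 - sum(p_i ** 2) over the share p_i of each opening.

    0.0 means every game used the same opening. Values near 1.0 mean
    games are spread across many openings.
    """
    counts = Counter(openings)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no games supplied")
    # Probability that two random draws (with replacement) match.
    same_opening = sum((c / total) ** 2 for c in counts.values())
    return 1.0 - same_opening


if __name__ == "__main__":
    one_day = ["Sicilian", "Sicilian", "Queen's Gambit", "English"]
    print(opening_diversity(one_day))
```

Computing this once per day gives the kind of daily platform-level series that a discontinuity analysis can then be run on.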

Original abstract

Feedback from artificial intelligence (AI) is increasingly easy to access and research has already established that people learn from it. But individuals choose when and how to seek such feedback, and more engaged and motivated individuals may seek it more, creating an illusion of effectiveness that masks self-selection. We investigate how the endogenous choice to seek AI feedback shapes both individual learning and collective outcomes. Using data from over five years and 52,000 individuals on an online chess platform, we show that motivated and higher-skilled individuals self-select into AI feedback use and use it more productively. This self-selection creates an illusion of AI effectiveness: apparent learning gains disappear once endogenous motivation is accounted for. This same selection mechanism drives two population-level consequences. Because motivated, higher-skilled individuals benefit disproportionately, AI access widens the skill gap. And because individuals exposed to centralized AI feedback converge on common input from a centralized AI source, intellectual diversity declines. Leveraging 42 platform-level natural experiments, we show this diversity reduction is causal. Self-selection into AI use thus connects individual-level learning dynamics to collective-level consequences: a micro-macro linkage with implications for organizational learning, human capital development, and the design of AI-augmented work.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Self-selection into AI feedback explains away apparent learning gains on the chess platform and links to wider skill gaps plus lower diversity, but the natural experiments need more scrutiny on their isolation.

The main point is that motivated and higher-skilled chess players on this platform choose to use AI feedback more often and get more from it, which makes the tool look effective in raw comparisons. Once selection is accounted for, those gains shrink or vanish. The same pattern widens skill differences across users and reduces variety in play because everyone draws from one centralized AI source. The authors tie the diversity drop to 42 platform-level natural experiments over five years of data on 52,000 users.

Referee Report

1 major / 1 minor

Summary. The paper analyzes data from over 52,000 individuals on an online chess platform spanning five years. It claims that motivated and higher-skilled users self-select into seeking AI feedback and use it more productively; once endogenous motivation is accounted for, apparent learning gains disappear. This selection mechanism is argued to widen skill gaps at the population level and reduce intellectual diversity, with the diversity reduction established as causal via 42 platform-level natural experiments.

Significance. If the causal identification in the natural experiments is robust, the results would link individual self-selection dynamics to aggregate consequences of AI adoption, highlighting risks of skill divergence and idea homogenization. The large-scale observational design with natural experiments offers potential for credible evidence on these micro-macro linkages in human capital and organizational learning.

major comments (1)

[Abstract (and associated methods description of the natural experiments)] The central causal claim on intellectual diversity reduction rests on the 42 platform-level natural experiments referenced in the abstract. The provided description supplies no information on the timing of the experiments, the precise platform changes involved, the statistical models employed, controls for concurrent shocks, or identification assumptions such as parallel trends or no anticipation effects. This omission makes it impossible to evaluate whether the experiments isolate exposure to centralized AI feedback or whether observed convergence could be driven by other unmeasured factors.

minor comments (1)

[Abstract] The abstract states conclusions about self-selection, illusory gains, skill gaps, and causal diversity loss but supplies no information on statistical models, controls for motivation, or robustness checks, which would aid reader assessment even at the summary level.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful and constructive review. The comment correctly identifies insufficient detail on the natural experiments in the abstract and methods description. We address this below and will revise accordingly.

Referee: [Abstract (and associated methods description of the natural experiments)] The central causal claim on intellectual diversity reduction rests on the 42 platform-level natural experiments referenced in the abstract. The provided description supplies no information on the timing of the experiments, the precise platform changes involved, the statistical models employed, controls for concurrent shocks, or identification assumptions such as parallel trends or no anticipation effects. This omission makes it impossible to evaluate whether the experiments isolate exposure to centralized AI feedback or whether observed convergence could be driven by other unmeasured factors.

Authors: We agree that the current abstract and methods section provide insufficient detail on the 42 natural experiments to allow full evaluation of the identification strategy. In the revised manuscript we will expand the methods description to report: (i) the exact timing and duration of each experiment, (ii) the specific platform changes that generated the variation in exposure to centralized AI feedback, (iii) the econometric specifications (including fixed effects, clustering, and any difference-in-differences or event-study estimators), (iv) controls for concurrent platform-wide shocks, and (v) explicit discussion of the identifying assumptions (parallel trends, no anticipation, and exclusion restrictions). We will also add a dedicated appendix table summarizing these features for all 42 experiments. These additions will be placed in the main text rather than only the abstract.

Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical observational study with independent causal claims

The paper is an empirical analysis of chess platform data from 52,000 individuals over five years, using self-selection patterns and 42 platform-level natural experiments to link individual AI feedback use to skill gaps and diversity reduction. No derivations, equations, fitted parameters renamed as predictions, or self-citation chains appear in the abstract or described methods. The central claims rest on observable data patterns and natural experiment variation, which are externally falsifiable and not equivalent to inputs by construction. This is the standard case of a self-contained empirical paper with score 0.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no free parameters, axioms, or invented entities are identifiable. The work relies on standard econometric assumptions for correcting self-selection and identifying causal effects in natural experiments, but these are not specified.

Reference graph

Works this paper leans on

22 extracted references · 22 canonical work pages · 1 internal anchor

[1]

...understand downstream labor-market effects (Eloundou et al., 2023; Kim et al., 2024). Besides the question of a main effect of AI on skill development, a key question is whether high- or low-skilled workers and decision-makers benefit more from AI and whether, as a result, AI will increase or decrease the existing skill gap (and as a result related outcomes l...

[2]

Learning from AI may itself depend on relevant prior experience (Brynjolfsson et al., 2023)

and request AI input in the right situations (see section above on feedback seeking). Learning from AI may itself depend on relevant prior experience (Brynjolfsson et al., 2023). Understanding when and how to best use AI systems may constitute an important skill itself. For example, in a study of chess players, Bouacida et al. (2024) find that lower-skill...

[3]

outcome homogenization

and may undermine firms’ competitive advantage (Felin and Holweg, 2024). Such a homogenization risk seems especially probable in the context of generic and centralized AI systems where many individuals (both within and across firms) are exposed to generic and non-specific AI output. Access to the same feedback from a single centralized AI system (such as ...

[4]

Chess has been a focal point of AI research for decades (Shannon, 1950; Turing, 1953)

Setting and Data We use chess as a sample research domain of strategic decision-making in which AI is already widely adopted and the superhuman skill of AI is well known. Chess has been a focal point of AI research for decades (Shannon, 1950; Turing, 1953). Chess has been used to study cognitive performance over human life spans (Strittmatter, Sunde, and Zegners,

[5]

Our data come from lichess.org, a popular and free online chess platform

the effect of masks on performance (Smerdon, 2022), gender differences in competition (De Sousa and Hollard, 2022), personal bests as reference points (Anderson and Green, 2018), the joint effect of intelligence and practice on skill development (Vaci et al., 2019), and in hundreds of other studies on cognition, strategy, and artificial intelligence. Our ...

[6]

correspondence

version is Stockfish 16, and has a chess skill (called Elo) of 3,641 (see more on the chess Elo skill measure further below). For comparison, if Magnus Carlsen, widely regarded as the best player in history, played against this AI at his peak rating (Elo = 2,882), he would be expected to win just 12 out of 1,000 games. The AI feedback informs players of t...

[7]

We include a control variable $\mathit{Tenure}_{it}$ and its squared term, which captures the number of years an individual i has been active on the platform at time t

important control variable (the Section Defining Control Variables in the Appendix provides detailed explanation of each measure). We include a control variable $\mathit{Tenure}_{it}$ and its squared term, which captures the number of years an individual i has been active on the platform at time t. This allows us to distinguish between learning from AI Feedback and ...

[8]

Standard errors clustered on the individual level in parentheses

Effect of prior AI feedback on current performance. Standard errors clustered on the individual level in parentheses. 4.3 Heterogeneous Treatment by Skill To further investigate whether higher- or lower-skilled decision-makers benefit from AI feedback more than high skill players, we use a Generalized Random Forest (GRF), a non-parametric method that esti...

[9]

chess openings

Strength of the conditional average treatment effect of learning from AI feedback across a range of skill levels. 4.4 AI Feedback Leads to Specialization We propose that specialization is one contributing mechanism behind how decision-makers learn from AI feedback. To explore specialization as a potential mechanism, we focus on the opening moves of a ches...

[10]

=𝛽1𝐴𝑓𝑡𝑒𝑟(

and natural experiments. First, we aggregate all chess openings played on the entire platform on a given day into a diversity metric (1 − Gini). This metric captures how likely it is that two games played on the same day and drawn at random use the same... [Figure 2 plot: Diversity of Chess Openings (1 − Gini) against AI Feedback; ρp = −0.59]

[11]

great equalizer

Analysis of natural experiments on platform-level intellectual diversity (coefficients from Table A4). Discussion Our analysis shows that decision-makers can learn from AI feedback. However, it also shows that when left to their own devices, decision-makers often use AI in the wrong way: They prefer to seek feedback after experiencing success rather than ...

[12]

Thus, our study supports the body of work suggesting that AI acts as a complement to high skill (Acemoglu and Autor, 2011; Autor, 2024; Autor et al.,

reducing the difference between high and low skilled individuals, we find that this is not what happens in practice (at least in our setting). Thus, our study supports the body of work suggesting that AI acts as a complement to high skill (Acemoglu and Autor, 2011; Autor, 2024; Autor et al.,

[13]

and may thus amplify the existing skill gap. Our focus on the longer-term outcome of learning may also explain why there may be negative long-term effects of AI on inequality despite no apparent negative effects in the short run (Alderucci et al., 2024). We also contribute to research on the role of AI adoption on competitive capabilities. Our multi-level...

[14]

but also by affecting the intellectual diversity of their human users. Reduction in strategy diversity may prove especially detrimental since organizational environments have become more global, dynamic, and competitive, thus increasing demand for flexibility and innovation (Smith and Lewis, 2011). Prior theorizing on human-AI collaboration has suggest...

[15]

more human than human

this pattern may be reversed. As an entire population of AI users is exposed to feedback from the same, centralized AI system, human decision-makers may become more homogeneous in their decision strategies and intellectual diversity decreases. This may undermine an organization's ability to flexibly deal with diverse situations, reduce its problem-solving...

[16]

Strategic Management Journal 43(10): 2066–2100

Optimal distinctiveness across revenue models: Performance effects of differentiation of paid and free products in a mobile app market. Strategic Management Journal 43(10): 2066–2100. Angrist JD, Pischke J-S

[17]

Strategic Management Journal 37(10): 2031–2049

Mental representation and the discovery of new strategies. Strategic Management Journal 37(10): 2031–2049. Csaszar FA, Steinberger T

[18]

doi:10.48550/arXiv.2303.10130 , url =

GPTs are GPTs: An early look at the labor market impact potential of large language models. arXiv preprint arXiv:2303.10130. Fazelpour S, De-Arteaga M

[19]

An Overview of Catastrophic AI Risks

An overview of catastrophic ai risks. arXiv preprint arXiv:2306.12001. Hooper D, Whyld K

[20]

University of Chicago, Becker Friedman Institute for Economics Working Paper (2024–50)

The Adoption of ChatGPT. University of Chicago, Becker Friedman Institute for Economics Working Paper (2024–50). Joseph J, Sengul M

[21]

arXiv preprint arXiv:2403.16812

Towards human-AI deliberation: Design and evaluation of LLM-empowered deliberative ai for AI-assisted decision-making. arXiv preprint arXiv:2403.16812. Mack O, Khare A, Krämer A, Burgartz T

[22]

Journal of Human Resources 50(2): 420–445

Control function methods in applied econometrics. Journal of Human Resources 50(2): 420–445. Appendix: AI Feedback on the Lichess Platform. The focal variable in our study is how often individuals seek AI to analyze their games. The example below shows the output of what the AI reports when it analyzes a game between two of the best players on Lichess. W...
