The Impact of AI Usage and Informativeness on Skill Development in Logical Reasoning

Catarina Belem; Hongyu Yao; Mark Steyvers; Padhraic Smyth; Shang Wu; Shuyuan Fu

arxiv: 2605.21695 · v1 · pith:VPU4WW6Inew · submitted 2026-05-20 · 💻 cs.AI · cs.HC


Pith reviewed 2026-05-22 09:03 UTC · model grok-4.3


keywords AI usage · skill development · logical reasoning · AI informativeness · human-AI interaction · learning outcomes · cognitive skills · AI assistance


The pith

Heavy AI use in logical reasoning tasks weakens skill development once assistance is removed.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests how AI usage levels and the informativeness of AI responses affect skill growth in a controlled logical reasoning setting. After AI help is withdrawn, heavy users perform worse than matched participants who had no AI, while light users perform on par with their matched counterparts. Low-information AI fails to help in the moment or afterward and is tied to poorer overall learning, while high-information AI can lift immediate results without an average long-term loss, though effects vary by person. A reader would care because AI is entering education and work, so knowing when it builds versus replaces human reasoning matters for how we deploy it.

Core claim

In experiments with on-demand AI during logical reasoning, greater AI usage correlated with weaker skill development after removal: heavy users underperformed comparable peers while light users matched those with no AI. These patterns were mediated by AI informativeness. Low-information AI improved neither immediate nor post-removal performance and was linked to weaker learning overall. High-information AI raised short-run performance without lowering average post-AI outcomes but showed heterogeneous effects.

What carries the argument

Mediation through AI usage intensity and informativeness: the contrast between heavy versus light use and between high- versus low-information AI outputs determines whether assistance amplifies or substitutes for independent reasoning.

If this is right

High-informativeness AI can support immediate gains while preserving long-term skill levels on average.
Light AI usage leaves post-assistance performance comparable to that of matched participants with no AI access.
Heavy reliance on low-information AI can substitute for reasoning and reduce independent skill growth.
Regulating AI availability in learning contexts may be needed to avoid undermining skill development.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same usage-and-informativeness pattern could be tested in math problem-solving or coding tasks to check domain generality.
Heterogeneous effects under high-information AI suggest room to study which users benefit most and design targeted prompts.
Training people to treat AI as a reasoning partner rather than an answer source might shift heavy users toward lighter, more beneficial patterns.

Load-bearing premise

That differences in how much and what kind of AI people choose can be separated from their starting reasoning ability or motivation so performance gaps can be credited to usage rather than who selected what.

What would settle it

A follow-up experiment that randomly assigns fixed AI-usage quotas and still finds no post-removal performance gap between heavy and light users after initial-ability matching would undermine the central claim.

Figures

Figures reproduced from arXiv: 2605.21695 by Catarina Belem, Hongyu Yao, Mark Steyvers, Padhraic Smyth, Shang Wu, Shuyuan Fu.

**Figure 1.** Study design with three phases: in Phase 1 (pre-AI) and Phase 3 (post-AI), human performance was assessed without any assistance. In Phase 2, participants were assigned to either a control group (no AI) or a treatment group with optional AI assistance that varies in AI informativeness.

**Figure 2.** Examples of user interfaces for the logic-based puzzle task.

**Figure 3.** Distribution of individual changes in Reward Rate (correct objects per minute), defined as Phase 3 − Phase 1. The black dashed line marks the zero-change reference, and the colored dashed line denotes the sample mean.

**Figure 4.** Propensity score matching (PSM): average correctness and response time (with standard error bars) for light and heavy AI usage groups versus matched control groups in Phase 1 (pre-AI) and Phase 3 (post-AI).

**Figure 5.** Reward rate trajectories across experimental phases by AI informativeness. The x-axis denotes Phase 1, Phase 2–first half (2-1), Phase 2–second half (2-2), and Phase 3. Error bars represent standard errors.

**Figure 6.** Post-AI average reward rates by initial ability level across different AI informativeness conditions. Error bars show 90% confidence intervals; dashed lines show post-AI average reward rates for high- and low-ability groups in the no-AI condition.

**Figure 7.** Phase 2 average problem-level correctness by AI informativeness and usage. Error bars represent standard errors.

**Figure 8.** Reward rate trajectories across experimental phases by AI informativeness and initial ability. The x-axis denotes Phase 1, Phase 2–first half (2-1), Phase 2–second half (2-2), and Phase 3. Error bars represent standard errors.
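The Reward Rate measure used in Figures 3, 5, 6, and 8 (correct objects per minute, with individual change defined as Phase 3 − Phase 1) can be illustrated with a minimal sketch. The participant IDs, counts, and durations below are hypothetical values chosen for illustration, and the function names are assumptions; none of this comes from the paper's data or code.

```python
from statistics import mean


def reward_rate(correct_objects: int, minutes: float) -> float:
    """Correct objects per minute for one participant in one phase."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return correct_objects / minutes


def reward_rate_change(phase1, phase3) -> float:
    """Phase 3 minus Phase 1 reward rate.

    Each phase is a (correct_objects, minutes) tuple.
    """
    return reward_rate(*phase3) - reward_rate(*phase1)


# Hypothetical participants: (Phase 1, Phase 3), each as (correct objects, minutes).
participants = {
    "p01": ((12, 10.0), (18, 10.0)),  # improved
    "p02": ((15, 10.0), (12, 10.0)),  # declined
    "p03": ((9, 10.0), (9, 10.0)),    # unchanged
}

# Per-participant change (the quantity whose distribution Figure 3 plots).
changes = {
    pid: reward_rate_change(p1, p3) for pid, (p1, p3) in participants.items()
}

# Sample mean of the changes (the colored dashed line in Figure 3).
sample_mean = mean(changes.values())

if __name__ == "__main__":
    for pid, delta in changes.items():
        print(f"{pid}: change in reward rate = {delta:+.2f}")
    print(f"sample mean = {sample_mean:+.2f}")
```

A positive change means the participant placed more objects correctly per minute after the AI phase than before it; a value at zero corresponds to the black dashed reference line.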


Artificial intelligence (AI) is being increasingly integrated into human problem-solving, yet its effects on individual skill development remain unclear. We examine how both AI usage and informativeness can shape learning in the context of a controlled logical reasoning task with on-demand access to AI assistance. We find that greater AI usage is associated with weaker skill development: heavy AI users underperform relative to comparable peers, whereas light AI users perform similarly to matched users who do not use AI. We also find in our study that these patterns are mediated by AI informativeness. Low-information AI neither improves immediate performance nor preserves performance after AI assistance is removed, and is linked to weaker learning overall. On the other hand, high-information AI was found to improve short-run performance without reducing post-AI outcomes on average in our experiments, but with heterogeneous effects. Our findings in general suggest that AI can, depending on context, either complement human skill development by amplifying independent reasoning or can act as a substitute that undermines such reasoning, with the implication that regulating AI access and usage will be important for promoting skill development in the presence of AI assistance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript reports results from a controlled logical reasoning experiment with on-demand AI access. It claims that greater AI usage is associated with weaker skill development after AI removal: heavy users underperform relative to matched comparable peers, while light users perform similarly to non-users. These patterns are mediated by AI informativeness, with low-information AI linked to no immediate gains and weaker overall learning, and high-information AI improving short-run performance without average reductions in post-AI outcomes (with noted heterogeneity). The authors conclude that AI can either complement or substitute for independent reasoning depending on usage and informativeness.

Significance. If the identification strategy successfully isolates usage effects from baseline ability and motivation, the findings would be significant for understanding when AI assistance supports versus undermines cognitive skill development. The mediation analysis by informativeness and the distinction between heavy/light usage add nuance beyond simple usage-volume claims. The study also offers falsifiable predictions about post-AI performance that could be tested in follow-up work.

major comments (1)

The central causal interpretation—that heavy AI usage weakens skill development relative to comparable peers—rests on the claim that usage intensity can be isolated from pre-existing differences in reasoning aptitude or motivation. The abstract invokes 'comparable peers' and 'matched users' but supplies no information on sample size, baseline measures, matching procedure, or regression controls. This is load-bearing: without these details, observed post-task gaps could reflect selection rather than usage effects, and the same concern applies to the informativeness mediation (which is only observed among users who actually query the system).

minor comments (2)

The abstract states that high-information AI 'was found to improve short-run performance without reducing post-AI outcomes on average' but does not report effect sizes, confidence intervals, or the exact definition of 'on average' versus heterogeneous effects.
Notation for 'AI informativeness' and 'skill development' should be defined explicitly in the methods section with reference to the specific logical reasoning measures used.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their detailed and constructive report. We address the major comment on the identification of usage effects below, and we plan to incorporate clarifications in the revised manuscript.


Referee: The central causal interpretation—that heavy AI usage weakens skill development relative to comparable peers—rests on the claim that usage intensity can be isolated from pre-existing differences in reasoning aptitude or motivation. The abstract invokes 'comparable peers' and 'matched users' but supplies no information on sample size, baseline measures, matching procedure, or regression controls. This is load-bearing: without these details, observed post-task gaps could reflect selection rather than usage effects, and the same concern applies to the informativeness mediation (which is only observed among users who actually query the system).

Authors: We agree that the abstract does not provide sufficient detail on these methodological aspects, which are important for evaluating the causal claims. The full paper includes a description of the sample size and experimental procedure in the Methods section. Baseline measures of reasoning aptitude were collected via a pre-experiment test, and we include these as controls in our main regressions. To address selection into usage intensity, we use a matching procedure based on baseline aptitude, self-reported motivation, and other observables to compare heavy users to similar light or non-users. We will revise the manuscript to explicitly summarize these details in the abstract and to add a subsection on the matching method and its assumptions. We will also expand the discussion of the informativeness mediation to note that it is estimated conditional on AI queries being made and to include additional analyses addressing potential selection into querying. revision: yes
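The matching idea discussed in this exchange can be sketched in a few lines. This is a deliberately simplified illustration, not the authors' procedure: it performs greedy 1:1 nearest-neighbor matching without replacement directly on a single baseline score, whereas propensity score matching would first estimate each participant's probability of being a heavy user from several covariates and match on that estimate. All IDs, scores, and function names are hypothetical.

```python
from statistics import mean


def match_to_controls(treated, controls, caliper=None):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dicts mapping participant id -> baseline score.
    caliper: optional maximum allowed baseline distance for a match.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # copy so the caller's dict is untouched
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control on the baseline score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue  # no acceptable match; leave this participant unmatched
        pairs.append((t_id, c_id))
        del available[c_id]
    return pairs


def matched_gap(pairs, outcome):
    """Mean post-AI outcome difference (treated minus matched control)."""
    return mean(outcome[t] - outcome[c] for t, c in pairs)


# Hypothetical baseline (Phase 1) scores.
heavy_users = {"h1": 0.50, "h2": 0.80}
non_users = {"c1": 0.52, "c2": 0.79, "c3": 0.30}

# Hypothetical post-AI (Phase 3) scores.
phase3 = {"h1": 0.55, "h2": 0.70, "c1": 0.65, "c2": 0.85, "c3": 0.40}

pairs = match_to_controls(heavy_users, non_users, caliper=0.05)
gap = matched_gap(pairs, phase3)

if __name__ == "__main__":
    print(pairs)
    print(f"matched post-AI gap = {gap:+.3f}")
```

A negative gap in this toy data means the hypothetical heavy users scored lower in Phase 3 than controls with similar Phase 1 scores. As the referee notes, matching on observables cannot rule out selection on unobserved traits such as motivation.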

Circularity Check

0 steps flagged

Empirical study reports observed associations with no derivation chain or fitted predictions


This is a controlled empirical study that measures participant-chosen AI usage intensity and informativeness during a logical reasoning task, then reports post-task performance associations relative to matched peers. No equations, first-principles derivations, or model parameters are presented as predictions that could reduce to the inputs by construction. Claims rest on direct experimental observations and comparisons rather than self-referential definitions or self-citation chains that bear the load of the central result. The design is therefore self-contained against external benchmarks and exhibits no circularity of the enumerated kinds.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract provides no explicit free parameters, axioms, or invented entities; the central claims rest on standard experimental assumptions such as random assignment or statistical controls for individual differences, which are not detailed here.

pith-pipeline@v0.9.0 · 5738 in / 1097 out tokens · 34328 ms · 2026-05-22T09:03:34.714022+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "We find that greater AI usage is associated with weaker skill development: heavy AI users underperform relative to comparable peers..."

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "propensity score matching (PSM) ... matching Light and Heavy users separately to Zero-usage participants based on Phase 1 performance"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages

[1] Noy S, Zhang W. Experimental evidence on the productivity effects of generative artificial intelligence. Science. 2023;381(6654):187-92

[2] Kasneci E, Seßler K, Küchemann S, Bannert M, Dementieva D, Fischer F, et al. ChatGPT for good? On opportunities and challenges of large language models for education. Learning and individual differences. 2023;103:102274

[3] Peng S, Kalliamvakou E, Cihon P, Demirer M. The impact of AI on developer productivity: Evidence from Github Copilot. arXiv preprint arXiv:230206590. 2023

[4] Vaithilingam P, Zhang T, Glassman EL. Expectation vs. experience: Evaluating the usability of code generation tools powered by large language models. In: CHI conference on human factors in computing systems extended abstracts; 2022. p. 1-7

[5] Kalra N, Paddock SM. Driving to safety: How many miles of driving would it take to demonstrate autonomous vehicle reliability? Transportation research part A: policy and practice. 2016;94:182-93

[6] Budzyń K, Romańczyk M, Kitala D, Kołodziej P, Bugajski M, Adami HO, et al. Endoscopist deskilling risk after exposure to artificial intelligence in colonoscopy: a multicentre, observational study. The Lancet Gastroenterology & Hepatology. 2025

[7] Steyvers M, Tejeda H, Kerrigan G, Smyth P. Bayesian modeling of human–AI complementarity. Proceedings of the National Academy of Sciences. 2022;119(11):e2111547119

[8] Wilder B, Horvitz E, Kamar E. Learning to complement humans. arXiv preprint arXiv:200500582. 2020

[9] Fügener A, Walzner DD, Gupta A. Roles of artificial intelligence in collaboration with humans: Automation, augmentation, and the future of work. Management Science. 2026;72(1):538-57

[10] Gerlich M. AI tools in society: Impacts on cognitive offloading and the future of critical thinking. Societies. 2025;15(1):6

[11] Macnamara BN, Berber I, Çavuşoğlu MC, Krupinski EA, Nallapareddy N, Nelson NE, et al. Does using artificial intelligence assistance accelerate skill decay and hinder skill development without performers’ awareness? Cognitive Research: Principles and Implications. 2024;9(1):46

[12] Zhang Y, Liao QV, Bellamy RK. Effect of confidence and explanation on accuracy and trust calibration in AI-assisted decision making. In: Proceedings of the 2020 conference on fairness, accountability, and transparency; 2020. p. 295-305

[13] Senoner J, Schallmoser S, Kratzwald B, Feuerriegel S, Netland T. Explainable AI improves task performance in human–AI collaboration. Scientific reports. 2024;14(1):31150

[14] Buçinca Z, Malaya MB, Gajos KZ. To trust or to think: cognitive forcing functions can reduce overreliance on AI in AI-assisted decision-making. Proceedings of the ACM on Human-computer Interaction. 2021;5(CSCW1):1-21

[15] de Jong S, Paananen V, Tag B, van Berkel N. Cognitive forcing for better decision-making: reducing overreliance on AI systems through partial explanations. Proceedings of the ACM on Human-Computer Interaction. 2025;9(2):1-30

[16] Lai V, Tan C. On human predictions with explanations and predictions of machine learning models: A case study on deception detection. In: Proceedings of the conference on fairness, accountability, and transparency; 2019. p. 29-38

[17] Brynjolfsson E, Li D, Raymond L. Generative AI at work. The Quarterly Journal of Economics. 2025;140(2):889-942

[18] Vygotsky LS. Mind in society: The development of higher psychological processes. vol. 86. Harvard university press; 1978

[19] Yan L, Martinez-Maldonado R, Jin Y, Echeverria V, Milesi M, Fan J, et al. The effects of generative AI agents and scaffolding on enhancing students’ comprehension of visual learning analytics. Computers & Education. 2025:105322

[20] Gajos KZ, Mamykina L. Do people engage cognitively with AI? Impact of AI assistance on incidental learning. In: Proceedings of the 27th International Conference on Intelligent User Interfaces; 2022. p. 794-806

[21] Poulidis S, Ge H, Bastani H, Bastani O. Action vs. attention signals for human-AI collaboration: Evidence from chess. The Wharton School Research Paper. 2025

[22] Kosmyna N, Hauptmann E, Yuan YT, Situ J, Liao XH, Beresnitzky AV, et al. Your brain on ChatGPT: Accumulation of cognitive debt when using an AI assistant for essay writing task. arXiv preprint arXiv:250608872. 2025

[23] Shen JH, Tamkin A. How AI impacts skill formation. arXiv preprint arXiv:260120245. 2026

[24] Karny S, Mayer LW, Ayoub J, Song M, Su H, Tian D, et al. Learning with AI assistance: A path to better task performance or dependence? In: Proceedings of the ACM Collective Intelligence Conference; 2024. p. 10-7

[25] Dimitrov DM, Rumrill PDJ. Pretest-posttest designs and measurement of change. Work. 2003;20(2):159-65

[26] Gu F, Wongkamjan W, Boyd-Graber JL, Kummerfeld JK, Peskoff D, May J. Personalized help for optimizing low-skilled users’ strategy. In: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers); 2025. p. 65-74

[27] Swaroop S, Buçinca Z, Gajos KZ, Doshi-Velez F. Accuracy-time tradeoffs in AI-assisted decision making under time pressure. In: Proceedings of the 29th International Conference on Intelligent User Interfaces; 2024. p. 138-54

[28] Cao S, Gomez C, Huang CM. How time pressure in different phases of decision-making influences human-AI collaboration. Proceedings of the ACM on Human-computer Interaction. 2023;7(CSCW2):1-26

[29] Fogliato R, Chappidi S, Lungren M, Fisher P, Wilson D, Fitzke M, et al. Who goes first? Influences of human-AI workflow on decision making in clinical imaging. In: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency; 2022. p. 1362-74

[30] Lee HP, Sarkar A, Tankelevitch L, Drosos I, Rintel S, Banks R, et al. The impact of generative AI on critical thinking: Self-reported reductions in cognitive effort and confidence effects from a survey of knowledge workers. In: Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems; 2025. p. 1-22

[31] Joinson A. Social desirability, anonymity, and Internet-based questionnaires. Behavior research methods, instruments, & computers. 1999;31(3):433-8

[32] Vasconcelos H, Jörke M, Grunde-McLaughlin M, Gerstenberg T, Bernstein MS, Krishna R. Explanations can reduce overreliance on AI systems during decision-making. Proceedings of the ACM on Human-Computer Interaction. 2023;7(CSCW1):1-38

[33] Tejeda Lemus H, Kumar A, Steyvers M. How displaying AI confidence affects reliance and hybrid human-AI performance. In: HHAI 2023: Augmenting Human Intellect. IOS Press; 2023. p. 234-42

[34] Kahr PK, Rooks G, Snijders C, Willemsen MC. The trust recovery journey. The effect of timing of errors on the willingness to follow AI advice. In: Proceedings of the 29th International Conference on Intelligent User Interfaces; 2024. p. 609-22

[35] Panigutti C, Beretta A, Giannotti F, Pedreschi D. Understanding the impact of explanations on advice-taking: a user study for AI-based clinical Decision Support Systems. In: Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems; 2022. p. 1-9

[36] Srivastava DK, Lilly JM, Feigh KM. Improving human situation awareness in AI-advised decision making. In: 2022 IEEE 3rd International Conference on Human-Machine Systems (ICHMS). IEEE

[37] Cao S, Liu A, Huang CM. Designing for appropriate reliance: The roles of AI uncertainty presentation, initial user decision, and user demographics in AI-assisted decision-making. Proceedings of the ACM on Human-Computer Interaction. 2024;8(CSCW1):1-32

[38] Eisbach S, Langer M, Hertel G. Optimizing human-AI collaboration: Effects of motivation and accuracy information in AI-supported decision-making. Computers in Human Behavior: Artificial Humans. 2023;1(2):100015

[39] Standage D, Wang DH, Heitz RP, Simen P. Toward a unified view of the speed-accuracy trade-off. Frontiers in Neuroscience. 2015;9:139

[40] Austin PC. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research. 2011;46(3):399-424

[41] Wecks JO, Voshaar J, Plate BJ, Zimmermann J. Generative AI usage and exam performance. arXiv preprint arXiv:240419699. 2024

[42] Rafner JF, Dellermann D, Hjorth A, Verasztó D, Kampf CE, Mackay W, et al. Deskilling, upskilling, and reskilling: a case for hybrid intelligence. Morals & Machines. 2021;1(2):24-39

[43] Natali C, Marconi L, Dias Duran LD, Cabitza F. AI-induced deskilling in medicine: a mixed-method review and research agenda for healthcare and beyond. Artificial Intelligence Review. 2025;58(11):356

[44] Ma Q, Zhou Y, Kaushik S, Joshi A, Majumdar A, Apthorpe N, et al. Learning password best practices through in-task instruction. arXiv preprint arXiv:260106650. 2026

[45] Tran K, Gao G, Lombard A, Yu T, Jiang H, Yeh TY. Pacing for mastery: Optimizing LLM interactions for learning. In: Proceedings of the 57th ACM Technical Symposium on Computer Science Education V. 1; 2026. p. 1068-74

[1] [1]

Experimental evidence on the productivity effects of generative artificial intelligence

Noy S, Zhang W. Experimental evidence on the productivity effects of generative artificial intelligence. Science. 2023;381(6654):187-92

work page 2023

[2] [2]

ChatGPT for good? On opportunities and challenges of large language models for education

Kasneci E, Seßler K, K ¨uchemann S, Bannert M, Dementieva D, Fischer F, et al. ChatGPT for good? On opportunities and challenges of large language models for education. Learning and individual dif- ferences. 2023;103:102274

work page 2023

[3] [3]

The impact of AI on developer productivity: Evidence from Github Copilot

Peng S, Kalliamvakou E, Cihon P, Demirer M. The impact of AI on developer productivity: Evidence from Github Copilot. arXiv preprint arXiv:230206590. 2023

work page 2023

[4] [4]

Expectation vs

Vaithilingam P, Zhang T, Glassman EL. Expectation vs. experience: Evaluating the usability of code generation tools powered by large language models. In: CHI conference on human factors in computing systems extended abstracts; 2022. p. 1-7

work page 2022

[5] [5]

Driving to safety: How many miles of driving would it take to demonstrate autonomous vehicle reliability? Transportation research part A: policy and practice

Kalra N, Paddock SM. Driving to safety: How many miles of driving would it take to demonstrate autonomous vehicle reliability? Transportation research part A: policy and practice. 2016;94:182-93

work page 2016

[6] [6]

Endoscopist deskilling risk after exposure to artificial intelligence in colonoscopy: a multicentre, observational study

Budzy ´n K, Roma´nczyk M, Kitala D, Kołodziej P, Bugajski M, Adami HO, et al. Endoscopist deskilling risk after exposure to artificial intelligence in colonoscopy: a multicentre, observational study. The Lancet Gastroenterology & Hepatology. 2025

work page 2025

[7] [7]

Bayesian modeling of human–AI complementarity

Steyvers M, Tejeda H, Kerrigan G, Smyth P. Bayesian modeling of human–AI complementarity. Pro- ceedings of the National Academy of Sciences. 2022;119(11):e2111547119

work page 2022

[8] [8]

Learning to complement humans

Wilder B, Horvitz E, Kamar E. Learning to complement humans. arXiv preprint arXiv:200500582. 2020

work page 2020

[9] [9]

Roles of artificial intelligence in collaboration with humans: Au- tomation, augmentation, and the future of work

F ¨ugener A, Walzner DD, Gupta A. Roles of artificial intelligence in collaboration with humans: Au- tomation, augmentation, and the future of work. Management Science. 2026;72(1):538-57

work page 2026

[10] Gerlich M. AI tools in society: Impacts on cognitive offloading and the future of critical thinking. Societies. 2025;15(1):6

[11] Macnamara BN, Berber I, Çavuşoğlu MC, Krupinski EA, Nallapareddy N, Nelson NE, et al. Does using artificial intelligence assistance accelerate skill decay and hinder skill development without performers’ awareness? Cognitive Research: Principles and Implications. 2024;9(1):46

[12] Zhang Y, Liao QV, Bellamy RK. Effect of confidence and explanation on accuracy and trust calibration in AI-assisted decision making. In: Proceedings of the 2020 conference on fairness, accountability, and transparency; 2020. p. 295-305

[13] Senoner J, Schallmoser S, Kratzwald B, Feuerriegel S, Netland T. Explainable AI improves task performance in human–AI collaboration. Scientific reports. 2024;14(1):31150

[14] Buçinca Z, Malaya MB, Gajos KZ. To trust or to think: cognitive forcing functions can reduce overreliance on AI in AI-assisted decision-making. Proceedings of the ACM on Human-computer Interaction. 2021;5(CSCW1):1-21

[15] de Jong S, Paananen V, Tag B, van Berkel N. Cognitive forcing for better decision-making: reducing overreliance on AI systems through partial explanations. Proceedings of the ACM on Human-Computer Interaction. 2025;9(2):1-30

[16] Lai V, Tan C. On human predictions with explanations and predictions of machine learning models: A case study on deception detection. In: Proceedings of the conference on fairness, accountability, and transparency; 2019. p. 29-38

[17] Brynjolfsson E, Li D, Raymond L. Generative AI at work. The Quarterly Journal of Economics. 2025;140(2):889-942

[18] Vygotsky LS. Mind in society: The development of higher psychological processes. vol. 86. Harvard university press; 1978

[19] Yan L, Martinez-Maldonado R, Jin Y, Echeverria V, Milesi M, Fan J, et al. The effects of generative AI agents and scaffolding on enhancing students’ comprehension of visual learning analytics. Computers & Education. 2025:105322

[20] Gajos KZ, Mamykina L. Do people engage cognitively with AI? Impact of AI assistance on incidental learning. In: Proceedings of the 27th International Conference on Intelligent User Interfaces; 2022. p. 794-806

[21] Poulidis S, Ge H, Bastani H, Bastani O. Action vs. attention signals for human-AI collaboration: Evidence from chess. The Wharton School Research Paper. 2025

[22] Kosmyna N, Hauptmann E, Yuan YT, Situ J, Liao XH, Beresnitzky AV, et al. Your brain on ChatGPT: Accumulation of cognitive debt when using an AI assistant for essay writing task. arXiv preprint arXiv:2506.08872. 2025

[23] Shen JH, Tamkin A. How AI impacts skill formation. arXiv preprint arXiv:2601.20245. 2026

[24] Karny S, Mayer LW, Ayoub J, Song M, Su H, Tian D, et al. Learning with AI assistance: A path to better task performance or dependence? In: Proceedings of the ACM Collective Intelligence Conference; 2024. p. 10-7

[25] Dimitrov DM, Rumrill PDJ. Pretest-posttest designs and measurement of change. Work. 2003;20(2):159-65

[26] Gu F, Wongkamjan W, Boyd-Graber JL, Kummerfeld JK, Peskoff D, May J. Personalized help for optimizing low-skilled users’ strategy. In: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers); 2025. p. 65-74

[27] Swaroop S, Buçinca Z, Gajos KZ, Doshi-Velez F. Accuracy-time tradeoffs in AI-assisted decision making under time pressure. In: Proceedings of the 29th International Conference on Intelligent User Interfaces; 2024. p. 138-54

[28] Cao S, Gomez C, Huang CM. How time pressure in different phases of decision-making influences human-AI collaboration. Proceedings of the ACM on Human-computer Interaction. 2023;7(CSCW2):1-26

[29] Fogliato R, Chappidi S, Lungren M, Fisher P, Wilson D, Fitzke M, et al. Who goes first? Influences of human-AI workflow on decision making in clinical imaging. In: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency; 2022. p. 1362-74

[30] Lee HP, Sarkar A, Tankelevitch L, Drosos I, Rintel S, Banks R, et al. The impact of generative AI on critical thinking: Self-reported reductions in cognitive effort and confidence effects from a survey of knowledge workers. In: Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems; 2025. p. 1-22

[31] Joinson A. Social desirability, anonymity, and Internet-based questionnaires. Behavior research methods, instruments, & computers. 1999;31(3):433-8

[32] Vasconcelos H, Jörke M, Grunde-McLaughlin M, Gerstenberg T, Bernstein MS, Krishna R. Explanations can reduce overreliance on AI systems during decision-making. Proceedings of the ACM on Human-Computer Interaction. 2023;7(CSCW1):1-38

[33] Tejeda Lemus H, Kumar A, Steyvers M. How displaying AI confidence affects reliance and hybrid human-AI performance. In: HHAI 2023: Augmenting Human Intellect. IOS Press; 2023. p. 234-42

[34] Kahr PK, Rooks G, Snijders C, Willemsen MC. The trust recovery journey. The effect of timing of errors on the willingness to follow AI advice. In: Proceedings of the 29th International Conference on Intelligent User Interfaces; 2024. p. 609-22

[35] Panigutti C, Beretta A, Giannotti F, Pedreschi D. Understanding the impact of explanations on advice-taking: a user study for AI-based clinical Decision Support Systems. In: Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems; 2022. p. 1-9

[36] Srivastava DK, Lilly JM, Feigh KM. Improving human situation awareness in AI-advised decision making. In: 2022 IEEE 3rd International Conference on Human-Machine Systems (ICHMS). IEEE; 2022

[37] Cao S, Liu A, Huang CM. Designing for appropriate reliance: The roles of AI uncertainty presentation, initial user decision, and user demographics in AI-assisted decision-making. Proceedings of the ACM on Human-Computer Interaction. 2024;8(CSCW1):1-32

[38] Eisbach S, Langer M, Hertel G. Optimizing human-AI collaboration: Effects of motivation and accuracy information in AI-supported decision-making. Computers in Human Behavior: Artificial Humans. 2023;1(2):100015

[39] Standage D, Wang DH, Heitz RP, Simen P. Toward a unified view of the speed-accuracy trade-off. Frontiers in Neuroscience. 2015;9:139

[40] Austin PC. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research. 2011;46(3):399-424

[41] Wecks JO, Voshaar J, Plate BJ, Zimmermann J. Generative AI usage and exam performance. arXiv preprint arXiv:2404.19699. 2024

[42] Rafner JF, Dellermann D, Hjorth A, Verasztó D, Kampf CE, Mackay W, et al. Deskilling, upskilling, and reskilling: a case for hybrid intelligence. Morals & Machines. 2021;1(2):24-39

[43] Natali C, Marconi L, Dias Duran LD, Cabitza F. AI-induced deskilling in medicine: a mixed-method review and research agenda for healthcare and beyond. Artificial Intelligence Review. 2025;58(11):356

[44] Ma Q, Zhou Y, Kaushik S, Joshi A, Majumdar A, Apthorpe N, et al. Learning password best practices through in-task instruction. arXiv preprint arXiv:2601.06650. 2026

[45] Tran K, Gao G, Lombard A, Yu T, Jiang H, Yeh TY. Pacing for mastery: Optimizing LLM interactions for learning. In: Proceedings of the 57th ACM Technical Symposium on Computer Science Education V. 1; 2026. p. 1068-74