Overreliance in Writing Tasks: Exploring Similarity-Based Measures of AI Influence on Writing and Proposing a Reflective Writing Interface Intervention

Nicholas Vincent; Vitor H. A. Welzel

arxiv: 2605.15322 · v1 · pith:N3HQVOANnew · submitted 2026-05-14 · 💻 cs.HC

Pith reviewed 2026-05-19 15:52 UTC · model grok-4.3

keywords: AI overreliance · generative AI writing · textual similarity · reflective interface · human-AI interaction · writing tasks · user study

The pith

AI assistance is linked to greater reuse of its suggestions in users' final writing, and a reflective interface may increase awareness of that influence.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how generative AI affects open-ended writing by tracking how much of the AI's suggested text ends up in participants' completed work. A study with 47 people completing analysis and synthesis tasks found that those using AI showed more textual overlap with the suggestions provided. Building on this, the authors created an interactive interface that surfaces AI outputs and prompts reflection during the writing process. A small follow-up think-aloud test with four users indicated the interface can help people notice how they incorporate AI material and engage with it more deliberately. Readers would care because the work supplies measurable ways to observe AI influence and a practical design approach for encouraging thoughtful use.

Core claim

In a mixed-methods study, 47 participants completed writing tasks with or without generative AI assistance. Quantification of textual overlap showed that AI assistance was associated with patterns of suggestion reuse in the final writing. Analysis of participant reflections supported this pattern. A follow-up think-aloud study with a reflective writing interface (n=4) suggested that the interface can increase awareness of how AI outputs are incorporated and support more conscious engagement with the assistance.

What carries the argument

Similarity-based measures of textual overlap between AI suggestions and participants' final writing, serving as a proxy for measuring AI influence on the output.
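
As a rough illustration of what a similarity-based overlap measure could look like, the sketch below computes word n-gram Jaccard similarity and a containment score between an AI suggestion and a participant's final text. This page does not state the paper's exact metric, so the function names (`word_ngrams`, `jaccard_overlap`, `containment`), the trigram default, and the tokenization rule are all assumptions made for illustration.

```python
import re


def word_ngrams(text, n=3):
    """Return the set of lowercase word n-grams in `text` (hypothetical helper)."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard_overlap(suggestion, final_text, n=3):
    """Shared n-grams divided by all distinct n-grams in either text."""
    a = word_ngrams(suggestion, n)
    b = word_ngrams(final_text, n)
    if not a and not b:
        return 0.0  # nothing to compare
    return len(a & b) / len(a | b)


def containment(suggestion, final_text, n=3):
    """Fraction of the suggestion's n-grams that reappear in the final text."""
    a = word_ngrams(suggestion, n)
    b = word_ngrams(final_text, n)
    if not a:
        return 0.0
    return len(a & b) / len(a)


if __name__ == "__main__":
    # Made-up strings, not data from the study.
    suggestion = "the quick brown fox jumps"
    final_text = "the quick brown fox sleeps"
    print(jaccard_overlap(suggestion, final_text))  # 0.5
    print(containment(suggestion, final_text))
```

Jaccard is symmetric, while containment asks how much of the suggestion survives in the final text. Neither score on its own separates copying from independent convergence on the same phrasing.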

If this is right

AI assistance during writing tasks correlates with higher rates of reusing specific suggested phrases or passages.
A reflective interface that highlights AI contributions can raise users' awareness of how those contributions appear in their work.
Interface features prompting reflection may lead to more deliberate decisions about incorporating AI-generated material.
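
One way to picture the highlighting idea above is a per-sentence check that flags draft sentences closely matching a sentence in the AI suggestion. This is a minimal sketch and not the authors' interface: `flag_sentences`, the 0.6 threshold, the regex sentence splitter, and the use of `difflib.SequenceMatcher` over word tokens are all assumptions.

```python
import re
from difflib import SequenceMatcher


def split_sentences(text):
    """Naive splitter on ., !, ? followed by whitespace (assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def flag_sentences(draft, suggestion, threshold=0.6):
    """Return (sentence, best_ratio, flagged) for each sentence in the draft.

    best_ratio is the highest SequenceMatcher ratio between the draft
    sentence and any single sentence of the AI suggestion.
    """
    suggestion_tokens = [tokenize(s) for s in split_sentences(suggestion)]
    results = []
    for sentence in split_sentences(draft):
        draft_tokens = tokenize(sentence)
        best = max(
            (SequenceMatcher(None, draft_tokens, s).ratio() for s in suggestion_tokens),
            default=0.0,
        )
        results.append((sentence, best, best >= threshold))
    return results


if __name__ == "__main__":
    # Made-up text, not material from the study.
    suggestion = "Loyalty can be admirable. Waiting for years shows commitment."
    draft = "Loyalty can be admirable. I found the ending quite sad."
    for sentence, ratio, flagged in flag_sentences(draft, suggestion):
        print(f"{'FLAG' if flagged else 'ok':4} {ratio:.2f} {sentence}")
```

A threshold like this is a design choice: set too low, it flags ordinary shared vocabulary, and set too high, it misses light paraphrase.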

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The overlap measurement approach could be extended into automated tools that flag potential AI influence for users in real time.
Similar methods might apply to studying AI effects in other text-based creative tasks such as report drafting or content planning.
The reflective interface design points toward broader interface strategies for supporting user agency when working with generative tools.

Load-bearing premise

That the amount of shared text between AI suggestions and a participant's final writing reliably indicates overreliance or influence rather than other reasons such as adopting good ideas or natural stylistic similarity.

What would settle it

A larger controlled study that finds no measurable difference in textual overlap between groups that did and did not receive AI suggestions during the same writing tasks.
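
To make "no measurable difference" concrete, one option is a two-sided permutation test on per-participant overlap scores from the two conditions. The sketch below is an assumption about how such a comparison could be run, not the paper's analysis; `permutation_test` and the example scores are hypothetical.

```python
import random
from statistics import mean


def permutation_test(assisted, unassisted, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in mean overlap.

    Returns (observed_difference, p_value). Group labels are shuffled
    to estimate how often a difference at least this large arises by chance.
    """
    rng = random.Random(seed)
    observed = mean(assisted) - mean(unassisted)
    pooled = list(assisted) + list(unassisted)
    k = len(assisted)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Made-up overlap scores for illustration only.
    assisted = [0.80, 0.90, 0.85, 0.95, 0.90]
    unassisted = [0.10, 0.20, 0.15, 0.05, 0.10]
    diff, p = permutation_test(assisted, unassisted, n_permutations=2000)
    print(f"difference in means = {diff:.2f}, p = {p:.4f}")
```

A large p-value in a well-powered study would count toward the settling condition above. A small one shows only that the groups differ, not why.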

Figures

Figures reproduced from arXiv: 2605.15322 by Nicholas Vincent, Vitor H. A. Welzel.

**Figure 1.** Experiment interface showing: (a) reading of the reference text, (b) presentation of AI suggestions, and (c) participant ...

**Figure 2.** Participants draft text in the main editor (left) while a side panel (right) provides feedback on similarity to an AI ...

Original abstract

As generative AI (GenAI) systems become increasingly proficient at simulating human-like and well-reasoned text, users may attribute authority to AI outputs, shaping how they engage with writing and reasoning tasks. While prior work has raised concerns about AI overreliance, empirical approaches for observing this phenomenon during open-ended writing remain limited. In this paper, we examine how GenAI assistance influences users' interactions with AI suggestions during writing. We report results from a mixed-methods study in which 47 participants completed analysis and synthesis writing tasks with or without AI assistance. We quantify the textual overlap between AI suggestions and participants' writing and analyze participants' reflections. Our results show that AI assistance is associated with patterns of suggestion reuse. Building on these findings, we design and evaluate an interactive writing interface that may support reflection on the usage of the AI suggestions during writing. Evidence from a small follow-up think-aloud study (n = 4) suggests that the interface can increase users' awareness of how AI outputs are incorporated into their writing and may support more conscious engagement with AI assistance. Together, our findings contribute empirical methods for studying AI adoption in writing contexts and demonstrate how interface design can shape user-AI interaction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper measures AI suggestion reuse via textual similarity in writing tasks and tests a reflective interface to reduce overreliance, but the overlap proxy lacks controls to separate influence from natural convergence.

The main things to know are that this work quantifies how AI assistance correlates with higher textual overlap between suggestions and final output in analysis/synthesis tasks, and it prototypes an interface meant to prompt users to reflect on that reuse. The n=47 mixed-methods study shows an association, and the n=4 think-aloud gives early hints that the interface can raise awareness of how AI text gets incorporated.

That pairing of a measurable reuse signal with a design intervention is the concrete step forward from earlier overreliance discussions. It moves the conversation toward observable behaviors rather than just surveys or self-reports, which is practical for HCI work on writing tools. The authors also keep the scope realistic by focusing on open-ended writing instead of claiming broad effects on reasoning.

The soft spot is the measurement itself. Textual overlap can come from participants arriving at similar ideas or phrasing on their own, especially when the task involves synthesis, so the observed difference between conditions does not automatically equal overreliance or direct influence. The abstract gives no detail on the exact similarity function, statistical adjustments, or checks against independent baselines, and the follow-up sample is too small to test whether the interface actually changes behavior beyond awareness. Those gaps make the central claim suggestive rather than robust.

This paper is for HCI and education-technology researchers who build or evaluate AI writing assistants and want empirical handles on user adoption. Someone running studies on tool design or looking for ways to instrument reuse would find usable ideas here, even if they end up tightening the metrics. It is worth sending to peer review so the authors can add controls and expand the intervention evaluation; the topic is timely and the approach is grounded enough to benefit from referee input.

Referee Report

3 major / 1 minor

Summary. The paper reports results from a mixed-methods study with 47 participants completing analysis and synthesis writing tasks with or without GenAI assistance. It quantifies textual overlap between AI suggestions and final participant writing to show an association with patterns of suggestion reuse, analyzes participant reflections, and proposes a reflective writing interface evaluated via a small think-aloud study (n=4) suggesting increased awareness of AI incorporation.

Significance. If the textual overlap measure can be shown to capture AI-driven influence beyond baseline task-induced convergence or stylistic alignment, the work offers useful empirical methods for studying AI adoption in open-ended writing and illustrates how interface design might promote more conscious engagement. The mixed-methods approach and the concrete interface proposal are strengths that could inform tool development if measurement validity is strengthened.

major comments (3)

[Methods] The paper provides no detail on the exact similarity metrics for quantifying textual overlap, statistical controls used, or exclusion criteria applied in the n=47 study; without these, the reported association between AI assistance and suggestion reuse cannot be fully evaluated for robustness.
[Results] The central claim that observed overlap indicates AI influence or overreliance lacks controls such as independent-generation baselines or human-coded distinctions between literal reuse, paraphrase, and conceptual adoption; the overlap could instead reflect task demands or idea convergence in the analysis/synthesis tasks.
[Follow-up Study] The n=4 think-aloud study is presented as suggestive evidence that the reflective interface increases awareness, but the small sample and lack of quantitative outcome measures limit any generalizable claims about supporting conscious engagement with AI assistance.

minor comments (1)

[Abstract] Clarify in the abstract and methods whether the unassisted condition involved any form of external reference material that could produce comparable overlap by chance.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their detailed and constructive feedback, which has helped us identify areas for clarification and improvement. We address each major comment below, indicating revisions where we can strengthen the manuscript without misrepresenting our work.

Referee: [Methods] The paper provides no detail on the exact similarity metrics for quantifying textual overlap, statistical controls used, or exclusion criteria applied in the n=47 study; without these, the reported association between AI assistance and suggestion reuse cannot be fully evaluated for robustness.

Authors: We acknowledge the need for greater methodological transparency. The original manuscript described the overall approach to measuring textual overlap but omitted precise implementation details. In the revision, we will add a dedicated Methods subsection specifying the similarity metrics (cosine similarity over sentence embeddings combined with n-gram overlap), the statistical models (including regression controls for task type and participant variables), and exclusion criteria (e.g., incomplete tasks or technical failures). This will enable full evaluation of robustness.

revision: yes

Referee: [Results] The central claim that observed overlap indicates AI influence or overreliance lacks controls such as independent-generation baselines or human-coded distinctions between literal reuse, paraphrase, and conceptual adoption; the overlap could instead reflect task demands or idea convergence in the analysis/synthesis tasks.

Authors: We agree this is a substantive limitation in causal interpretation. Our design included a no-AI control condition showing significantly lower overlap, which we will highlight more explicitly as evidence against purely task-driven convergence. However, we did not collect independent-generation baselines or perform human coding of reuse types. We will revise the Results and Discussion to explicitly discuss these gaps as limitations, temper claims about 'overreliance,' and frame the overlap measure as one indicator supported by qualitative reflections rather than conclusive proof. We maintain the between-condition difference provides useful evidence of AI-specific patterns but will avoid overstatement.

revision: partial

Referee: [Follow-up Study] The n=4 think-aloud study is presented as suggestive evidence that the reflective interface increases awareness, but the small sample and lack of quantitative outcome measures limit any generalizable claims about supporting conscious engagement with AI assistance.

Authors: We fully agree that the small sample and qualitative focus limit generalizability. The follow-up was explicitly positioned as an exploratory think-aloud evaluation to gather design insights, not a confirmatory test. In the revision, we will strengthen language to emphasize its preliminary, suggestive nature, explicitly note the absence of quantitative measures, and outline directions for future larger-scale studies with controlled quantitative outcomes. No overgeneralized claims will remain.

revision: yes

Circularity Check

0 steps flagged

No significant circularity in empirical mixed-methods study

This is an empirical mixed-methods study reporting results from 47 participants in assisted vs. unassisted writing tasks plus a small follow-up think-aloud study. The central observations concern measured textual overlap between AI suggestions and final writing, plus participant reflections on an interface intervention. No mathematical derivations, equations, fitted parameters, or self-citation chains appear in the provided text that would reduce any claim to its inputs by construction. The overlap measure is presented as an empirical proxy for suggestion reuse rather than a self-definitional or tautological restatement. The study is self-contained against external benchmarks (participant data and reflections) and does not invoke uniqueness theorems or ansatzes from prior self-work. This is the expected honest non-finding for an observational HCI paper without derivation chains.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the assumption that textual similarity is a meaningful indicator of AI influence and that increased awareness from the interface equates to reduced overreliance; no free parameters or invented entities are introduced.

axioms (1)

domain assumption: Textual overlap metrics reliably capture the degree of AI suggestion reuse and influence during writing.
Invoked to link measured similarity to patterns of overreliance in the main study results.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "We quantify the textual overlap between AI suggestions and participants' writing and analyze participants' reflections. Our results show that AI assistance is associated with patterns of suggestion reuse."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.


[1] Odeyinka Abiola, Adebayo Abayomi-Alli, Oluwasefunmi Arogundade Tale, Sanjay Misra, and Olusola Abayomi-Alli. 2023. Sentiment analysis of COVID-19 tweets from selected hashtags in Nigeria using VADER and Text Blob analyser. Journal of Electrical Systems and Information Technology 10, 1 (2023), 5.

[2] Matheel Al-Rawas, Omar Qader, Nurul Othman, Noor Ismail, Rosnani Mamat, Mohamad Syahrizal Halim, Johari Abdullah, and Tahir Noorani. 2025. Identification of dental related ChatGPT generated abstracts by senior and young academicians versus artificial intelligence detectors and a similarity detector. Scientific Reports 15 (04 2025). doi:10.1038/s41598-025-95387-y

[3] Hussam Alkaissi and Samy Mcfarlane. 2023. Artificial Hallucinations in ChatGPT: Implications in Scientific Writing. Cureus 15 (02 2023). doi:10.7759/cureus.35179

[4] Garrett Allen, Mike Beijen, David Maxwell, and Ujwal Gadiraju. 2023. In a Hurry: How Time Constraints and the Presentation of Web Search Results Affect User Behaviour and Experience. In International Conference on Web Engineering. Springer, 221–235.

[5] Gagan Bansal, Tongshuang Wu, Joyce Zhou, Raymond Fok, Besmira Nushi, Ece Kamar, Marco Tulio Ribeiro, and Daniel Weld. 2021. Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Mac... doi:10.1145/3411764.3445717

[6] Benjamin S Bloom, Max D Engelhart, Edward J Furst, Walker H Hill, David R Krathwohl, et al. 1956. Taxonomy of educational objectives: The classification of educational goals. Handbook 1: Cognitive domain. Longman New York.

[7] Zana Buçinca, Maja Barbara Malaya, and Krzysztof Z. Gajos. 2021. To Trust or to Think: Cognitive Forcing Functions Can Reduce Overreliance on AI in AI-assisted Decision-making. Proceedings of the ACM on Human-Computer Interaction 5, CSCW1 (April 2021), 1–21. doi:10.1145/3449287

[8] Cecilia Ka Yuk Chan and Louisa H.Y. Tsi. 2023. The AI Revolution in Education: Will AI Replace or Assist Teachers in Higher Education? ArXiv abs/2305.01185 (2023). https://api.semanticscholar.org/CorpusID:258436716

[9] Xinyue Chen, Kunlin Ruan, Kexin Phyllis Ju, Nathan Yap, and Xu Wang. 2025. More AI Assistance Reduces Cognitive Engagement: Examining the AI Assistance Dilemma in AI-Supported Note-Taking. Proceedings of the ACM on Human-Computer Interaction 9, 7 (Oct. 2025), 1–29. doi:10.1145/3757632

[10] Valdemar Danry, Pat Pataranutaporn, Matthew Groh, and Ziv Epstein. 2025. Deceptive Explanations by Large Language Models Lead People to Change their Beliefs About Misinformation More Often than Honest Explanations. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI ’25). Association for Computing Machinery, New York, NY, U... doi:10.1145/3706598.3713408

[11] Sander de Jong, Ville Paananen, Benjamin Tag, and Niels van Berkel. 2025. Cognitive Forcing for Better Decision-Making: Reducing Overreliance on AI Systems Through Partial Explanations. Proc. ACM Hum.-Comput. Interact. 9, 2, Article CSCW048 (May 2025), 30 pages. doi:10.1145/3710946

FAccT ’26, June 25–28, 2026, Montreal, QC, Canada Welzel and Vincent

[12] Upol Ehsan and Mark O Riedl. 2020. Human-centered explainable ai: Towards a reflective sociotechnical approach. In International conference on human-computer interaction. Springer, 449–466.

[13] Liye Fu, Benjamin Newman, Maurice Jakesch, and Sarah Kreps. 2023. Comparing Sentence-Level Suggestions to Message-Level Suggestions in AI-Mediated Communication. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (Hamburg, Germany) (CHI ’23). Association for Computing Machinery, New York, NY, USA, Article 103, 13 pages. doi:10.1145/3544548.3581351

[14] Darren Gergle and Desney S Tan. 2014. Experimental research in HCI. In Ways of Knowing in HCI. Springer, 191–227.

[15] Ella Glikson and Omri Asscher. 2022. AI-mediated apology in a multilingual work context: Implications for perceived authenticity and willingness to forgive. Computers in Human Behavior 140 (11 2022), 107592. doi:10.1016/j.chb.2022.107592

[16] S Goldwasser, S Micali, and C Rackoff. 1985. The knowledge complexity of interactive proof-systems. In Proceedings of the Seventeenth Annual ACM Symposium on Theory of Computing (Providence, Rhode Island, USA) (STOC ’85). Association for Computing Machinery, New York, NY, USA, 291–304. doi:10.1145/22145.22178

[17] Ziyang Guo, Yifan Wu, Jason D. Hartline, and Jessica Hullman. 2024. A Decision Theoretic Framework for Measuring AI Reliance. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (Rio de Janeiro, Brazil) (FAccT ’24). Association for Computing Machinery, New York, NY, USA, 221–236. doi:10.1145/3630106.3658901

[18] Hadassah Harland, Richard Dazeley, Hashini Senaratne, Peter Vamplew, Francisco Cruz, and Bahareh Nakisa. 2025. AI apology: a critical review of apology in AI systems. Artificial Intelligence Review 58, 12 (2025), 369.

[19] Sandra G Hart and Lowell E Staveland. 1988. Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In Advances in psychology. Vol. 52. Elsevier, 139–183.

[20] O. Henry. 1906. After Twenty Years. In The Four Million. McClure, Phillips & Co., New York. Originally published in 1906; short story.

[21] Emily Sein Yue Elim Hui. 2025. Incorporating Bloom’s taxonomy into promoting cognitive thinking mechanism in artificial intelligence-supported learning environments. Interactive Learning Environments 33, 2 (2025), 1087–1100. doi:10.1080/10494820.2024.2364237

[22] Paul Jaccard. 1901. Etude comparative de la distribution florale dans une portion des Alpes et des Jura. Bulletin de la Societe Vaudoise des Sciences Naturelles 37 (1901), 547–579.

[23] Daniel Kahneman. 2011. Thinking, fast and slow. Farrar, Straus and Giroux, New York. https://www.amazon.de/Thinking-Fast-Slow-Daniel-Kahneman/dp/0374275637/ref=wl_it_dp_o_pdT1_nS_nC?ie=UTF8&colid=151193SNGKJT9&coliid=I3OCESLZCVDFL7

[24] Ece Kamar. 2016. Directions in hybrid intelligence: complementing AI systems with human intelligence. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (New York, New York, USA) (IJCAI’16). AAAI Press, 4070–4073.

[25] Ece Kamar, Severin Hacker, and Eric Horvitz. 2012. Combining human and machine intelligence in large-scale crowdsourcing. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1 (Valencia, Spain) (AAMAS ’12). International Foundation for Autonomous Agents and Multiagent Systems, Richland, SC, 467–474.

[26] Sunnie S. Y. Kim, Q. Vera Liao, Mihaela Vorvoreanu, Stephanie Ballard, and Jennifer Wortman Vaughan. 2024. "I’m Not Sure, But...": Examining the Impact of Large Language Models’ Uncertainty Expression on User Reliance and Trust. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (Rio de Janeiro, Brazil) (FAccT ’24). Asso... doi:10.1145/3630106.3658941

[27] Sunnie S. Y. Kim, Jennifer Wortman Vaughan, Q. Vera Liao, Tania Lombrozo, and Olga Russakovsky. 2025. Fostering Appropriate Reliance on Large Language Models: The Role of Explanations, Sources, and Inconsistencies. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI ’25). ACM, 1–19. doi:10.1145/3706598.3714020

[28] Nataliya Kosmyna, Eugene Hauptmann, Ye Tong Yuan, Jessica Situ, Xian-Hao Liao, Ashly Vivian Beresnitzky, Iris Braunstein, and Pattie Maes. 2025. Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task. arXiv:2506.08872 [cs.AI] https://arxiv.org/abs/2506.08872

[29] Hao-Ping (Hank) Lee, Advait Sarkar, Lev Tankelevitch, Ian Drosos, Sean Rintel, Richard Banks, and Nicholas Wilson. 2025. The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects From a Survey of Knowledge Workers. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CH... doi:10.1145/3706598.3713778

[30] Mina Lee, Percy Liang, and Qian Yang. 2022. CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities. In CHI Conference on Human Factors in Computing Systems (CHI ’22). ACM, 1–19. doi:10.1145/3491102.3502030

[31] Steven Loria and contributors. 2026. TextBlob Documentation (Release 0.19.0). Read the Docs. Accessed 2026-01-06.

[32] Hannah Mieczkowski, Jeffrey T. Hancock, Mor Naaman, Malte Jung, and Jess Hohenstein. 2021. AI-Mediated Communication: Language Use and Interpersonal Effects in a Referential Communication Task. Proc. ACM Hum.-Comput. Interact. 5, CSCW1, Article 17 (April 2021), 14 pages. doi:10.1145/3449091

[33] Mohsin Murtaza, Chi-Tsun Cheng, Bader Albahlal, Muhana Muslam, and Mansoor Raza. 2025. The impact of LLM chatbots on learning outcomes in advanced driver assistance systems education. Scientific Reports 15 (03 2025). doi:10.1038/s41598-025-91330-3

[34] David Navon and Daniel Gopher. 1979. On the economy of the human-processing system. Psychological review 86, 3 (1979), 214.

[35] Abdul Wahab Qurashi, Violeta Holmes, and Anju P Johnson. 2020. Document processing: Methods for semantic text similarity analysis. In 2020 international conference on INnovations in Intelligent SysTems and Applications (INISTA). IEEE, 1–6.

[36] Nils Reimers and Iryna Gurevych. 2019. Sentence-bert: Sentence embeddings using siamese bert-networks. In Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP). 3982–3992.

[37] Jenna Russell, Marzena Karpinska, and Mohit Iyyer. 2025. People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 5342–5373.

[38] Anjali Singh, Karan Taneja, Zhitong Guan, and Avijit Ghosh. 2025. Protecting Human Cognition in the Age of AI. arXiv:2502.12447 [cs.CY] https://arxiv.org/abs/2502.12447

[39] Kristen Sussman and Daniel Carter. 2025. Detecting Effects of AI-Mediated Communication on Language Complexity and Sentiment. In Companion Proceedings of the ACM on Web Conference 2025 (Sydney NSW, Australia) (WWW ’25). Association for Computing Machinery, New York, NY, USA, 2689–2693. doi:10.1145/3701716.3717543

[40] Ningzhi Tang, Meng Chen, Zheng Ning, Aakash Bansal, Yu Huang, Collin McMillan, and Toby Jia-Jun Li. 2023. An Empirical Study of Developer Behaviors for Validating and Repairing AI-Generated Code. (3 2023). doi:10.1184/R1/22223533.v1

[41] Sacip Toker and Mahir Akgun. 2024. The Role of Task Complexity in Reducing AI Plagiarism: A Study of Generative AI Tools. arXiv:2412.13412 [cs.HC] https://arxiv.org/abs/2412.13412

[42] Michael Tomasello, Malinda Carpenter, Josep Call, Tanya Behne, and Henrike Moll. 2005. Understanding and Sharing Intentions: The Origins of Cultural Cognition. Behavioral and Brain Sciences 28 (11 2005), 675–735. doi:10.1017/S0140525X05000129

[43] K Vani and Deepa Gupta. 2015. Investigating the impact of combined similarity metrics and POS tagging in extrinsic text plagiarism detection system. In 2015 international conference on advances in computing, communications and informatics (ICACCI). IEEE, 1578–1584.

[44] Helena Vasconcelos, Matthew Jörke, Madeleine Grunde-McLaughlin, Tobias Gerstenberg, Michael S Bernstein, and Ranjay Krishna. 2023. Explanations can reduce overreliance on ai systems during decision-making. Proceedings of the ACM on Human-Computer Interaction 7, CSCW1 (2023), 1–38.

[45] Veniamin Veselovsky, Manoel Horta Ribeiro, and Robert West. 2023. Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks. arXiv:2306.07899 [cs.CL] https://arxiv.org/abs/2306.07899

[46] Zachary Wojtowicz and Simon DeDeo. 2025. Undermining mental proof: how AI can make cooperation harder by making thinking easier. In Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence and Thirty-Seventh Conference on Innovative Applications of Artificial Intelligence and Fifteenth Symposium on Educational Advances in Artificial Intel... doi:10.1609/aaai.v39i2.32151

[47] Ann Yuan, Andy Coenen, Emily Reif, and Daphne Ippolito. 2022. Wordcraft: Story Writing With Large Language Models. In Proceedings of the 27th International Conference on Intelligent User Interfaces (Helsinki, Finland) (IUI ’22). Association for Computing Machinery, New York, NY, USA, 841–852. doi:10.1145/3490099.3511105

[48] Yunfeng Zhang, Q. Vera Liao, and Rachel K. E. Bellamy. 2020. Effect of confidence and explanation on accuracy and trust calibration in AI-assisted decision making. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (FAT* ’20). ACM, 295–305. doi:10.1145/3351095.3372852

[49] Qingjuan Zhao, Jianwei Niu, and Xuefeng Liu. 2022. ALS-MRS: Incorporating aspect-level sentiment for abstractive multi-review summarization. Knowledge-Based Systems 258 (2022), 109942. doi:10.1016/j.knosys.2022.109942

A Task Prompts and AI Suggestions

Task A - Analysis Prompt. Evaluate Bob’s decision to wait at the old restaurant site for twenty years. Judge...