
arxiv: 2605.15322 · v1 · pith:N3HQVOANnew · submitted 2026-05-14 · 💻 cs.HC

Overreliance in Writing Tasks: Exploring Similarity-Based Measures of AI Influence on Writing and Proposing a Reflective Writing Interface Intervention

Pith reviewed 2026-05-19 15:52 UTC · model grok-4.3

classification 💻 cs.HC
keywords AI overreliance · generative AI writing · textual similarity · reflective interface · human-AI interaction · writing tasks · user study

The pith

AI assistance is linked to greater reuse of its suggestions in users' final writing, and a reflective interface may increase awareness of that influence.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how generative AI affects open-ended writing by tracking how much of the AI's suggested text ends up in participants' completed work. A study with 47 people completing analysis and synthesis tasks found that those using AI showed more textual overlap with the suggestions provided. Building on this, the authors created an interactive interface that surfaces AI outputs and prompts reflection during the writing process. A small follow-up think-aloud test with four users indicated the interface can help people notice how they incorporate AI material and engage with it more deliberately. Readers would care because the work supplies measurable ways to observe AI influence and a practical design approach for encouraging thoughtful use.

Core claim

In a mixed-methods study, 47 participants completed writing tasks with or without generative AI assistance. Quantification of textual overlap showed that AI assistance was associated with patterns of suggestion reuse in the final writing. Analysis of participant reflections supported this pattern. A follow-up think-aloud study with a reflective writing interface (n=4) suggested that the interface can increase awareness of how AI outputs are incorporated and support more conscious engagement with the assistance.

What carries the argument

Similarity-based measures of textual overlap between AI suggestions and participants' final writing, serving as a proxy for measuring AI influence on the output.
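
One way to make this proxy concrete is word n-gram overlap. The sketch below is a minimal illustration, not the authors' implementation: the trigram window, the tokenizer, the function names, and the two example strings are all assumptions introduced here. It reports a Jaccard index and a one-directional containment score (how much of the suggestion resurfaces in the final text).

```python
import re


def word_ngrams(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Lowercase, tokenize on word characters, and return the set of word n-grams."""
    tokens = re.findall(r"\w+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard index |A & B| / |A | B|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(suggestion: set, final: set) -> float:
    """Share of the suggestion's n-grams that reappear in the final text."""
    return len(suggestion & final) / len(suggestion) if suggestion else 0.0


# Hypothetical strings, for illustration only (not study data).
suggestion = "Bob waited because loyalty mattered more to him than safety"
final_text = "I think Bob waited because loyalty mattered more to him than money"

s, f = word_ngrams(suggestion), word_ngrams(final_text)
print(f"jaccard={jaccard(s, f):.2f} containment={containment(s, f):.2f}")
```

Containment is asymmetric by design: a long essay that absorbs a short suggestion whole scores high on containment even when its Jaccard index is modest.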

If this is right

  • AI assistance during writing tasks correlates with higher rates of reusing specific suggested phrases or passages.
  • A reflective interface that highlights AI contributions can raise users' awareness of how those contributions appear in their work.
  • Interface features prompting reflection may lead to more deliberate decisions about incorporating AI-generated material.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The overlap measurement approach could be extended into automated tools that flag potential AI influence for users in real time.
  • Similar methods might apply to studying AI effects in other text-based creative tasks such as report drafting or content planning.
  • The reflective interface design points toward broader interface strategies for supporting user agency when working with generative tools.

Load-bearing premise

That the amount of shared text between AI suggestions and a participant's final writing reliably indicates overreliance or influence rather than other reasons such as adopting good ideas or natural stylistic similarity.

What would settle it

A larger controlled study that finds no measurable difference in textual overlap between groups that did and did not receive AI suggestions during the same writing tasks.
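
A comparison of that kind could be run as a permutation test on per-participant overlap scores. This is a sketch under stated assumptions: the paper's actual statistical procedure is not described here, the function name is hypothetical, and the scores are placeholders invented for illustration, not data from the study.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in mean overlap scores."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign condition labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_permutations + 1)


# Placeholder scores, invented for illustration only (not data from the paper).
ai_assisted = [0.42, 0.35, 0.51, 0.38, 0.47, 0.44]
unassisted = [0.12, 0.20, 0.09, 0.15, 0.18, 0.11]

observed, p_value = permutation_test(ai_assisted, unassisted)
print(f"difference in means = {observed:.3f}, p = {p_value:.4f}")
```

A large p-value in a well-powered replication would be the kind of null result described above; a small one would count against it.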

Figures

Figures reproduced from arXiv: 2605.15322 by Nicholas Vincent, Vitor H. A. Welzel.

Figure 1: Experiment interface showing: (a) reading of the reference text, (b) presentation of AI suggestions, and (c) participant ...
Figure 2: Participants draft text in the main editor (left) while a side panel (right) provides feedback on similarity to an AI ...
Original abstract

As generative AI (GenAI) systems become increasingly proficient at simulating human-like and well-reasoned text, users may attribute authority to AI outputs, shaping how they engage with writing and reasoning tasks. While prior work has raised concerns about AI overreliance, empirical approaches for observing this phenomenon during open-ended writing remain limited. In this paper, we examine how GenAI assistance influences users' interactions with AI suggestions during writing. We report results from a mixed-methods study in which 47 participants completed analysis and synthesis writing tasks with or without AI assistance. We quantify the textual overlap between AI suggestions and participants' writing and analyze participants' reflections. Our results show that AI assistance is associated with patterns of suggestion reuse. Building on these findings, we design and evaluate an interactive writing interface that may support reflection on the usage of the AI suggestions during writing. Evidence from a small follow-up think-aloud study (n = 4) suggests that the interface can increase users' awareness of how AI outputs are incorporated into their writing and may support more conscious engagement with AI assistance. Together, our findings contribute empirical methods for studying AI adoption in writing contexts and demonstrate how interface design can shape user-AI interaction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper reports results from a mixed-methods study with 47 participants completing analysis and synthesis writing tasks with or without GenAI assistance. It quantifies textual overlap between AI suggestions and final participant writing to show an association with patterns of suggestion reuse, analyzes participant reflections, and proposes a reflective writing interface evaluated via a small think-aloud study (n=4) suggesting increased awareness of AI incorporation.

Significance. If the textual overlap measure can be shown to capture AI-driven influence beyond baseline task-induced convergence or stylistic alignment, the work offers useful empirical methods for studying AI adoption in open-ended writing and illustrates how interface design might promote more conscious engagement. The mixed-methods approach and the concrete interface proposal are strengths that could inform tool development if measurement validity is strengthened.

major comments (3)
  1. [Methods] Methods: The paper provides no detail on the exact similarity metrics for quantifying textual overlap, statistical controls used, or exclusion criteria applied in the n=47 study; without these, the reported association between AI assistance and suggestion reuse cannot be fully evaluated for robustness.
  2. [Results] Results: The central claim that observed overlap indicates AI influence or overreliance lacks controls such as independent-generation baselines or human-coded distinctions between literal reuse, paraphrase, and conceptual adoption; the overlap could instead reflect task demands or idea convergence in the analysis/synthesis tasks.
  3. [Follow-up Study] Follow-up evaluation: The n=4 think-aloud study is presented as suggestive evidence that the reflective interface increases awareness, but the small sample and lack of quantitative outcome measures limit any generalizable claims about supporting conscious engagement with AI assistance.
minor comments (1)
  1. [Abstract] Clarify in the abstract and methods whether the unassisted condition involved any form of external reference material that could produce comparable overlap by chance.
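
The distinction drawn in major comment 2 can be approximated automatically, at least for the literal end of the spectrum. The sketch below uses Python's difflib to find long verbatim word runs shared between a suggestion and a final text. It is an assumption-laden illustration, not a method from the paper: the four-word threshold, the function names, and the example strings are hypothetical, and it cannot separate paraphrase from conceptual adoption, which would still need human coding.

```python
import re
from difflib import SequenceMatcher


def verbatim_runs(suggestion: str, final_text: str, min_words: int = 4):
    """Return word runs of at least `min_words` copied verbatim from the suggestion."""
    a = re.findall(r"\w+", suggestion.lower())
    b = re.findall(r"\w+", final_text.lower())
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        " ".join(a[m.a:m.a + m.size])
        for m in matcher.get_matching_blocks()
        if m.size >= min_words
    ]


def literal_reuse_ratio(suggestion: str, final_text: str, min_words: int = 4) -> float:
    """Fraction of suggestion words that fall inside long verbatim runs."""
    total = len(re.findall(r"\w+", suggestion))
    runs = verbatim_runs(suggestion, final_text, min_words)
    copied = sum(len(run.split()) for run in runs)
    return copied / total if total else 0.0


# Hypothetical strings, for illustration only (not study data).
suggestion = "Bob kept the appointment out of loyalty to an old friend"
copied = "In my view Bob kept the appointment out of loyalty to an old friend"
paraphrase = "Bob stayed faithful to his friend for two decades"

print(literal_reuse_ratio(suggestion, copied))
print(literal_reuse_ratio(suggestion, paraphrase))
```

A text that scores near zero here but high on an embedding-based measure is a candidate for paraphrase or idea-level adoption rather than copying.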

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their detailed and constructive feedback, which has helped us identify areas for clarification and improvement. We address each major comment below, indicating revisions where we can strengthen the manuscript without misrepresenting our work.

  1. Referee: [Methods] Methods: The paper provides no detail on the exact similarity metrics for quantifying textual overlap, statistical controls used, or exclusion criteria applied in the n=47 study; without these, the reported association between AI assistance and suggestion reuse cannot be fully evaluated for robustness.

    Authors: We acknowledge the need for greater methodological transparency. The original manuscript described the overall approach to measuring textual overlap but omitted precise implementation details. In the revision, we will add a dedicated Methods subsection specifying the similarity metrics (cosine similarity over sentence embeddings combined with n-gram overlap), the statistical models (including regression controls for task type and participant variables), and exclusion criteria (e.g., incomplete tasks or technical failures). This will enable full evaluation of robustness.
    Revision: yes

  2. Referee: [Results] Results: The central claim that observed overlap indicates AI influence or overreliance lacks controls such as independent-generation baselines or human-coded distinctions between literal reuse, paraphrase, and conceptual adoption; the overlap could instead reflect task demands or idea convergence in the analysis/synthesis tasks.

    Authors: We agree this is a substantive limitation in causal interpretation. Our design included a no-AI control condition showing significantly lower overlap, which we will highlight more explicitly as evidence against purely task-driven convergence. However, we did not collect independent-generation baselines or perform human coding of reuse types. We will revise the Results and Discussion to explicitly discuss these gaps as limitations, temper claims about 'overreliance,' and frame the overlap measure as one indicator supported by qualitative reflections rather than conclusive proof. We maintain the between-condition difference provides useful evidence of AI-specific patterns but will avoid overstatement.
    Revision: partial

  3. Referee: [Follow-up Study] Follow-up evaluation: The n=4 think-aloud study is presented as suggestive evidence that the reflective interface increases awareness, but the small sample and lack of quantitative outcome measures limit any generalizable claims about supporting conscious engagement with AI assistance.

    Authors: We fully agree that the small sample and qualitative focus limit generalizability. The follow-up was explicitly positioned as an exploratory think-aloud evaluation to gather design insights, not a confirmatory test. In the revision, we will strengthen language to emphasize its preliminary, suggestive nature, explicitly note the absence of quantitative measures, and outline directions for future larger-scale studies with controlled quantitative outcomes. No overgeneralized claims will remain.
    Revision: yes
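
The first response names cosine similarity over sentence embeddings as one of the metrics. The cosine step itself can be sketched as below, with plain term-count vectors standing in for learned embeddings. That substitution is an assumption made to keep the example dependency-free; the function names and example strings are hypothetical, and the paper's embedding model and the way the two metrics are combined are not specified here.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Bag-of-words term counts; a crude stand-in for a sentence embedding."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(u: Counter, v: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * v[term] for term, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical phrases, for illustration only (not study data).
identical = cosine_similarity(term_counts("old friend"), term_counts("old friend"))
partial = cosine_similarity(term_counts("old friend"), term_counts("old promise"))
disjoint = cosine_similarity(term_counts("old friend"), term_counts("new city"))

print(f"identical={identical:.2f} partial={partial:.2f} disjoint={disjoint:.2f}")
```

Unlike term counts, sentence embeddings can score paraphrases as similar even when no words are shared, so the two representations should not be expected to agree on reworded text.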

Circularity Check

0 steps flagged

No significant circularity in empirical mixed-methods study


This is an empirical mixed-methods study reporting results from 47 participants in assisted vs. unassisted writing tasks plus a small follow-up think-aloud study. The central observations concern measured textual overlap between AI suggestions and final writing, plus participant reflections on an interface intervention. No mathematical derivations, equations, fitted parameters, or self-citation chains appear in the provided text that would reduce any claim to its inputs by construction. The overlap measure is presented as an empirical proxy for suggestion reuse rather than a self-definitional or tautological restatement. The study is self-contained against external benchmarks (participant data and reflections) and does not invoke uniqueness theorems or ansatzes from prior self-work. This is the expected honest non-finding for an observational HCI paper without derivation chains.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the assumption that textual similarity is a meaningful indicator of AI influence and that increased awareness from the interface equates to reduced overreliance; no free parameters or invented entities are introduced.

axioms (1)
  • Domain assumption: Textual overlap metrics reliably capture the degree of AI suggestion reuse and influence during writing.
    Invoked to link measured similarity to patterns of overreliance in the main study results.




Reference graph

Works this paper leans on

49 extracted references · 49 canonical work pages · 2 internal anchors

  1. [1]

    Odeyinka Abiola, Adebayo Abayomi-Alli, Oluwasefunmi Arogundade Tale, Sanjay Misra, and Olusola Abayomi-Alli. 2023. Sentiment analysis of COVID-19 tweets from selected hashtags in Nigeria using VADER and Text Blob analyser. Journal of Electrical Systems and Information Technology 10, 1 (2023), 5

  2. [2]

    Matheel Al-Rawas, Omar Qader, Nurul Othman, Noor Ismail, Rosnani Mamat, Mohamad Syahrizal Halim, Johari Abdullah, and Tahir Noorani. 2025. Identification of dental related ChatGPT generated abstracts by senior and young academicians versus artificial intelligence detectors and a similarity detector. Scientific Reports 15 (04 2025). doi:10.1038/s41598-025-95387-y

  3. [3]

    Hussam Alkaissi and Samy Mcfarlane. 2023. Artificial Hallucinations in ChatGPT: Implications in Scientific Writing. Cureus 15 (02 2023). doi:10.7759/cureus.35179

  4. [4]

    Garrett Allen, Mike Beijen, David Maxwell, and Ujwal Gadiraju. 2023. In a Hurry: How Time Constraints and the Presentation of Web Search Results Affect User Behaviour and Experience. In International Conference on Web Engineering. Springer, 221–235

  5. [5]

    Gagan Bansal, Tongshuang Wu, Joyce Zhou, Raymond Fok, Besmira Nushi, Ece Kamar, Marco Tulio Ribeiro, and Daniel Weld. 2021. Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Mac...

  6. [6]

    Benjamin S Bloom, Max D Engelhart, Edward J Furst, Walker H Hill, David R Krathwohl, et al. 1956. Taxonomy of educational objectives: The classification of educational goals. Handbook 1: Cognitive domain. Longman, New York

  7. [7]

    Zana Buçinca, Maja Barbara Malaya, and Krzysztof Z. Gajos. 2021. To Trust or to Think: Cognitive Forcing Functions Can Reduce Overreliance on AI in AI-assisted Decision-making. Proceedings of the ACM on Human-Computer Interaction 5, CSCW1 (April 2021), 1–21. doi:10.1145/3449287

  8. [8]

    Cecilia Ka Yuk Chan and Louisa H.Y. Tsi. 2023. The AI Revolution in Education: Will AI Replace or Assist Teachers in Higher Education? ArXiv abs/2305.01185 (2023). https://api.semanticscholar.org/CorpusID:258436716

  9. [9]

    Xinyue Chen, Kunlin Ruan, Kexin Phyllis Ju, Nathan Yap, and Xu Wang. 2025. More AI Assistance Reduces Cognitive Engagement: Examining the AI Assistance Dilemma in AI-Supported Note-Taking. Proceedings of the ACM on Human-Computer Interaction 9, 7 (Oct. 2025), 1–29. doi:10.1145/3757632

  10. [10]

    Valdemar Danry, Pat Pataranutaporn, Matthew Groh, and Ziv Epstein. 2025. Deceptive Explanations by Large Language Models Lead People to Change their Beliefs About Misinformation More Often than Honest Explanations. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI ’25). Association for Computing Machinery, New York, NY, U...

  11. [11]

    Sander de Jong, Ville Paananen, Benjamin Tag, and Niels van Berkel. 2025. Cognitive Forcing for Better Decision-Making: Reducing Overreliance on AI Systems Through Partial Explanations. Proc. ACM Hum.-Comput. Interact. 9, 2, Article CSCW048 (May 2025), 30 pages. doi:10.1145/3710946
    FAccT ’26, June 25–28, 2026, Montreal, QC, Canada. Welzel and Vincent

  12. [12]

    Upol Ehsan and Mark O Riedl. 2020. Human-centered explainable ai: Towards a reflective sociotechnical approach. In International conference on human-computer interaction. Springer, 449–466

  13. [13]

    Liye Fu, Benjamin Newman, Maurice Jakesch, and Sarah Kreps. 2023. Comparing Sentence-Level Suggestions to Message-Level Suggestions in AI-Mediated Communication. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (Hamburg, Germany) (CHI ’23). Association for Computing Machinery, New York, NY, USA, Article 103, 13 pages. doi:10.11...

  14. [14]

    Darren Gergle and Desney S Tan. 2014. Experimental research in HCI. In Ways of Knowing in HCI. Springer, 191–227

  15. [15]

    Ella Glikson and Omri Asscher. 2022. AI-mediated apology in a multilingual work context: Implications for perceived authenticity and willingness to forgive. Computers in Human Behavior 140 (11 2022), 107592. doi:10.1016/j.chb.2022.107592

  16. [16]

    S Goldwasser, S Micali, and C Rackoff. 1985. The knowledge complexity of interactive proof-systems. In Proceedings of the Seventeenth Annual ACM Symposium on Theory of Computing (Providence, Rhode Island, USA) (STOC ’85). Association for Computing Machinery, New York, NY, USA, 291–304. doi:10.1145/22145.22178

  17. [17]

    Ziyang Guo, Yifan Wu, Jason D. Hartline, and Jessica Hullman. 2024. A Decision Theoretic Framework for Measuring AI Reliance. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (Rio de Janeiro, Brazil) (FAccT ’24). Association for Computing Machinery, New York, NY, USA, 221–236. doi:10.1145/3630106.3658901

  18. [18]

    Hadassah Harland, Richard Dazeley, Hashini Senaratne, Peter Vamplew, Francisco Cruz, and Bahareh Nakisa. 2025. AI apology: a critical review of apology in AI systems. Artificial Intelligence Review 58, 12 (2025), 369

  19. [19]

    Sandra G Hart and Lowell E Staveland. 1988. Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In Advances in psychology. Vol. 52. Elsevier, 139–183

  20. [20]

    O. Henry. 1906. After Twenty Years. In The Four Million. McClure, Phillips & Co., New York. Originally published in 1906; short story

  21. [21]

    Emily Sein Yue Elim Hui. 2025. Incorporating Bloom’s taxonomy into promoting cognitive thinking mechanism in artificial intelligence-supported learning environments. Interactive Learning Environments 33, 2 (2025), 1087–1100. doi:10.1080/10494820.2024.2364237

  22. [22]

    Paul Jaccard. 1901. Etude comparative de la distribution florale dans une portion des Alpes et des Jura. Bulletin de la Societe Vaudoise des Sciences Naturelles 37 (1901), 547–579

  23. [23]

    Daniel Kahneman. 2011. Thinking, fast and slow. Farrar, Straus and Giroux, New York. https://www.amazon.de/Thinking-Fast-Slow-Daniel-Kahneman/dp/0374275637/ref=wl_it_dp_o_pdT1_nS_nC?ie=UTF8&colid=151193SNGKJT9&coliid=I3OCESLZCVDFL7

  24. [24]

    Ece Kamar. 2016. Directions in hybrid intelligence: complementing AI systems with human intelligence. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (New York, New York, USA) (IJCAI’16). AAAI Press, 4070–4073

  25. [25]

    Ece Kamar, Severin Hacker, and Eric Horvitz. 2012. Combining human and machine intelligence in large-scale crowdsourcing. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1 (Valencia, Spain) (AAMAS ’12). International Foundation for Autonomous Agents and Multiagent Systems, Richland, SC, 467–474

  26. [26]

    Sunnie S. Y. Kim, Q. Vera Liao, Mihaela Vorvoreanu, Stephanie Ballard, and Jennifer Wortman Vaughan. 2024. "I’m Not Sure, But...": Examining the Impact of Large Language Models’ Uncertainty Expression on User Reliance and Trust. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (Rio de Janeiro, Brazil) (FAccT ’24). Asso...

  27. [27]

    Sunnie S. Y. Kim, Jennifer Wortman Vaughan, Q. Vera Liao, Tania Lombrozo, and Olga Russakovsky. 2025. Fostering Appropriate Reliance on Large Language Models: The Role of Explanations, Sources, and Inconsistencies. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI ’25). ACM, 1–19. doi:10.1145/3706598.3714020

  28. [28]

    Nataliya Kosmyna, Eugene Hauptmann, Ye Tong Yuan, Jessica Situ, Xian-Hao Liao, Ashly Vivian Beresnitzky, Iris Braunstein, and Pattie Maes. 2025. Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task. arXiv:2506.08872 [cs.AI] https://arxiv.org/abs/2506.08872

  29. [29]

    Hao-Ping (Hank) Lee, Advait Sarkar, Lev Tankelevitch, Ian Drosos, Sean Rintel, Richard Banks, and Nicholas Wilson. 2025. The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects From a Survey of Knowledge Workers. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CH...

  30. [30]

    Mina Lee, Percy Liang, and Qian Yang. 2022. CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities. In CHI Conference on Human Factors in Computing Systems (CHI ’22). ACM, 1–19. doi:10.1145/3491102.3502030

  31. [31]

    Steven Loria and contributors. 2026. TextBlob Documentation (Release 0.19.0). Read the Docs. Accessed 2026-01-06

  32. [32]

    Hannah Mieczkowski, Jeffrey T. Hancock, Mor Naaman, Malte Jung, and Jess Hohenstein. 2021. AI-Mediated Communication: Language Use and Interpersonal Effects in a Referential Communication Task. Proc. ACM Hum.-Comput. Interact. 5, CSCW1, Article 17 (April 2021), 14 pages. doi:10.1145/3449091

  33. [33]

    Mohsin Murtaza, Chi-Tsun Cheng, Bader Albahlal, Muhana Muslam, and Mansoor Raza. 2025. The impact of LLM chatbots on learning outcomes in advanced driver assistance systems education. Scientific Reports 15 (03 2025). doi:10.1038/s41598-025-91330-3

  34. [34]

    David Navon and Daniel Gopher. 1979. On the economy of the human-processing system. Psychological review 86, 3 (1979), 214.

  35. [35]

    Abdul Wahab Qurashi, Violeta Holmes, and Anju P Johnson. 2020. Document processing: Methods for semantic text similarity analysis. In 2020 international conference on INnovations in Intelligent SysTems and Applications (INISTA). IEEE, 1–6

  36. [36]

    Nils Reimers and Iryna Gurevych. 2019. Sentence-bert: Sentence embeddings using siamese bert-networks. In Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP). 3982–3992

  37. [37]

    Jenna Russell, Marzena Karpinska, and Mohit Iyyer. 2025. People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 5342–5373

  38. [38]

    Anjali Singh, Karan Taneja, Zhitong Guan, and Avijit Ghosh. 2025. Protecting Human Cognition in the Age of AI. arXiv:2502.12447 [cs.CY] https://arxiv.org/abs/2502.12447

  39. [39]

    Kristen Sussman and Daniel Carter. 2025. Detecting Effects of AI-Mediated Communication on Language Complexity and Sentiment. In Companion Proceedings of the ACM on Web Conference 2025 (Sydney NSW, Australia) (WWW ’25). Association for Computing Machinery, New York, NY, USA, 2689–2693. doi:10.1145/3701716.3717543

  40. [40]

    Ningzhi Tang, Meng Chen, Zheng Ning, Aakash Bansal, Yu Huang, Collin McMillan, and Toby Jia-Jun Li. 2023. An Empirical Study of Developer Behaviors for Validating and Repairing AI-Generated Code. (3 2023). doi:10.1184/R1/22223533.v1

  41. [41]

    Sacip Toker and Mahir Akgun. 2024. The Role of Task Complexity in Reducing AI Plagiarism: A Study of Generative AI Tools. arXiv:2412.13412 [cs.HC] https://arxiv.org/abs/2412.13412

  42. [42]

    Michael Tomasello, Malinda Carpenter, Josep Call, Tanya Behne, and Henrike Moll. 2005. Understanding and Sharing Intentions: The Origins of Cultural Cognition. Behavioral and Brain Sciences 28 (11 2005), 675–735. doi:10.1017/S0140525X05000129

  43. [43]

    K Vani and Deepa Gupta. 2015. Investigating the impact of combined similarity metrics and POS tagging in extrinsic text plagiarism detection system. In 2015 international conference on advances in computing, communications and informatics (ICACCI). IEEE, 1578–1584

  44. [44]

    Helena Vasconcelos, Matthew Jörke, Madeleine Grunde-McLaughlin, Tobias Gerstenberg, Michael S Bernstein, and Ranjay Krishna. 2023. Explanations can reduce overreliance on ai systems during decision-making. Proceedings of the ACM on Human-Computer Interaction 7, CSCW1 (2023), 1–38

  45. [45]

    Veniamin Veselovsky, Manoel Horta Ribeiro, and Robert West. 2023. Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks. arXiv:2306.07899 [cs.CL] https://arxiv.org/abs/2306.07899

  46. [46]

    Zachary Wojtowicz and Simon DeDeo. 2025. Undermining mental proof: how AI can make cooperation harder by making thinking easier. In Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence and Thirty-Seventh Conference on Innovative Applications of Artificial Intelligence and Fifteenth Symposium on Educational Advances in Artificial Intel...

  47. [47]

    Ann Yuan, Andy Coenen, Emily Reif, and Daphne Ippolito. 2022. Wordcraft: Story Writing With Large Language Models. In Proceedings of the 27th International Conference on Intelligent User Interfaces (Helsinki, Finland) (IUI ’22). Association for Computing Machinery, New York, NY, USA, 841–852. doi:10.1145/3490099.3511105

  48. [48]

    Yunfeng Zhang, Q. Vera Liao, and Rachel K. E. Bellamy. 2020. Effect of confidence and explanation on accuracy and trust calibration in AI-assisted decision making. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (FAT* ’20). ACM, 295–305. doi:10.1145/3351095.3372852

  49. [49]

    Qingjuan Zhao, Jianwei Niu, and Xuefeng Liu. 2022. ALS-MRS: Incorporating aspect-level sentiment for abstractive multi-review summarization. Knowledge-Based Systems 258 (2022), 109942. doi:10.1016/j.knosys.2022.109942

A Task Prompts and AI Suggestions

Task A - Analysis Prompt. Evaluate Bob’s decision to wait at the old restaurant site for twenty years. Judge...