{"total":19,"items":[{"citing_arxiv_id":"2606.22748","ref_index":112,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"AI Fiction in the Wild","primary_cat":"cs.CL","submitted_at":"2026-06-22T01:29:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Analysis of 500k ChatGPT logs shows over one-third of conversations generate fiction, dominated by power users with repetitive and niche patterns.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.29018","ref_index":4,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Adopt $\\neq$ Adapt: Longitudinal Analyses of LLM Conversations in the Wild","primary_cat":"cs.AI","submitted_at":"2026-05-27T19:17:25+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Longitudinal analysis of Bing Copilot users shows sticky individual LLM habits, activity-level differences in task complexity and success, and that WildChat is skewed toward power users.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.23262","ref_index":1,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Design and Report Benchmarks for Knowledge Work","primary_cat":"cs.AI","submitted_at":"2026-05-22T06:03:01+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Proposes a three-step benchmark design method (define work activity, specify tested setting, score work product) derived from work studies and O*NET, demonstrated via three case analyses.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"state what is represented by the task, which setting simplifications are made for evaluation, and which 
parts of the broader work remain outside the score. Section 2 explains why work activity, tested setting, and downstream use matter for benchmark interpretation. Section 3 develops this three-step design and reporting approach as a reporting structure: (1) identify the work activity, (2) specify the tested setting, and (3) score the proper work product. It also derives an O*NET-based inventory for identifying work activities [National Center for O*NET Development, 2026]. Section 4 demonstrates the approach through three benchmark case analyses: GDPVAL, OFFICEQA PRO, and APEX-SWE. Section 5 discusses limitations, alternative interpretations, and future directions."},{"citing_arxiv_id":"2605.23177","ref_index":78,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Cognitive offloading and the speedup illusion in human-AI interaction","primary_cat":"cs.CY","submitted_at":"2026-05-22T02:53:12+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Preregistered behavioral study identifies a speedup illusion where users overestimate time savings from AI assistance on cognitive tasks despite no actual difference in completion times.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.23159","ref_index":3,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Generative AI and the Reorganization of Labor Demand","primary_cat":"econ.GN","submitted_at":"2026-05-22T02:18:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Firms adjust to generative AI by reallocating hiring (52% of exposure decline) and redesigning tasks within jobs (39.5%), with senior roles shifting earlier via reallocation and junior roles using mixed 
channels.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.22687","ref_index":4,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"The efficiency-gain illusion: People underestimate the rate of AI use and overestimate its benefits on simple tasks","primary_cat":"cs.CY","submitted_at":"2026-05-21T16:28:20+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Three pre-registered studies with 2691 participants show people underestimate their AI usage rate and overestimate efficiency gains on simple tasks, with prior use entrenching further adoption.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.14090","ref_index":40,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Synthetic Sociality: How Generative Models Privatize the Social Fabric","primary_cat":"cs.CY","submitted_at":"2026-05-13T20:19:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Generative models privatize social relations by automating social capacities into synthetic forms owned by private companies.","context_count":1,"top_context_role":"other","top_context_polarity":"unclear","context_text":"https://doi.org/10.1093/acprof:oso/9780198508410.003.0005 DOI: 10.1093/acprof:oso/9780198508410.003.0005. [39] Jeffrey T Hancock, Mor Naaman, and Karen Levy. 2020. AI-Mediated Communication: Definition, Research Agenda, and Ethical Considerations. Journal of Computer-Mediated Communication 25, 1 (March 2020), 89-100. https://doi.org/10.1093/jcmc/zmz022 [40] Kunal Handa, Alex Tamkin, Miles McCain, Saffron Huang, Esin Durmus, Sarah Heck, Jared Mueller, Jerry Hong, Stuart Ritchie, Tim Belonax, Kevin K. Troy, Dario Amodei, Jared Kaplan, Jack Clark, and Deep Ganguli. 2025. 
Which Economic Tasks are Performed with AI? Evidence from Millions of Claude Conversations. arXiv:2503.04761 [cs.CY] https://arxiv.org/abs/2503."},{"citing_arxiv_id":"2605.23958","ref_index":6,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"AI in the Enterprise: How People Use M365 Copilot Chat","primary_cat":"cs.CY","submitted_at":"2026-05-11T23:13:26+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Large-scale classification of M365 Copilot Chat sessions shows writing dominates usage with a shift toward content creation over search, varying by occupation.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.05767","ref_index":35,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Priming, Path-dependence, and Plasticity: Understanding the molding of user-LLM interaction and its implications from (many) chat logs in the wild","primary_cat":"cs.HC","submitted_at":"2026-05-07T07:00:25+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Large-scale analysis of wild LLM chat logs finds that user interaction patterns stabilize quickly after initial use and correlate with long-term outcomes like retention, creating an agency paradox of limited exploration in unconstrained systems.","context_count":1,"top_context_role":"background","top_context_polarity":"unclear","context_text":"[34] Zeyu He, Saniya Naphade, and Ting-Hao Kenneth Huang. 2025. Prompting in the Dark: Assessing Human Performance in Prompt Engineering for Data Labeling When Gold Labels Are Absent. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (Tokyo, Japan) (CHI '25). Association for Computing Machinery, New York, NY, USA, Article 1195, 33 pages. doi:10.1145/3706598.3714319 [35] Grace Huckins. 2025. 
Why GPT-4O's sudden shutdown left people grieving. MIT Technology Review (Aug 2025). https://www.technologyreview.com/2025/08/15/1121900/gpt4o-grief-ai-companion/ [36] Tingting Jiang, Zhumo Sun, Shiting Fu, and Yan Lv. 2024. Human-AI interaction research agenda: A user-centered perspective. Data and Information Management 8, 4 (2024), 100078."},{"citing_arxiv_id":"2604.27231","ref_index":45,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Upskilling with Generative AI: Practices and Challenges for Freelance Knowledge Workers","primary_cat":"cs.HC","submitted_at":"2026-04-29T22:02:17+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Freelancers use generative AI to support exploratory skill acquisition but not as their main resource due to reliability issues, leading to a shift toward survival-oriented upskilling and the emergence of invisible competencies that lack market validation.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"rather than long-term professional growth. At the same time, freelancers did not treat generative AI as their primary or fully trusted learning source. A key limitation they felt with generative AI tools was the lack of contextual personalization. LLM-generated learning content often felt generic and did not reflect freelancers' specific, situated needs [45]. To fill this gap, freelancers turned to peer learning for more relevant, experience-based knowledge. However, peer learning was constrained by the competitive nature of freelance work, which made many freelancers hesitant to share what they know [6]. 
Within this landscape, freelancers increasingly described generative AI as a lower-risk option"},{"citing_arxiv_id":"2604.22503","ref_index":6,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Measuring and Mitigating Persona Distortions from AI Writing Assistance","primary_cat":"cs.CL","submitted_at":"2026-04-24T12:31:11+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.18849","ref_index":52,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"From Exposure to Adoption: Generative AI in European Workplaces","primary_cat":"econ.GN","submitted_at":"2026-04-20T21:14:17+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Generative AI adoption in Europe ranges from under 3% to 25%, is steeper for skilled workers in abstract-task jobs and in digitally advanced countries with training, shows a gender gap in exposed roles, and has produced no detectable shift in reported task content so far.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.15597","ref_index":25,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"LLMs Corrupt Your Documents When You Delegate","primary_cat":"cs.CL","submitted_at":"2026-04-17T00:33:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LLMs corrupt an average of 25% of document content during long delegated editing workflows across 52 domains, even frontier models, and agentic tools do not mitigate the 
issue.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.16283","ref_index":19,"ref_count":2,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Can the Recovery Mechanism Survive AI? Skill Formation, Labor, and What Current Measurement Misses","primary_cat":"cs.CY","submitted_at":"2026-04-12T05:42:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Generative AI may break the education-based recovery mechanism for technological displacement, as evidence shows performance gains without learning gains and current measurements miss the knowledge dimension of cognition.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.06906","ref_index":12,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"The AI Skills Shift: Mapping Skill Obsolescence, Emergence, and Transition Pathways in the LLM Era","primary_cat":"cs.CL","submitted_at":"2026-04-08T10:05:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Benchmarking four LLMs on O*NET skills yields SAFI scores showing mathematics and programming as most automatable while active listening and reading comprehension are least, with 78.7% of real AI interactions being augmentation rather than replacement.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2601.17617","ref_index":16,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Agentic Search in the Wild: Intents and Trajectory Dynamics from 14M+ Real Search Requests","primary_cat":"cs.IR","submitted_at":"2026-01-24T22:42:43+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Large-scale log study of 14M+ 
agentic searches finds short sessions, intent-specific repetition patterns, and that 54% of new query terms trace to prior retrieved evidence.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2509.10652","ref_index":39,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Vibe Coding in Product Teams: Reconfiguring AI-Assisted Workflows, Prototyping, and Collaboration","primary_cat":"cs.HC","submitted_at":"2025-09-12T19:28:38+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Interviews reveal a four-stage vibe coding workflow that accelerates prototyping while introducing tensions between quick efficiency and reflective design intention, plus asymmetries in trust and ownership.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2506.22440","ref_index":13,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"From Model Design to Organizational Design: Complexity Redistribution and Trade-Offs in Generative AI","primary_cat":"cs.CY","submitted_at":"2025-06-10T15:22:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"LLMs relocate rather than eliminate trade-offs among generality, accuracy, and simplicity, shifting complexity to infrastructure, compliance, and expertise and redefining competitive advantage around managing that shift.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2505.06120","ref_index":27,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"LLMs Get Lost In Multi-Turn 
Conversation","primary_cat":"cs.CL","submitted_at":"2025-05-09T15:21:44+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LLMs drop 39% in performance during multi-turn conversations due to premature assumptions and inability to recover from early errors.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"when they know what they need (i.e., they can fully specify their requirements in an instruction), but also when they don't. In such cases, users might start with an underspecified instruction and further clarify their needs through turn interactions. Though studies of LLM conversation logs have confirmed that underspecification in user instructions is prevalent [27], LLM systems are typically evaluated in single-turn, fully-specified settings. Even though a growing body of work proposes to evaluate LLMs in a multi-turn fashion, we identify in our review (Section 2) that most prior work treats the conversation as episodic: conversation turns might relate to each other, but the conversation can effectively be decomposed as an array of subtasks that can be evaluated in isolation."}],"limit":50,"offset":0}