The Quiet Path from Seemingly Minor Design Errors to Workplace AI Incidents
Pith reviewed 2026-05-21 03:55 UTC · model grok-4.3
The pith
Analysis of 1,524 workplace AI incident reports finds that 83% stem from misalignments where workers need precise and personal systems but receive basic ones optimized for speed.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We analyzed 1,524 reports of incidents in which AI systems were used to perform 171 occupational tasks across 12 industry sectors. Using an LLM-as-an-expert approach, we extracted the main traits of the AI systems involved in those incidents using an established framework of twelve traits. We then compared them with the traits that 202 workers highly familiar with those tasks would have preferred. We found that as many as 83% of workplace incidents stem from worker-AI misalignments. In most cases, workers wanted systems that are precise, insightful, or personal, but instead received systems that are basic, simple, or general. We also compared the traits causing the incidents with the traits that 197 developers building AI systems for those tasks would have preferred.
What carries the argument
Comparison of twelve AI system traits extracted from incident reports against the traits preferred by workers and by developers.
If this is right
- Workplace AI incidents are likely to persist without design corrections that better align systems with workers' needs.
- The mismatch causes an invisible erosion of worker agency.
- Organizational productivity declines as a result of these repeated incidents.
- Developers' overfocus on efficiency and speed accounts for 74% of the misalignments, especially in people-facing occupations such as human resources.
Where Pith is reading between the lines
- Organizations could reduce incidents by involving workers earlier in specifying the desired traits of AI tools for their tasks.
- The same mismatch pattern may appear in consumer or public-sector AI uses and could be studied with similar incident-report methods.
- Monitoring shifts in dominant AI traits over time, such as the rise of imaginative systems after generative AI arrived, offers a way to anticipate new incident categories.
Load-bearing premise
The LLM extraction of the twelve traits from incident reports accurately identifies the design features that caused the incidents without bias, and the surveyed workers' stated preferences are representative of what would have prevented those specific incidents.
What would settle it
Redesign several AI systems for the same tasks to match the worker-preferred traits identified in the study, then measure whether the rate of reported incidents drops compared with control systems left unchanged.
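As a rough illustration of how such a comparison might be scored, the sketch below applies a standard two-proportion z-test to incident counts from redesigned and control systems. The function name `two_proportion_z`, the unit of observation (deployments), and all counts are hypothetical and chosen only for illustration; the paper does not specify a test, and a real study would also need to address reporting bias and non-random assignment.

```python
import math

def two_proportion_z(incidents_a, n_a, incidents_b, n_b):
    """z statistic and two-sided p-value for a difference in incident proportions."""
    pooled = (incidents_a + incidents_b) / (n_a + n_b)
    if pooled in (0.0, 1.0):
        raise ValueError("test is undefined when no unit or every unit has an incident")
    # Standard error under the null hypothesis that both groups share one rate.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (incidents_a / n_a - incidents_b / n_b) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 12 of 400 redesigned deployments reported an incident,
# against 30 of 400 control deployments.
z, p_value = two_proportion_z(12, 400, 30, 400)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A negative z here means the redesigned group had the lower incident rate; whether that difference can be credited to the redesign depends on how comparable the two groups are.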
Original abstract
Recent human-computer interaction (HCI) research has revealed a widespread misalignment between how developers design workplace artificial intelligence (AI) systems and what workers actually need from them. Yet, little research has examined the effects of this gap, or how it may cause harm. We analyzed 1,524 reports of incidents in which AI systems were used to perform 171 occupational tasks across 12 industry sectors. Using a Large Language Model (LLM)-as-an-expert approach, we extracted the main traits of the AI systems involved in those incidents using an established framework of twelve traits. We then compared them with the traits that 202 workers highly familiar with those tasks would have preferred. We found that as many as 83% of workplace incidents stem from worker-AI misalignments. In most cases, workers wanted systems that are precise, insightful, or personal, but instead received systems that are basic, simple, or general. Over the years, fast AI caused a considerable number of incidents, yet these declined, and imaginative AI, with the mass introduction of generative AI, started to cause incidents. We also compared the traits causing the incidents with the traits that 197 developers building AI systems for those tasks would have preferred. If the traits causing the incidents were the same as those designed by developers, then developers may be responsible for those incidents. We found that 74% of task misalignments could be attributed to developers who tended to overfocus on efficiency and speed, especially for systems performing tasks in people-facing occupations such as those in the human resources sector. Our results call for design interventions that better align AI development with workers' needs, as without such corrections, workplace AI incidents are likely to persist, causing the invisible erosion of worker agency and organizational productivity.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper analyzes 1,524 reports of workplace AI incidents across 171 occupational tasks in 12 sectors. Using an LLM-as-an-expert method to extract twelve traits from the incident descriptions, it compares these extracted traits against preferences stated by 202 workers and 197 developers. The central claims are that up to 83% of incidents arise from worker-AI misalignments (workers preferring precise/insightful/personal traits but receiving basic/simple/general ones) and that 74% of task misalignments can be attributed to developers who over-prioritize efficiency and speed.
Significance. If the core percentages hold after validation, the work supplies a large-scale empirical mapping from specific design-trait gaps to documented incidents, strengthening the case for worker-centered AI design interventions in HCI. The scale of the incident corpus and the direct comparison to both worker and developer preferences are notable strengths that could inform practical guidelines.
major comments (2)
- [Methods] Methods (LLM-as-an-expert trait extraction): The description of the LLM prompt and trait-labeling procedure supplies no human validation, inter-rater reliability statistics, prompt-sensitivity tests, or ablation on how ambiguous reports were resolved. Because the 83% misalignment figure and the subsequent 74% developer-attribution claim rest entirely on these labels, any systematic bias in the extraction directly undermines both headline statistics.
- [Results] Results (83% and 74% claims): No error bars, exclusion criteria, or robustness checks are reported for the misalignment percentages or the developer-attribution step. Without these, it is impossible to determine whether the reported proportions are stable under reasonable variations in trait definitions or coding decisions.
minor comments (2)
- [Abstract] Abstract: The statement that 'fast AI caused a considerable number of incidents, yet these declined, and imaginative AI... started to cause incidents' would be clearer with explicit time periods or supporting counts from the dataset.
- [Throughout] Notation: The twelve-trait framework is referenced repeatedly; a brief table or appendix listing the exact trait definitions and their opposites would improve readability for readers unfamiliar with the source framework.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. The comments highlight important opportunities to strengthen the methodological transparency and statistical robustness of our findings. We address each major comment below and commit to revisions that directly respond to the concerns raised.
- Referee: [Methods] Methods (LLM-as-an-expert trait extraction): The description of the LLM prompt and trait-labeling procedure supplies no human validation, inter-rater reliability statistics, prompt-sensitivity tests, or ablation on how ambiguous reports were resolved. Because the 83% misalignment figure and the subsequent 74% developer-attribution claim rest entirely on these labels, any systematic bias in the extraction directly undermines both headline statistics.
  Authors: We agree that the current description of the LLM-as-an-expert procedure lacks explicit validation steps. In the revised manuscript we will add a dedicated validation subsection that reports: (1) a human validation study on a random sample of 150 incident reports independently coded by two domain experts, with inter-rater reliability measured by Cohen’s kappa; (2) prompt-sensitivity experiments in which we re-extract traits using three alternative prompt phrasings and quantify label stability; and (3) a clear protocol for resolving ambiguous reports, including the use of multiple LLM runs followed by majority vote. These additions will allow readers to assess potential systematic bias in the trait labels that underpin the 83% and 74% figures.
  Revision: yes
- Referee: [Results] Results (83% and 74% claims): No error bars, exclusion criteria, or robustness checks are reported for the misalignment percentages or the developer-attribution step. Without these, it is impossible to determine whether the reported proportions are stable under reasonable variations in trait definitions or coding decisions.
  Authors: We acknowledge that the reported percentages currently lack accompanying uncertainty estimates and sensitivity analyses. In the revision we will: (1) compute and report 95% bootstrap confidence intervals for both the 83% misalignment rate and the 74% developer-attribution rate; (2) explicitly state the exclusion criteria applied to the 1,524 reports (e.g., insufficient textual detail for reliable trait extraction); and (3) present robustness checks that recompute the key percentages after perturbing trait boundary definitions and after excluding the most ambiguous 10% of reports. These additions will demonstrate the stability of the headline statistics under reasonable variations in coding decisions.
  Revision: yes
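Two of the validation steps named in the rebuttal, Cohen's kappa between two human coders and a majority vote over repeated LLM runs, can be sketched in a few lines. This is a minimal illustration rather than the authors' procedure: the function names, the tie-handling rule (returning `None` to flag a report for human review), and the six example labels are all assumptions made here.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_chance = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)

def majority_vote(run_labels):
    """Most frequent label across runs, or None when the top two labels tie."""
    counts = Counter(run_labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # assumption: ties are escalated to a human coder
    return counts[0][0]

# Hypothetical trait labels from two coders on six incident reports.
coder_1 = ["fast", "basic", "basic", "general", "fast", "simple"]
coder_2 = ["fast", "basic", "simple", "general", "fast", "simple"]
kappa = cohens_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.3f}")
print(majority_vote(["fast", "fast", "basic"]))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when a few traits dominate the labels.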
Circularity Check
No circularity: empirical comparison of LLM-extracted traits against independent surveys
The paper conducts an empirical analysis of 1,524 incident reports by applying an LLM to extract traits from an established twelve-trait framework, then directly compares those extracted traits to separate survey responses from 202 workers and 197 developers. No equations, fitted parameters, or derivations are described that reduce to the inputs by construction. The 83% and 74% figures arise from counting mismatches between the two independently collected datasets rather than from any self-definitional loop or renamed fit. Self-citations, if present, are not load-bearing for the central claims, which rest on the external survey data and incident corpus.
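The mismatch counting described above, together with the kind of uncertainty estimate the referee requests, might look like the following sketch. It assumes a deliberately simplified matching rule (an incident counts as misaligned when its extracted trait differs from the single trait workers prefer for that task); the paper's actual rule over the twelve-trait framework is not given in this summary, and the task names, traits, and function names below are hypothetical.

```python
import random

def misalignment_flags(incidents, preferred):
    """One boolean per incident: True when the extracted trait differs from the preferred one."""
    return [trait != preferred[task] for task, trait in incidents]

def bootstrap_ci(flags, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the share of True flags."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(flags)
    # Resample incidents with replacement and recompute the rate each time.
    rates = sorted(sum(rng.choices(flags, k=n)) / n for _ in range(n_resamples))
    lower = rates[int((alpha / 2) * n_resamples)]
    upper = rates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical worker preferences per task and LLM-extracted traits per incident.
preferred = {"screen resumes": "personal", "draft reports": "precise"}
incidents = [
    ("screen resumes", "general"),
    ("screen resumes", "personal"),
    ("draft reports", "basic"),
    ("draft reports", "simple"),
    ("draft reports", "precise"),
]
flags = misalignment_flags(incidents, preferred)
rate = sum(flags) / len(flags)
lower, upper = bootstrap_ci(flags)
print(f"misalignment rate = {rate:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

Because the survey and the incident corpus are collected separately, the rate is a plain count of disagreements between them. The interval only captures sampling variability, not errors in the trait labels themselves.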
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The established framework of twelve traits accurately and exhaustively captures the design characteristics relevant to workplace AI incidents.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem.
  Passage: "We found that as many as 83% of workplace incidents stem from worker-AI misalignments... 74% of task misalignments could be attributed to developers who tended to overfocus on efficiency and speed."
- IndisputableMonolith/Foundation/AbsoluteFloorClosure · absolute_floor_iff_bare_distinguishability · unclear
  Relation between the paper passage and the cited Recognition theorem.
  Passage: "Using a Large Language Model (LLM)-as-an-expert approach, we extracted the main traits of the AI systems involved in those incidents using an established framework of twelve traits."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.