Beyond the Single Turn: Reframing Refusals as Dynamic Experiences Embedded in the Context of Mental Health Support Interactions with LLMs

Alice Qian; Blake Bullwinkel; Esther Howe; Hoda Heidari; Hong Shen; Jina Suh; Ningjing Tang; Paola Pedrelli; Qiaosi Wang

arxiv: 2602.01694 · v3 · pith:D7HW75Y6new · submitted 2026-02-02 · 💻 cs.HC

Ningjing Tang, Alice Qian, Qiaosi Wang, Esther Howe, Blake Bullwinkel, Paola Pedrelli, Jina Suh, Hoda Heidari, Hong Shen

Pith reviewed 2026-05-25 06:46 UTC · model grok-4.3

classification 💻 cs.HC

keywords LLM refusals · mental health support · user experience · mixed methods · AI safety · refusal mechanisms · human-AI interaction

The pith

LLM refusals in mental health support form multi-phase experiences instead of isolated single-turn events.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper shows that when large language models refuse to engage with sensitive mental health topics, users experience this not as a single response but through five connected phases. These phases are pre-refusal expectation formation, the triggering and encounter with the refusal, how the refusal message is framed, the provision of resource referrals, and the post-refusal outcomes. A sympathetic reader would care because these refusals have been linked to real-world harms, and current design focuses only on whether the model correctly follows policy in one turn rather than the full user journey. The study uses surveys of 53 people and interviews with 16 to build a framework that treats refusals as embedded in ongoing support interactions.

Core claim

Through surveys (N=53) and in-depth interviews (N=16) with individuals using LLMs for mental health support and mental health professionals, we reveal that refusals are not isolated, single-turn system behaviors but rather constitute dynamic, multi-phase experiences: pre-refusal expectation formation, refusal triggering and encounter, refusal message framing, resource referral provision, and post-refusal outcomes. We contribute a multi-phase framework for evaluating refusals beyond binary policy compliance accuracy and design recommendations for future refusal mechanisms.

What carries the argument

The multi-phase framework that identifies five stages in the refusal experience to move evaluation from single-turn compliance to holistic user trajectories.

If this is right

Refusal design must account for pre-refusal expectations users bring to the interaction.
Message framing and resource referrals need to be coordinated to avoid negative post-refusal outcomes.
Evaluation metrics for LLM safeguards should incorporate the full sequence of phases rather than binary accuracy.
Future mechanisms can be improved by addressing how refusals affect users' continued support-seeking behavior.
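
One way to picture the move from binary accuracy to phase-level evaluation is a record that rates each of the five phases separately. The sketch below is illustrative only: the `PhaseScores` class, the 0-to-1 scale, and the helper methods are assumptions made for this example, not an instrument proposed in the paper.

```python
from dataclasses import dataclass, fields


@dataclass
class PhaseScores:
    """Hypothetical 0-1 ratings, one per phase named in the paper."""
    expectation_formation: float
    triggering_and_encounter: float
    message_framing: float
    resource_referral: float
    post_refusal_outcome: float

    def weakest_phase(self) -> str:
        # Name of the lowest-rated phase (the first one wins on ties).
        return min(fields(self), key=lambda f: getattr(self, f.name)).name

    def mean(self) -> float:
        # Unweighted average across the five phases.
        values = [getattr(self, f.name) for f in fields(self)]
        return sum(values) / len(values)


# Toy ratings: a refusal that triggers "correctly" can still rate
# poorly on framing, referral, and what happens afterwards.
example = PhaseScores(0.8, 1.0, 0.4, 0.3, 0.2)
print(example.weakest_phase())
print(round(example.mean(), 2))
```

A single pass/fail label would hide the difference between the second field and the last three; keeping the phases as separate fields makes that gap visible.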

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the five phases are consistent, then refusal systems could be prototyped and tested against each phase separately.
The framework could be applied to compare different LLM platforms' refusal behaviors in mental health contexts.
Extending the study to observe actual interactions rather than self-reports might validate or refine the phases.
Similar phase models might apply to refusals in other high-stakes domains like crisis support.

Load-bearing premise

The self-reported experiences from the 53 survey participants and 16 interviewees represent the main patterns in how people use LLMs for mental health support.

What would settle it

A study that collects refusal experiences from a larger and more varied group of users and checks whether the same five phases reliably describe their accounts.

Figures

Figures reproduced from arXiv: 2602.01694 by Alice Qian, Blake Bullwinkel, Esther Howe, Hoda Heidari, Hong Shen, Jina Suh, Ningjing Tang, Paola Pedrelli, Qiaosi Wang.

**Figure 1.** Our proposed multi-phase framework of understanding LLM refusals in mental health support interactions as dynamic experiences. This framing is the result of 53 surveys and 16 interviews with end-users and mental health professionals. The framework reveals how refusal experiences unfold in phases including expectation formation, intent recognition, refusal framing, resource provision, and post-refusal outco…

Original abstract

Content Warning: This paper contains participant quotes and discussions related to mental health challenges, emotional distress, and suicidal ideation. Large language models (LLMs) are increasingly used for mental health support, yet the model safeguards -- particularly refusals to engage with sensitive content -- remain poorly understood from the perspectives of users and mental health professionals (MHPs) and have been reported to cause real-world harms. This paper presents findings from a sequential mixed-methods study examining how LLM refusals are experienced and interpreted in mental health support interactions. Through surveys (N=53) and in-depth interviews (N=16) with individuals using LLMs for mental health support and MHPs, we reveal that refusals are not isolated, single-turn system behaviors but rather constitute dynamic, multi-phase experiences: pre-refusal expectation formation, refusal triggering and encounter, refusal message framing, resource referral provision, and post-refusal outcomes. We contribute a multi-phase framework for evaluating refusals beyond binary policy compliance accuracy and design recommendations for future refusal mechanisms. These findings suggest that understanding LLM refusals requires moving beyond single-turn interactions toward recognizing them as holistic experiences embedded within users' support-seeking trajectories and the broader LLM design pipeline.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper reframes LLM refusals in mental health support as a five-phase process drawn from user and clinician reports, which is a useful shift but rests on limited qualitative data.

The main point here is that refusals should be treated as extended experiences rather than isolated system outputs. The authors map five phases (expectation formation, triggering, message framing, referral, and post-refusal outcomes) from surveys with 53 participants and interviews with 16, drawing on both people who use LLMs for mental health support and mental health professionals. This moves the discussion past binary compliance checks toward something that matches how users actually encounter these systems during distress.

Referee Report

2 major / 1 minor

Summary. The paper claims that LLM refusals in mental health support interactions are not isolated single-turn system behaviors but instead constitute dynamic, multi-phase experiences. Drawing on a sequential mixed-methods study with surveys (N=53) and interviews (N=16) involving users and mental health professionals, it identifies five phases—pre-refusal expectation formation, refusal triggering and encounter, refusal message framing, resource referral provision, and post-refusal outcomes—and contributes a framework for evaluating refusals beyond binary policy compliance, along with design recommendations.

Significance. If the framework is robustly supported, the work is significant for HCI and AI ethics as it shifts focus from technical refusal accuracy to holistic, user-embedded experiences in sensitive mental health contexts. This could inform more context-aware safeguard designs and reduce reported harms from abrupt refusals. The mixed-methods contribution provides empirical user perspectives on an under-examined aspect of LLM deployment.

major comments (2)

[Abstract/Methods] Abstract and Methods: The claim that the data support the five-phase framework is presented without any description of the thematic analysis process, inter-rater reliability measures, coding procedures, or how saturation was determined. This absence directly affects the ability to evaluate whether the phases represent core dynamics or are shaped by the specific sample.
[Methods] Methods/Participant section: No details are given on sampling strategy, platform diversity (e.g., GPT-4 vs. Claude vs. open-source models), or stratification by distress severity. The central claim that the phases capture general refusal experiences therefore rests on an unexamined assumption that the N=53 + N=16 self-selected participants are representative across user groups and LLM interfaces.

minor comments (1)

[Abstract] Abstract: The content warning is appropriate; consider adding a brief note on the potential emotional impact for readers engaging with the participant quotes.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback, which highlights important areas for improving the transparency and rigor of our methods description. We agree that expanding these sections will strengthen the manuscript and will incorporate the suggested revisions.

Referee: [Abstract/Methods] Abstract and Methods: The claim that the data support the five-phase framework is presented without any description of the thematic analysis process, inter-rater reliability measures, coding procedures, or how saturation was determined. This absence directly affects the ability to evaluate whether the phases represent core dynamics or are shaped by the specific sample.

Authors: We agree that the manuscript would benefit from a more explicit account of the qualitative analysis. In the revised version, we will add a dedicated 'Data Analysis' subsection under Methods that details the thematic analysis process (inductive open coding of interview transcripts and open-ended survey responses, followed by iterative grouping into phases), inter-rater reliability (two coders independently coded 20% of the data with Cohen's kappa reported), coding procedures (codebook development and refinement), and saturation determination (no new themes after the 12th interview). This will clarify how the five-phase framework emerged from the data. revision: yes
Referee: [Methods] Methods/Participant section: No details are given on sampling strategy, platform diversity (e.g., GPT-4 vs. Claude vs. open-source models), or stratification by distress severity. The central claim that the phases capture general refusal experiences therefore rests on an unexamined assumption that the N=53 + N=16 self-selected participants are representative across user groups and LLM interfaces.

Authors: We acknowledge this gap in reporting. We will expand the Participants subsection to describe the convenience sampling strategy (recruitment via Reddit communities, mental health forums, and professional networks for MHPs), self-reported LLM platform usage (noting diversity across GPT-4, Claude, and other models), and data collected on distress levels (without formal stratification due to the exploratory nature and ethical constraints on screening). We will also revise the framing throughout to present the framework as derived from this convenience sample and explicitly discuss limitations on generalizability, rather than implying broad representativeness. revision: yes
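
The first response above promises to report Cohen's kappa for two coders but states no value. As a reminder of what that statistic measures, here is a minimal sketch of the standard formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is agreement expected by chance. The function name and the label lists are made up for illustration; they are not the study's codebook or data.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(coder_a)
    # Observed agreement: share of items both coders labelled identically.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Toy data: two coders agree on five of six excerpts.
a = ["framing", "referral", "framing", "outcome", "referral", "framing"]
b = ["framing", "referral", "outcome", "outcome", "referral", "framing"]
kappa = cohens_kappa(a, b)
print(round(kappa, 2))
```

With these toy labels, raw agreement is 5/6 and chance agreement is 1/3, so kappa comes out lower than the raw agreement rate. That gap is why the statistic is usually reported instead of simple percent agreement.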

Circularity Check

0 steps flagged

No circularity; framework constructed directly from participant data

The paper derives its five-phase framework exclusively through thematic analysis of new survey (N=53) and interview (N=16) data. No equations, fitted parameters, predictions, or self-citations appear in the derivation. The phases are presented as emergent from the collected responses rather than defined in terms of themselves or reduced to prior author work. This matches the default case of a self-contained empirical study with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a qualitative empirical study; the framework is induced from interview and survey responses. No free parameters, mathematical axioms, or invented entities are introduced.

pith-pipeline@v0.9.0 · 5779 in / 1157 out tokens · 30848 ms · 2026-05-25T06:46:29.100545+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Engagement-Optimized Care: When LLMs become Mental Health Infrastructure
cs.CY 2026-05 unverdicted novelty 7.0

A longitudinal qualitative study of 18 US users finds that LLMs deliver socioemotional support but also foster dependency, one-sided validation, and privacy risks because their designs prioritize engagement over well-...

Running header of the paper: Beyond the Single Turn, FAccT ’26, June 25–28, 2026, Montreal, QC, Canada

work page 2024
[78]

Dong Whi Yoo, Jiayue Melissa Shi, Violeta J Rodriguez, and Koustuv Saha. 2025. AI chatbots for mental health: Values and harms from lived experiences of depression.arXiv [cs.HC](April 2025)

work page 2025
[79]

Meg Young, Lassana Magassa, and Batya Friedman. 2019. Toward inclusive tech policy design: a method for underrepresented voices to strengthen tech policy documents.Ethics and Information Technology21, 2 (2019), 89–103

work page 2019
[80]

Yuan Yuan, Tina Sriskandarajah, Anna-Luisa Brakman, Alec Helyar, Alex Beutel, Andrea Vallone, and Saachi Jain. 2025. From hard refusals to safe-completions: Toward output-centric safety training.arXiv [cs.CY](Aug. 2025)

work page 2025
[81]

Xi Zheng, Zhuoyang Li, Xinning Gui, and Yuhan Luo. 2025. Customizing emotional support: How do individuals construct and interact with LLM-powered chatbots. InProceedings of the 2025 CHI Conference on Human Factors in Computing Systems. ACM, New York, NY, USA, 1–20

work page 2025


[1] Leah Hope Ajmani, Arka Ghosh, Benjamin Kaveladze, Eugenia Kim, Keertana Namuduri, Theresa Nguyen, Ebele Okoli, Jessica Schleider, Denae Ford, and Jina Suh. 2025. Seeking late night life lines: Experiences of conversational AI use in mental health crisis. arXiv [cs.HC] (Dec. 2025)

[2] Andrew Selsky, Associated Press and Leah Willingham, Associated Press. 2022. How some encounters between police and people with mental illness can turn tragic. https://www.pbs.org/newshour/health/how-some-encounters-between-police-and-people-with-mental-illness-can-turn-tragic. Accessed: 2026-1-8

[3] Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, Carol Chen, Catherine Olsson, Christopher Olah, Danny Hernandez, Dawn Drain, Deep Ganguli, Dustin Li, Eli Tran-Johnson, Ethan Perez, Jamie Kerr, Jared Mueller, Jeffrey Ladish, Joshua Landau, Kamal Ndousse, K...

[4] Faeze Brahman, Sachin Kumar, Vidhisha Balachandran, Pradeep Dasigi, Valentina Pyatkin, Abhilasha Ravichander, Sarah Wiegreffe, Nouha Dziri, Khyathi Chandu, Jack Hessel, Yulia Tsvetkov, Noah A Smith, Yejin Choi, and Hannaneh Hajishirzi. 2024. The art of saying no: Contextual noncompliance in language models. arXiv [cs.CL] (July 2024)

[5] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qual. Res. Psychol. 3, 2 (Jan. 2006), 77–101

[6] Virginia Braun and Victoria Clarke. 2012. Thematic analysis. American Psychological Association

[7] Virginia Braun and Victoria Clarke. 2019. Reflecting on reflexive thematic analysis. Qualitative research in sport, exercise and health 11, 4 (2019), 589–597

[8] California State Legislature. 2025. Senate Bill 243: Companion Chatbots. Approved by Governor October 13, 2025. https://leginfo.legislature.ca.gov/faces/billNavClient.xhtml?bill_id=202520260SB243 Chapter 677, Statutes of 2025

[9] Mohit Chandra, Suchismita Naik, Denae Ford, Ebele Okoli, Munmun De Choudhury, Mahsa Ershadi, Gonzalo Ramos, Javier Hernandez, Ananya Bhattacharjee, Shahed Warreth, et al. 2025. From Lived Experience to Insight: Unpacking the Psychological Risks of Using AI Conversational Agents. In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Tra...

[10] Mohit Chandra, Suchismita Naik, Denae Ford, Ebele Okoli, Munmun De Choudhury, Mahsa Ershadi, Gonzalo Ramos, Javier Hernandez, Ananya Bhattacharjee, Shahed Warreth, and Jina Suh. 2024. From lived experience to insight: Unpacking the psychological risks of using AI conversational agents. arXiv [cs.HC] (Dec. 2024)

[11] Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J Pappas, Florian Tramer, Hamed Hassani, and Eric Wong. 2024. JailbreakBench: An open robustness benchmark for jailbreaking large language models. arXiv [cs.CR] (March 2024)

[12] Khaoula Chehbouni, Mohammed Haddou, Jackie Chi Kit Cheung, and Golnoosh Farnadi. 2025. Neither valid nor reliable? Investigating the use of LLMs as judges. arXiv [cs.CL] (Aug. 2025)

[13] Adam Dahlgren Lindström, Leila Methnani, Lea Krause, Petter Ericson, Íñigo Martínez de Rituerto de Troya, Dimitri Coelho Mollo, and Roel Dobbe. 2025. Helpful, harmless, honest? Sociotechnical limits of AI alignment and safety through Reinforcement Learning from Human Feedback. Ethics Inf. Technol. 27, 2 (June 2025), 28

[14] Munmun De Choudhury and Sushovan De. 2014. Mental health discourse on reddit: Self-disclosure, social support, and anonymity. Proceedings of the International AAAI Conference on Web and Social Media 8, 1 (May 2014), 71–80

[15] Mary J De Silva, Erica Breuer, Lucy Lee, Laura Asher, Neerja Chowdhary, Crick Lund, and Vikram Patel. 2014. Theory of Change: a theory-driven approach to enhance the Medical Research Council’s framework for complex interventions. Trials 15, 1 (July 2014), 267

[16] John Draper and Richard T McKeon. 2024. The journey toward 988: A historical perspective on crisis hotlines in the United States. Psychiatr. Clin. North Am. 47, 3 (Sept. 2024), 473–490

[17] John Draper, Gillian Murphy, Eduardo Vega, David W Covington, and Richard McKeon. 2015. Helping callers to the National Suicide Prevention Lifeline who are at imminent risk of suicide: the importance of active engagement, active rescue, and collaboration between crisis and emergency services. Suicide Life Threat. Behav. 45, 3 (June 2015), 261–270

[18] Foundational Contributors, Ahmed El-Kishky, Daniel Selsam, Francis Song, Giambattista Parascandolo, Hongyu Ren, Hunter Lightman, Hyung Won, Ilge Akkaya, I Sutskever, Jason Wei, Jonathan Gordon, K Cobbe, Kevin Yu, Lukasz Kondraciuk, Max Schwarzer, Mostafa Rohaninejad, Noam Brown, Shengjia Zhao, Trapit Bansal, Vineet Kosaraju, Wenda Zhou Leadership, J Pacho...

[19] Lan Gao, Oscar Chen, Rachel Lee, Nick Feamster, Chenhao Tan, and Marshini Chetty. 2025. “I cannot write this because it violates our content policy”: Understanding content moderation policies and user experiences in generative AI products. arXiv [cs.HC] (June 2025)

[20] Clifford Geertz. 1973. The impact of the concept of culture on the concept of man

[21] Su Golder, Shahd Ahmed, Gill Norman, and Andrew Booth. 2017. Attitudes toward the ethics of research using social media: A systematic review. J. Med. Internet Res. 19, 6 (June 2017), e195

[23] Melody Y Guan, Manas Joglekar, Eric Wallace, Saachi Jain, Boaz Barak, Alec Helyar, Rachel Dias, Andrea Vallone, Hongyu Ren, Jason Wei, Hyung Won Chung, Sam Toyer, Johannes Heidecke, Alex Beutel, and Amelia Glaese. 2024. Deliberative Alignment: Reasoning enables safer language models. arXiv [cs.CL] (Dec. 2024)

[24] Seungju Han, Kavel Rao, Allyson Ettinger, Liwei Jiang, Bill Yuchen Lin, Nathan Lambert, Yejin Choi, and Nouha Dziri. 2024. WildGuard: Open one-stop moderation tools for safety risks, jailbreaks, and refusals of LLMs. arXiv [cs.CL] (June 2024)

[25] Md Romael Haque and Sabirat Rubya. 2022. “for an app supposed to make its users feel better, it sure is a joke” - an analysis of user reviews of mobile mental health applications. Proc. ACM Hum. Comput. Interact. 6, CSCW2 (Nov. 2022), 1–29

[26] Christina Harrington, Sheena Erete, and Anne Marie Piper. 2019. Deconstructing community-based collaborative design: Towards more equitable participatory design engagements. Proceedings of the ACM on Human-Computer Interaction 3, CSCW (2019), 1–25

[27] Kashmir Hill. 2025. A Teen Was Suicidal. ChatGPT Was the Friend He Confided In. The New York Times (Aug. 2025)

[28] Lujain Ibrahim, Saffron Huang, Lama Ahmad, Umang Bhatt, and Markus Anderljung. 2025. Towards interactive evaluations for interaction harms in human-AI systems. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, Vol. 8. 1302–1310

[29] Zainab Iftikhar, Amy Xiao, Sean Ransom, Jeff Huang, and Harini Suresh. 2025. How LLM counselors violate ethical standards in mental health practice: A practitioner-informed framework. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society 8, 2 (Oct. 2025), 1311–1323

[30] Illinois General Assembly. 2025. House Bill 1806: Wellness and Oversight for Psychological Resources Act. Signed into law August 1, 2025. https://ilga.gov/Legislation/BillStatus?DocNum=1806&GAID=18&DocTypeID=HB&LegId=159219&SessionID=114 Public Act 104-0054

[32] Nataliya V Ivankova, John W Creswell, and Sheldon L Stick. 2006. Using mixed-methods sequential explanatory design: From theory to practice. Field Methods 18, 1 (Feb. 2006), 3–20

[33] Nicholas Jenkins, Michael Bloor, Jan Fischer, Lee Berney, and Joanne Neale. 2010. Putting it in context: the use of vignettes in qualitative interviewing. Qual. Res. 10, 2 (April 2010), 175–198

[34] Kelly Joyce, Laurel Smith-Doerr, Sharla Alegria, Susan Bell, Taylor Cruz, Steve G Hoffman, Safiya Umoja Noble, and Benjamin Shestakofsky. 2021. Toward a sociology of artificial intelligence: A call for research on inequalities and structural change. Socius 7 (Jan. 2021), 237802312199958

[35] Kyuha Jung, Gyuho Lee, Yuanhui Huang, and Yunan Chen. 2025. “I’ve talked to ChatGPT about my issues last night.”: Examining Mental Health Conversations with Large Language Models through Reddit Analysis. arXiv [cs.HC] (April 2025)

[36] Reishiro Kawakami and Sukrit Venkatagiri. 2024. The impact of generative AI on artists. In Creativity and Cognition. ACM, New York, NY, USA, 79–82

[37] Hannah Rose Kirk, Iason Gabriel, Chris Summerfield, Bertie Vidgen, and Scott A Hale. 2025. Why human–AI relationships need socioaffective alignment. Humanit. Soc. Sci. Commun. 12, 1 (May 2025), 728

[38] Theodora Koulouri, Robert D Macredie, and David Olakitan. 2022. Chatbots to support young adults’ mental health: An exploratory study of acceptability. ACM Trans. Interact. Intell. Syst. 12, 2 (June 2022), 1–39

[39] Seth Lazar and Alondra Nelson. 2023. AI safety on whose terms? Science 381, 6654 (July 2023), 138

[40] Zhuoyang Li, Zihao Zhu, Xinning Gui, and Yuhan Luo. 2025. “This is human intelligence debugging artificial intelligence”: Examining how people prompt GPT in seeking mental health support. Int. J. Hum. Comput. Stud. 103555 (June 2025), 103555

[41] Michael Madaio, Lisa Egede, Hariharan Subramonyam, Jennifer Wortman Vaughan, and Hanna Wallach. 2022. Assessing the fairness of AI systems: AI practitioners’ processes, challenges, and needs for support. Proc. ACM Hum. Comput. Interact. 6, CSCW1 (March 2022), 1–26

[42] Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, David Forsyth, and Dan Hendrycks. 2024. HarmBench: A standardized evaluation framework for automated red teaming and robust refusal. arXiv [cs.LG] (Feb. 2024)

Beyond the Single Turn FAccT ’26, June 25–28, 2026, Montreal, QC, Canada

[43] Miles McCain, Ryn Linthicum, Chloe Lubinski, Alex Tamkin, Saffron Huang, Michael Stern, Kunal Handa, Esin Durmus, Tyler Neylon, Stuart Ritchie, Kamya Jagadish, Paruul Maheshwary, Sarah Heck, Alexandra Sanderford, and Deep Ganguli. 2025. How People Use Claude for Support, Advice, and Companionship. https://www.anthropic.com/news/how-people-use-claude-for-s...

[44] Ashlee Milton, Leah Ajmani, Michael Ann DeVito, and Stevie Chancellor. 2023. “I see me here”: Mental health content, community, and algorithmic curation on TikTok. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, Vol. 16. ACM, New York, NY, USA, 1–17

[45] Jared Moore, Declan Grabb, William Agnew, Kevin Klyman, Stevie Chancellor, Desmond C Ong, and Nick Haber. 2025. Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers. In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency. ACM, New York, NY, USA, 599–627

[46] Ramaravind Kommiya Mothilal, Shion Guha, and Syed Ishtiaque Ahmed. 2024. Towards a non-ideal methodological framework for Responsible ML. arXiv [cs.HC] (Jan. 2024)

[47] Nadia Nahar, Chenyang Yang, Yanxin Chen, Wesley Hanwen Deng, Ken Holstein, Motahhare Eslami, and Christian Kästner. 2026. “I don’t think RAI applies to my model” – engaging non-champions with sticky stories for responsible AI work. In Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems. ACM, New York, NY, USA, 1–23

[48] Alondra Nelson. 2023. Thick Alignment

[49] New York State Assembly. 2025. An Act to amend the general business law, in relation to artificial intelligence companion models. Assembly Bill A6767, 2025–2026 Regular Sessions. https://www.nysenate.gov/legislation/bills/2025/A6767 Introduced by M. of A. Vanel; referred to the Committee on Consumer Affairs and Protection

[50] OpenAI. 2025. Strengthening ChatGPT’s responses in sensitive conversations. https://openai.com/index/strengthening-chatgpt-responses-in-sensitive-conversations/. Accessed: 2025-11-24

[51] OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner...

[52] Lawrence A Palinkas, Sarah M Horwitz, Carla A Green, Jennifer P Wisdom, Naihua Duan, and Kimberly Hoagwood. 2015. Purposeful sampling for qualitative data collection and analysis in mixed method implementation research. Adm. Policy Ment. Health 42, 5 (Sept. 2015), 533–544

[53] Sachin R Pendse, Amit Sharma, Aditya Vashistha, Munmun De Choudhury, and Neha Kumar. 2021. “can I not be suicidal on a Sunday?”: Understanding technology-mediated pathways to mental health support. Proc. SIGCHI Conf. Hum. Factor. Comput. Syst. 2021 (May 2021)

[54] Mehrdad Rahsepar Meadi, Tomas Sillekens, Suzanne Metselaar, Anton van Balkom, Justin Bernstein, and Neeltje Batelaan. 2025. Exploring the ethical challenges of conversational AI in mental health care: Scoping review. JMIR Ment. Health 12, 1 (Feb. 2025), e60432

[55] Richard Ren, Steven Basart, Adam Khoja, Alice Gatti, Long Phan, Xuwang Yin, Mantas Mazeika, Alexander Pan, Gabriel Mukobi, Ryan H Kim, Stephen Fitz, and Dan Hendrycks. 2024. Safetywashing: Do AI safety benchmarks actually measure safety progress? arXiv [cs.LG] (July 2024)

[56] Rhode Island General Assembly. 2026. An Act Relating to Commercial Law—General Regulatory Provisions—Artificial Intelligence Companion Models. Senate Bill S2195, January Session, A.D. 2026. https://webserver.rilegislature.gov/BillText/BillText26/SenateText26/S2195.pdf Introduced by Senators Urso, Gu, DiPalma, Paolino, Zurier, Murray, and Appollonio; refe...

[57] Anastasia Schaadhardt, Yue Fu, Cory Gennari Pratt, and Wanda Pratt. 2023. “Laughing so I don’t cry”: How TikTok users employ humor and compassion to connect around psychiatric hospitalization. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, Vol. 14. ACM, New York, NY, USA, 1–13

[58] Andrew D Selbst, Danah Boyd, Sorelle A Friedler, Suresh Venkatasubramanian, and Janet Vertesi. 2019. Fairness and Abstraction in Sociotechnical Systems. In Proceedings of the Conference on Fairness, Accountability, and Transparency. ACM, New York, NY, USA, 59–68

[59] Itai Shapira, Gerdus Benade, and Ariel D Procaccia. 2026. How RLHF Amplifies Sycophancy. arXiv [cs.AI] (Feb. 2026)

[60] Renee Shelby, Shalaleh Rismani, Kathryn Henne, Ajung Moon, Negar Rostamzadeh, Paul Nicholas, N’mah Yilla-Akbari, Jess Gallegos, Andrew Smart, Emilio Garcia, and Gurleen Virk. 2023. Sociotechnical harms of algorithmic systems: Scoping a taxonomy for harm reduction. In Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, Vol. 24. ACM, New ...

[61] Brett Sholtis. 2020. During A Mental Health Crisis, A Family’s Call To 911 Turns Tragic. NPR (Oct. 2020)

[62] Steven Siddals, John Torous, and Astrid Coxon. 2024. “It happened to be the perfect thing”: experiences of generative AI chatbots for mental health. Npj Ment Health Res 3, 1 (Oct. 2024), 48

[63] Petr Slovak and Sean A Munson. 2024. HCI contributions in mental health: A modular framework to guide psychosocial intervention design. Proc. SIGCHI Conf. Hum. Factor. Comput. Syst. 2024 (May 2024)

[64] Inhwa Song, Sachin R Pendse, Neha Kumar, and Munmun De Choudhury. 2024. The typing cure: Experiences with Large Language Model chatbots for mental health support. arXiv [cs.HC] (Jan. 2024)

[65] Alexandra Souly, Qingyuan Lu, Dillon Bowen, Tu Trinh, Elvis Hsieh, Sana Pandey, Pieter Abbeel, Justin Svegliato, Scott Emmons, Olivia Watkins, and Sam Toyer. 2024. A StrongREJECT for Empty Jailbreaks. arXiv:2402.10260 [cs.LG] https://arxiv.org/abs/2402.10260

[66] Ningjing Tang, Megan Li, Amy Winecoff, Michael Madaio, Hoda Heidari, and Hong Shen. 2026. Navigating uncertainties: How GenAI developers document their models on open-source platforms. In Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems. ACM, New York, NY, USA, 1–19

[67] Tangila Islam Tanni, Mamtaj Akter, Joshua Anderson, Mary Jean Amon, and Pamela J Wisniewski. 2024. Examining the unique online risk experiences and mental health outcomes of LGBTQ+ versus heterosexual youth. In Proceedings of the CHI Conference on Human Factors in Computing Systems, Vol. 31. ACM, New York, NY, USA, 1–21

[68] Tamar Tavory. 2024. Regulating AI in mental health: Ethics of care perspective. JMIR Ment. Health 11 (Sept. 2024), e58493

[69] Bertie Vidgen, Nino Scherrer, Hannah Rose Kirk, Rebecca Qian, Anand Kannappan, Scott A Hale, and Paul Röttger. 2023. SimpleSafetyTests: A test suite for identifying critical safety risks in large language models. arXiv [cs.CL] (Nov. 2023)

[70] Hanna Wallach, Meera Desai, A Feder Cooper, Angelina Wang, Chad Atalla, Solon Barocas, Su Lin Blodgett, Alexandra Chouldechova, Emily Corvi, P Alex Dow, Jean Garcia-Gathright, Alexandra Olteanu, Nicholas Pangakis, Stefanie Reed, Emily Sheng, Dan Vann, Jennifer Wortman Vaughan, Matthew Vogel, Hannah Washington, and Abigail Z Jacobs. 2025. Position: Evaluat...

[71] Yuxia Wang, Haonan Li, Xudong Han, Preslav Nakov, and Timothy Baldwin. 2024. Do-Not-Answer: Evaluating Safeguards in LLMs. In Findings of the Association for Computational Linguistics: EACL 2024. 896–911

[72] Laura Weidinger, Inioluwa Deborah Raji, Hanna Wallach, Margaret Mitchell, Angelina Wang, Olawale Salaudeen, Rishi Bommasani, Deep Ganguli, Sanmi Koyejo, and William Isaac. 2025. Toward an evaluation science for generative AI systems. arXiv [cs.AI] (March 2025)

[73] Laura Weidinger, Maribeth Rauh, Nahema Marchal, Arianna Manzini, Lisa Anne Hendricks, Juan Mateos-Garcia, Stevie Bergman, Jackie Kay, Conor Griffin, Ben Bariach, Iason Gabriel, Verena Rieser, and William Isaac. 2023. Sociotechnical Safety Evaluation of Generative AI Systems. arXiv [cs.AI] (Oct. 2023)

[74] Bingbing Wen, Jihan Yao, Shangbin Feng, Chenjun Xu, Yulia Tsvetkov, Bill Howe, and Lucy Lu Wang. 2025. Know your limits: A survey of abstention in large language models. Trans. Assoc. Comput. Linguist. 13 (June 2025), 529–556

[75] Joel Wester, Tim Schrills, Henning Pohl, and Niels van Berkel. 2024. “As an AI language model, I cannot”: Investigating LLM Denials of User Requests. In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI ’24, Article 979). Association for Computing Machinery, New York, NY, USA, 1–14

[76] Richmond Y Wong. 2021. Tactics of soft resistance in user experience professionals’ values work. Proc. ACM Hum. Comput. Interact. 5, CSCW2 (Oct. 2021), 1–28

[77] Tinghao Xie, Xiangyu Qi, Yi Zeng, Yangsibo Huang, Udari Madhushani Sehwag, Kaixuan Huang, Luxi He, Boyi Wei, Dacheng Li, Ying Sheng, Ruoxi Jia, Bo Li, Kai Li, Danqi Chen, Peter Henderson, and Prateek Mittal. 2024. SORRY-bench: Systematically evaluating large language model safety refusal. arXiv [cs.AI] (June 2024)

[78] Dong Whi Yoo, Jiayue Melissa Shi, Violeta J Rodriguez, and Koustuv Saha. 2025. AI chatbots for mental health: Values and harms from lived experiences of depression. arXiv [cs.HC] (April 2025)

[79] Meg Young, Lassana Magassa, and Batya Friedman. 2019. Toward inclusive tech policy design: a method for underrepresented voices to strengthen tech policy documents. Ethics and Information Technology 21, 2 (2019), 89–103

[80] Yuan Yuan, Tina Sriskandarajah, Anna-Luisa Brakman, Alec Helyar, Alex Beutel, Andrea Vallone, and Saachi Jain. 2025. From hard refusals to safe-completions: Toward output-centric safety training. arXiv [cs.CY] (Aug. 2025)

[81] Xi Zheng, Zhuoyang Li, Xinning Gui, and Yuhan Luo. 2025. Customizing emotional support: How do individuals construct and interact with LLM-powered chatbots. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. ACM, New York, NY, USA, 1–20