
arxiv: 2606.26641 · v1 · pith:ZGRAC2EDnew · submitted 2026-06-25 · 💻 cs.HC

Invisible Impact of Empathy on Behavioral Change: Isolating the Effect of Empathy in Long-term Physical Activity Coaching Chatbot Interactions

Pith reviewed 2026-06-26 03:39 UTC · model grok-4.3

classification 💻 cs.HC
keywords empathy · chatbot · physical activity · behavior change · coaching · user study · intention

The pith

Higher-empathy chatbots were associated with larger step-count increases and faster intention gains, even though users struggled to tell the empathy levels apart.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether empathy in AI coaching chatbots actually drives behavior change by building three versions that differ only in empathy and comparing them in a six-week within-subject study with 13 participants. Users struggled to tell the empathy levels apart and often rated the non-empathetic chatbot as more engaging and useful. Yet the higher-empathy versions were associated with larger average increases in step counts and faster improvement in intention to follow the advice. The authors read this as empathy working subtly through motivation and trust rather than through explicit recognition.

Core claim

By isolating empathy as the sole variable in three WhatsApp-based physical activity coaching chatbots and tracking objective step counts alongside self-reported intention over six weeks in a within-subject design, the study finds that higher empathy correlates with greater step increases and faster intention gains, even though participants struggled to distinguish the conditions and rated the non-empathetic version higher on engagement and usefulness. The pattern is interpreted as empathy influencing the peripheral route of persuasion.

What carries the argument

Three chatbots differing only in empathy level, compared in a within-subject six-week study measuring step counts and intention.

If this is right

  • Empathy may support sustained behavior change without users explicitly recognizing it.
  • PA coaching chatbot designs should incorporate empathy to enhance motivation and trust.
  • The effect aligns with the peripheral route in the Elaboration Likelihood Model.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Larger studies could test whether the effect holds across diverse user groups and longer time frames.
  • Similar empathy variations could be examined in chatbots targeting other behaviors such as diet tracking.
  • Designers might include empathy even when explicit user ratings favor less empathetic interfaces.

Load-bearing premise

The three chatbots differed only in empathy level with all other conversational elements held constant, and the within-subject design with 13 participants isolates the effect of empathy on objective step counts and self-reported intention.
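One way to picture this premise: every condition's LLM context shares a single coaching core and differs only in an empathy instruction. The sketch below is illustrative only. The prompt strings, the `build_context` helper, and the dictionary keys are hypothetical and are not taken from the paper; only the three condition names (Non-Empathetic, Standard, Empathetic) come from the figure captions.

```python
# Hypothetical sketch: three contexts that share a coaching core and differ
# only in the empathy instruction. All strings are placeholders, not the
# paper's actual prompts.
CORE_COACHING = "You are a physical activity coach. Follow the session agenda."

EMPATHY_INSTRUCTIONS = {
    "non_empathetic": "Keep responses factual and do not express empathy.",
    "standard": "",  # assumption: no empathy-specific instruction is added
    "empathetic": "Acknowledge the user's feelings before giving advice.",
}


def build_context(condition: str) -> str:
    """Return the context for one condition: shared core plus empathy text."""
    extra = EMPATHY_INSTRUCTIONS[condition]
    return CORE_COACHING if not extra else f"{CORE_COACHING}\n{extra}"


def shares_core(contexts: dict[str, str]) -> bool:
    """Check that every context starts with the identical shared core."""
    return all(text.startswith(CORE_COACHING) for text in contexts.values())


contexts = {name: build_context(name) for name in EMPATHY_INSTRUCTIONS}
print(shares_core(contexts))
```

A check like `shares_core` only verifies the inputs. As the referee report notes, identical instructions outside the empathy text do not guarantee that the generated conversations differ in empathy alone.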

What would settle it

A replication showing no difference in step count increases or intention improvements across empathy conditions would falsify the central claim.
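Such a replication would come down to comparing per-participant outcome changes between conditions. As a minimal sketch, assuming paired per-participant step-count changes are available, a sign-flip permutation test is one option. The function name and every number below are synthetic placeholders, not study data, and this is not necessarily the analysis the paper used.

```python
import random
from statistics import mean


def paired_sign_flip_test(a, b, n_perm=10_000, seed=0):
    """Two-sided sign-flip permutation test on paired differences a[i] - b[i].

    Returns the observed mean difference and an approximate p-value.
    """
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [x - y for x, y in zip(a, b)]
    observed = mean(diffs)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        # Under the null hypothesis, each participant's difference is
        # equally likely to carry either sign.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)


# Synthetic placeholder values (NOT from the paper): change in step counts
# per participant under two hypothetical conditions.
empathetic_change = [400, 250, -100, 600, 300, 150]
non_empathetic_change = [100, 200, -150, 200, 50, 100]

diff, p_value = paired_sign_flip_test(empathetic_change, non_empathetic_change)
print(f"mean difference = {diff:.1f}, approximate p = {p_value:.3f}")
```

With samples as small as N = 13, a test like this has limited power, so a null result from an equally small replication would be weak evidence against the claim.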

Figures

Figures reproduced from arXiv: 2606.26641 by Kai-Hui Liang, Li Siyan, Masahiro Yoshida, Shiyoh Goetsu, Shopnil Shahriar, Tsunayuki Ohwa, Wei-Wei Du, Xuhai Xu, Yilin Ye, Zhou Yu.

Figure 1: The overall workflow of the Empathetic version of our Physical Activity (PA) coaching chatbot.
Figure 2: Comparison of what the context fed into the LLM would look like between the three versions of the chatbot in the maintenance …
Figure 3: The general processes for the maintenance session. The personalized components are highlighted in green.
Figure 4: Example maintenance session conversations from the same user with Empathetic (left), Standard (middle), and Non-Empathetic …
Figure 5: The overall workflow of the Clinical Empathy Module to populate the language model context with the appropriate empathy …
Figure 6: Preferences for a PA coaching chatbot reported in the pre-survey by 106 responders. The preference rating for empathy levels …
Figure 7: Average LLM-annotated scores for each EPITOME dimension for every empathetic condition (NE = Non-Empathetic, S = …
Figure 8: Column-normalized percentage heatmap linking self-reported check-in form responses from our study with the LLM-annotated EPITOME scores.
… integration exhibits overall higher empathy compared to directly leveraging the general empathy from LLMs, according to the EPITOME framework. We are further interested in investigating how perceived empathy correlates with our self-report measures on the check-in form. A…
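For readers unfamiliar with the term in the Figure 8 caption, "column-normalized percentage" means each column of a count table is rescaled so that it sums to 100. A minimal sketch with made-up counts follows; the function name, the table layout, and the values are illustrative and do not reproduce the figure's data.

```python
def column_normalize(counts):
    """Convert a 2-D count table to percentages so each column sums to 100.

    Columns whose total is zero are left as all zeros.
    """
    n_rows = len(counts)
    n_cols = len(counts[0]) if counts else 0
    col_totals = [sum(counts[r][c] for r in range(n_rows)) for c in range(n_cols)]
    return [
        [
            100.0 * counts[r][c] / col_totals[c] if col_totals[c] else 0.0
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


# Placeholder counts: rows could be self-report levels and columns annotated
# empathy scores (an assumed layout, not the paper's data).
table = [
    [2, 1, 0],
    [3, 3, 0],
    [5, 4, 0],
]
for row in column_normalize(table):
    print([round(v, 1) for v in row])
```

Normalizing by column makes cells comparable within a column even when the columns contain very different numbers of observations.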
Original abstract

Current dialogue systems, powered by large language models, often treat empathy as essential without assessing its true impact, especially in behavior change, where motivation and adherence often depend on subtle user-chatbot dynamics. We examine this assumption by building three WhatsApp physical-activity (PA) coaching chatbots that differ only in empathy level and evaluating them in a six-week within-subject study (N = 13). Participants struggled to distinguish between the empathy conditions, and the non-empathetic version was often rated as more engaging and useful. However, higher-empathy variants were still associated with a larger overall average increase in step counts and faster improvement in intention to follow advice. These results suggest empathy's role is nuanced: it may be hard for lay users to identify explicitly, but it can still shape motivation and trust that support sustained change. We interpret this pattern through the Elaboration Likelihood Model's peripheral route. We highlight design implications for building next-generation PA coaching chatbots that balance effectiveness with human-like connection.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript reports on a within-subject experiment (N=13) comparing three physical activity coaching chatbots on WhatsApp that vary in empathy. Participants could not distinguish the empathy levels, and the non-empathetic chatbot was rated higher in engagement and usefulness. Nevertheless, higher empathy was associated with greater average increases in daily step counts and faster improvements in intention to follow advice. The authors interpret this through the peripheral route of the Elaboration Likelihood Model and discuss design implications.

Significance. Should the central associations prove robust, the work contributes to understanding the subtle role of empathy in LLM-powered dialogue systems for behavior change. It challenges the assumption that empathy must be explicitly detectable to be effective and provides empirical data on objective behavioral outcomes alongside self-reports. The within-subject design and the use of a real-world messaging app are strengths, though the small sample limits generalizability.

major comments (2)
  1. [Abstract] Abstract: The premise that the chatbots 'differ only in empathy level' is undermined by the reported result that the non-empathetic version was rated as more engaging and useful. This indicates that other aspects of the conversational style may have differed, making it difficult to isolate empathy as the causal factor for the observed differences in step counts and intention.
  2. [Methods and Results] Methods and Results: The study uses N=13 in a six-week within-subject crossover design, yet the abstract provides no information on the statistical tests used, effect sizes, how empathy was specifically operationalized in the prompts, procedures for blinding or counterbalancing, or handling of order effects and carryover. These details are load-bearing for evaluating whether the design isolates the effect of empathy.
minor comments (1)
  1. [Abstract] Abstract: The abstract mentions 'associations' but could benefit from specifying the direction and magnitude of effects more precisely.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which help clarify the scope and limitations of our work. We address each major point below and outline targeted revisions to improve transparency without overstating the isolation of empathy.

Point-by-point responses
  1. Referee: [Abstract] Abstract: The premise that the chatbots 'differ only in empathy level' is undermined by the reported result that the non-empathetic version was rated as more engaging and useful. This indicates that other aspects of the conversational style may have differed, making it difficult to isolate empathy as the causal factor for the observed differences in step counts and intention.

    Authors: We acknowledge the referee's concern. The chatbots were constructed by varying only the empathy-related instructions in the system prompts while holding the core coaching content, structure, and response guidelines constant. Nevertheless, the observed differences in engagement and usefulness ratings indicate that empathy may have influenced these perceptions or that subtle, unintended variations in conversational tone emerged. In the revised manuscript we will (a) qualify the abstract claim to 'designed to differ primarily in empathy level' and (b) expand the discussion to address the possibility that empathy operates through or alongside other stylistic factors. We retain the interpretation that the objective step-count and intention trajectories still provide evidence of an empathy-related effect via the peripheral route, but we will present this more cautiously as an association rather than a fully isolated causal claim. revision: partial

  2. Referee: [Methods and Results] Methods and Results: The study uses N=13 in a six-week within-subject crossover design, yet the abstract provides no information on the statistical tests used, effect sizes, how empathy was specifically operationalized in the prompts, procedures for blinding or counterbalancing, or handling of order effects and carryover. These details are load-bearing for evaluating whether the design isolates the effect of empathy.

    Authors: We agree that the abstract must supply these methodological anchors. In the revision we will add a concise methods summary to the abstract covering: (1) the operationalization of empathy via targeted prompt modifications, (2) the within-subject crossover with counterbalancing and washout periods, (3) blinding procedures, (4) the statistical approach (repeated-measures models with appropriate handling of order and carryover), and (5) key effect-size metrics. Full procedural details already appear in the Methods section; the abstract will now reference them explicitly so readers can assess the design's ability to isolate empathy. revision: yes
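The counterbalancing mentioned in the second response can be illustrated with a simple rotation over condition orders. This is a sketch under assumptions: the page does not state how condition orders were actually assigned, and `assign_orders` and the participant IDs are hypothetical.

```python
from itertools import permutations

CONDITIONS = ("Non-Empathetic", "Standard", "Empathetic")


def assign_orders(participant_ids):
    """Cycle through all 6 orderings of the three conditions.

    With 13 participants the orders cannot be perfectly balanced; under this
    scheme one ordering is used one extra time.
    """
    orders = list(permutations(CONDITIONS))
    return {pid: orders[i % len(orders)] for i, pid in enumerate(participant_ids)}


schedule = assign_orders([f"P{n:02d}" for n in range(1, 14)])
for pid, order in schedule.items():
    print(pid, "->", " / ".join(order))
```

Because 13 is not a multiple of 6, some residual imbalance in condition order is unavoidable, which is one reason the referee's request for explicit handling of order and carryover effects matters.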

Circularity Check

0 steps flagged

Empirical user study with no derivation chain or fitted predictions

Full rationale

This is a purely empirical within-subject user study (N=13) reporting measured step counts, intention scores, and subjective ratings across three chatbot conditions. No equations, parameters, derivations, or first-principles predictions appear anywhere in the manuscript. All load-bearing claims rest on direct experimental outcomes rather than any reduction to self-citations, ansatzes, or fitted inputs renamed as predictions. The study is therefore self-contained against external benchmarks with zero circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

As an empirical HCI study the central claim rests on the validity of the experimental manipulation and outcome measures rather than mathematical axioms or free parameters. Key unstated elements include the exact prompt engineering used to create empathy differences and the reliability of step-count tracking.

pith-pipeline@v0.9.1-grok · 5747 in / 1133 out tokens · 56981 ms · 2026-06-26T03:39:15.838115+00:00 · methodology

