
arxiv: 2604.21430 · v1 · submitted 2026-04-23 · 💻 cs.AI

Brief chatbot interactions produce lasting changes in human moral values

Pith reviewed 2026-05-09 21:57 UTC · model grok-4.3

classification 💻 cs.AI
keywords AI chatbots · moral judgments · persuasion · human-AI interaction · lasting effects · ethical scenarios · value change

The pith

Brief conversations with directive AI chatbots can produce lasting shifts in moral judgments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The study tested whether a few minutes of conversation with an AI chatbot could change how people evaluate moral scenarios. Participants rated ethical dilemmas, then discussed some of them with a chatbot prompted to push judgments toward stricter or more lenient standards and others with a neutral control agent. The directive conversations produced clear directional shifts in ratings, and these shifts grew stronger when participants returned two weeks later. Control conversations left ratings unchanged, and participants did not detect any persuasive intent.

Core claim

Brief conversations with a directive AI agent produced significant directional shifts in moral judgments, toward stricter standards as well as toward greater leniency, and the effect grew stronger at a two-week follow-up. The control condition produced no changes, the effects did not extend to punishment recommendations, and participants remained unaware of the persuasive intent.

What carries the argument

A within-subject naturalistic design comparing pre- and post-discussion ratings of moral scenarios after interactions with a directive-prompted chatbot versus a neutral control agent.

If this is right

  • Moral judgments can be altered by brief AI interactions without users realizing the intent.
  • The influence on moral standards persists and strengthens over at least two weeks.
  • Shifts occur in both stricter and more lenient directions but leave punishment ratings unaffected.
  • Foundational moral values appear vulnerable to undetected manipulation through everyday AI conversations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Everyday apps that include chat features on ethical topics could produce cumulative shifts in public moral standards over time.
  • Disclosure rules for AI systems that discuss morality may become necessary if the effect replicates across platforms.
  • Longer-term studies could test whether repeated short interactions amplify or stabilize these changes.

Load-bearing premise

Observed changes in moral scenario ratings are caused by the chatbot's directive prompting rather than by the act of discussing the scenarios or other factors in the study.

What would settle it

A replication study in which participants discuss the same moral scenarios with a chatbot that makes no directional prompts, to check whether significant rating shifts still occur.

Figures

Figures reproduced from arXiv: 2604.21430 by Benjamin Becker, Christian Montag, Kim Mai Tich Nguyen Thordsen, Qianer Zhong, Yue Teng.

Figure 1. Experimental design of naturalistic paradigm. RESULTS: Fifty-three individuals participated in our study, 30 females and 23 males (M_age = 22.83, SD = 4.13), with 47 (26 females and 21 males; M_age = 22.57, SD = 3.78) providing follow-up data. The interactive moral chatbot's system logic was set via prompt engineering and workflow (a predefined sequence of steps that an AI agent follows to complete a task aut…

Figure 2. (a) The response logic of the moral chatbot in four moral vignettes. (b) The persuasion index between immoral condition and control condition in immediate post rating and follow-up rating. (c) The trends of immoral ratings across three time points in three conditions: Strict change condition; Lenient change condition; Control condition. (d) The trends of immoral ratings across three time points in two cond…

Original abstract

Moral judgements form the foundation of human social behavior and societal systems. While Artificial Intelligence chatbots increasingly serve as personal advisors, their influence on moral judgments remains largely unexplored. Here, we examined whether directive AI conversations shift moral evaluations using a within-subject naturalistic paradigm. Fifty-three participants rated moral scenarios, then discussed four with a chatbot prompted to shift moral judgments and four with a control agent. The brief conversations induced significant directional shifts in moral judgments, accepting stricter standards as well as advocating greater leniency (ps < 0.05; Cohen's d = 0.735-1.576), with increasing strengths of this effect during a two-week follow-up (Cohen's d = 1.038-2.069). Critically, the control condition produced no changes, and the effects did not extend to punishment while participants remained unaware of the persuasive intent, and both agents were rated equally likable and convincing, suggesting a vulnerability to undetected and lasting manipulation of foundational moral values.
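
For readers unfamiliar with the effect-size metric quoted in the abstract, the following is a minimal sketch of how a paired (pre vs. post) Cohen's d could be computed. It assumes the d_z variant (mean of the per-participant differences divided by their standard deviation); the paper does not state which variant it uses, and the function name and ratings below are hypothetical illustrations, not study data.

```python
from statistics import mean, stdev


def cohens_dz(pre, post):
    """Paired-samples effect size: mean(post - pre) / sd(post - pre).

    Assumption: the d_z variant; other within-subject variants exist.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must have one rating per participant")
    diffs = [after - before for before, after in zip(pre, post)]
    spread = stdev(diffs)  # sample SD (n - 1 denominator)
    if spread == 0:
        raise ValueError("all differences are identical; d_z is undefined")
    return mean(diffs) / spread


# Hypothetical ratings for four participants, not data from the study.
pre_ratings = [1, 2, 3, 4]
post_ratings = [2, 4, 3, 5]
print(round(cohens_dz(pre_ratings, post_ratings), 3))
```

A positive value means ratings rose after the conversation. Whether that corresponds to a stricter or a more lenient judgment depends on the direction of the rating scale, which is not specified here.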

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that brief, directive interactions with an AI chatbot can produce significant and lasting changes in participants' moral judgments on various scenarios. In a within-subjects study with 53 participants, conversations with a prompted chatbot led to shifts toward stricter or more lenient moral standards (p < 0.05, Cohen's d ranging from 0.735 to 1.576), with effects strengthening over a two-week period (d = 1.038 to 2.069). No such changes occurred with a control agent, participants did not detect the persuasive intent, and the effects were specific to moral judgments rather than punishment recommendations.

Significance. If substantiated, these results would be highly significant, as they suggest that AI systems can manipulate foundational human moral values in subtle, persistent ways without awareness or resistance. This has broad implications for the deployment of chatbots in personal, educational, or advisory contexts, potentially necessitating new safeguards against unintended or malicious influence on ethics and decision-making.

major comments (2)
  1. [Methods] The manuscript provides insufficient detail on the exact directive prompts used to shift moral judgments and the selection of moral scenarios, which is load-bearing for determining whether the observed changes are specifically attributable to the AI's persuasive strategy rather than nonspecific effects of discussing the topics.
  2. [Results] While large effect sizes are reported for the follow-up assessment, the number of participants who completed the two-week follow-up is not specified, nor are any analyses addressing potential attrition bias, which could undermine the claim of increasing effect strength over time.
minor comments (2)
  1. [Abstract] Consider including the sample size and a brief note on the within-subjects design in the abstract to provide immediate context for the reported effect sizes.
  2. [Discussion] The paper could benefit from a more explicit comparison to prior work on moral persuasion or chatbot influence to better situate the novelty of the findings.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and insightful comments on our manuscript. These points have helped us identify areas where additional detail will improve transparency and strengthen the interpretation of our findings. We address each major comment below.

Point-by-point responses
  1. Referee: [Methods] The manuscript provides insufficient detail on the exact directive prompts used to shift moral judgments and the selection of moral scenarios, which is load-bearing for determining whether the observed changes are specifically attributable to the AI's persuasive strategy rather than nonspecific effects of discussing the topics.

    Authors: We agree that greater detail on the prompts and scenario selection is warranted to allow readers to evaluate the specificity of the effects. In the revised manuscript, we will include the complete verbatim prompts provided to the directive chatbot (designed to advocate either stricter or more lenient standards while maintaining a neutral, non-confrontational tone) as well as the control agent's neutral discussion instructions. We will also describe the source and selection process for the eight moral scenarios, which were drawn from established moral psychology stimulus sets and pre-tested for clarity and relevance. Critically, the within-subjects design and the null results in the control condition (identical scenarios discussed without directive prompts) provide evidence that the observed shifts are not attributable to nonspecific discussion effects; we will highlight this comparison more explicitly in the revision. revision: yes

  2. Referee: [Results] While large effect sizes are reported for the follow-up assessment, the number of participants who completed the two-week follow-up is not specified, nor are any analyses addressing potential attrition bias, which could undermine the claim of increasing effect strength over time.

    Authors: We thank the referee for noting this omission. Of the original 53 participants, 47 completed the two-week follow-up. In the revised manuscript we will report this number explicitly and include new analyses addressing attrition: (1) comparisons of baseline moral judgment scores, demographics, and initial condition assignment between completers and non-completers (all p > .25, no significant differences); (2) effect-size calculations using both completer-only data and last-observation-carried-forward imputation; and (3) a sensitivity analysis confirming that the increase in effect magnitude from immediate post-test to follow-up remains statistically significant under both approaches. These additions will directly address concerns about attrition bias while preserving the core claim. revision: yes
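
The two attrition checks named in this response, completer-only analysis and last-observation-carried-forward (LOCF) imputation, can be sketched as follows. This is an illustrative sketch only: the helper names, the use of `None` for a missing follow-up rating, the d_z effect-size variant, and the five-participant data are all assumptions, not the authors' analysis code or data.

```python
from statistics import mean, stdev


def paired_d(first, second):
    """Paired effect size (assumed d_z variant): mean difference / SD of differences."""
    diffs = [b - a for a, b in zip(first, second)]
    return mean(diffs) / stdev(diffs)


def completers_only(baseline, followup):
    """Drop participants whose follow-up rating is missing (None)."""
    kept = [(b, f) for b, f in zip(baseline, followup) if f is not None]
    return [b for b, _ in kept], [f for _, f in kept]


def locf(post, followup):
    """Fill each missing follow-up rating with that participant's immediate post rating."""
    return [p if f is None else f for p, f in zip(post, followup)]


# Hypothetical ratings; None marks a participant who did not return.
baseline = [3, 4, 2, 5, 3]
post = [4, 6, 3, 5, 5]
followup = [5, 6, None, 6, None]

base_c, follow_c = completers_only(baseline, followup)
d_completers = paired_d(base_c, follow_c)
d_locf = paired_d(baseline, locf(post, followup))
print(f"completer-only d = {d_completers:.3f}, LOCF d = {d_locf:.3f}")
```

If the two estimates diverge sharply, the follow-up effect is sensitive to who dropped out; close agreement is reassuring but does not by itself rule out attrition bias.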

Circularity Check

0 steps flagged

No circularity: empirical experiment with measured data and controls

full rationale

This is a within-subjects psychological experiment reporting statistical results from participant ratings before/after chatbot interactions. No equations, derivations, fitted parameters, or self-referential claims exist. All findings rest on observed data with explicit control conditions (no-change in control agent, equal likability ratings, lack of awareness of intent) that address alternative explanations. No load-bearing step reduces to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard assumptions from experimental psychology about the validity of self-reported ratings and the ability of within-subject designs to control individual differences, with no free parameters or invented entities.

axioms (2)
  • domain assumption: Self-reported ratings on moral scenarios accurately reflect participants' internal moral values.
    The study treats changes in these ratings as evidence of shifts in moral judgments.
  • domain assumption: The within-subject design isolates the effect of the chatbot condition from individual baseline differences.
    Each participant experiences both persuasive and control conditions.

pith-pipeline@v0.9.0 · 5476 in / 1458 out tokens · 73452 ms · 2026-05-09T21:57:01.140492+00:00 · methodology

