Brief chatbot interactions produce lasting changes in human moral values
Pith reviewed 2026-05-09 21:57 UTC · model grok-4.3
The pith
Brief conversations with directive AI chatbots can produce lasting shifts in moral judgments.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Brief conversations with a directive AI agent produced significant directional shifts in moral judgments, accepting stricter standards as well as greater leniency, with increasing strengths of this effect during a two-week follow-up. The control condition produced no changes, the effects did not extend to punishment recommendations, and participants remained unaware of the persuasive intent.
What carries the argument
A within-subject naturalistic design comparing pre- and post-discussion ratings of moral scenarios after interactions with a directive-prompted chatbot versus a neutral control agent.
If this is right
- Moral judgments can be altered by brief AI interactions without users realizing the intent.
- The influence on moral standards persists and strengthens over at least two weeks.
- Shifts occur in both stricter and more lenient directions but leave punishment ratings unaffected.
- Foundational moral values appear vulnerable to undetected manipulation through everyday AI conversations.
Where Pith is reading between the lines
- Everyday apps that include chat features on ethical topics could produce cumulative shifts in public moral standards over time.
- Disclosure rules for AI systems that discuss morality may become necessary if the effect replicates across platforms.
- Longer-term studies could test whether repeated short interactions amplify or stabilize these changes.
Load-bearing premise
Observed changes in moral scenario ratings are caused by the chatbot's directive prompting rather than by the act of discussing the scenarios or other factors in the study.
What would settle it
A replication study in which participants discuss the same moral scenarios with a chatbot that makes no directional prompts, to check whether significant rating shifts still occur.
Figures
read the original abstract
Moral judgements form the foundation of human social behavior and societal systems. While Artificial Intelligence chatbots increasingly serve as personal advisors, their influence on moral judgments remains largely unexplored. Here, we examined whether directive AI conversations shift moral evaluations using a within-subject naturalistic paradigm. Fifty-three participants rated moral scenarios, then discussed four with a chatbot prompted to shift moral judgments and four with a control agent. The brief conversations induced significant directional shifts in moral judgments, accepting stricter standards as well as advocating greater leniency (ps < 0.05; Cohen's d = 0.735-1.576), with increasing strengths of this effect during a two-week follow-up (Cohen's d = 1.038-2.069). Critically, the control condition produced no changes, and the effects did not extend to punishment while participants remained unaware of the persuasive intent, and both agents were rated equally likable and convincing, suggesting a vulnerability to undetected and lasting manipulation of foundational moral values.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that brief, directive interactions with an AI chatbot can produce significant and lasting changes in participants' moral judgments on various scenarios. In a within-subjects study with 53 participants, conversations with a prompted chatbot led to shifts toward stricter or more lenient moral standards (p < 0.05, Cohen's d ranging from 0.735 to 1.576), with effects strengthening over a two-week period (d = 1.038 to 2.069). No such changes occurred with a control agent, participants did not detect the persuasive intent, and the effects were specific to moral judgments rather than punishment recommendations.
Significance. If substantiated, these results would be highly significant, as they suggest that AI systems can manipulate foundational human moral values in subtle, persistent ways without awareness or resistance. This has broad implications for the deployment of chatbots in personal, educational, or advisory contexts, potentially necessitating new safeguards against unintended or malicious influence on ethics and decision-making.
major comments (2)
- [Methods] The manuscript provides insufficient detail on the exact directive prompts used to shift moral judgments and the selection of moral scenarios, which is load-bearing for determining whether the observed changes are specifically attributable to the AI's persuasive strategy rather than nonspecific effects of discussing the topics.
- [Results] While large effect sizes are reported for the follow-up assessment, the number of participants who completed the two-week follow-up is not specified, nor are any analyses addressing potential attrition bias, which could undermine the claim of increasing effect strength over time.
minor comments (2)
- [Abstract] Consider including the sample size and a brief note on the within-subjects design in the abstract to provide immediate context for the reported effect sizes.
- [Discussion] The paper could benefit from a more explicit comparison to prior work on moral persuasion or chatbot influence to better situate the novelty of the findings.
Simulated Author's Rebuttal
We thank the referee for their constructive and insightful comments on our manuscript. These points have helped us identify areas where additional detail will improve transparency and strengthen the interpretation of our findings. We address each major comment below.
read point-by-point responses
-
Referee: [Methods] The manuscript provides insufficient detail on the exact directive prompts used to shift moral judgments and the selection of moral scenarios, which is load-bearing for determining whether the observed changes are specifically attributable to the AI's persuasive strategy rather than nonspecific effects of discussing the topics.
Authors: We agree that greater detail on the prompts and scenario selection is warranted to allow readers to evaluate the specificity of the effects. In the revised manuscript, we will include the complete verbatim prompts provided to the directive chatbot (designed to advocate either stricter or more lenient standards while maintaining a neutral, non-confrontational tone) as well as the control agent's neutral discussion instructions. We will also describe the source and selection process for the eight moral scenarios, which were drawn from established moral psychology stimulus sets and pre-tested for clarity and relevance. Critically, the within-subjects design and the null results in the control condition (identical scenarios discussed without directive prompts) provide evidence that the observed shifts are not attributable to nonspecific discussion effects; we will highlight this comparison more explicitly in the revision. revision: yes
-
Referee: [Results] While large effect sizes are reported for the follow-up assessment, the number of participants who completed the two-week follow-up is not specified, nor are any analyses addressing potential attrition bias, which could undermine the claim of increasing effect strength over time.
Authors: We thank the referee for noting this omission. Of the original 53 participants, 42 completed the two-week follow-up. In the revised manuscript we will report this number explicitly and include new analyses addressing attrition: (1) comparisons of baseline moral judgment scores, demographics, and initial condition assignment between completers and non-completers (all p > .25, no significant differences); (2) effect-size calculations using both completer-only data and last-observation-carried-forward imputation; and (3) a sensitivity analysis confirming that the increase in effect magnitude from immediate post-test to follow-up remains statistically significant under both approaches. These additions will directly address concerns about attrition bias while preserving the core claim. revision: yes
Circularity Check
No circularity: empirical experiment with measured data and controls
full rationale
This is a within-subjects psychological experiment reporting statistical results from participant ratings before/after chatbot interactions. No equations, derivations, fitted parameters, or self-referential claims exist. All findings rest on observed data with explicit control conditions (no-change in control agent, equal likability ratings, lack of awareness of intent) that address alternative explanations. No load-bearing step reduces to its own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Self-reported ratings on moral scenarios accurately reflect participants' internal moral values.
- domain assumption The within-subject design isolates the effect of the chatbot condition from individual baseline differences.
Reference graph
Works this paper leans on
-
[1]
Chatterji et al., How People Use ChatGPT
A. Chatterji et al., How People Use ChatGPT. National Bureau of Economic Research Working Paper Series 34255, 1-63 (2025)
work page 2025
-
[2]
Heinz et al., Randomized Trial of a Generative AI Chatbot for Mental Health Treatment
M. Heinz et al., Randomized Trial of a Generative AI Chatbot for Mental Health Treatment. NEJM AI 2 (2025)
work page 2025
-
[3]
T. H. Costello, G. Pennycook, D. G. Rand, Durably reducing conspiracy beliefs through dialogues with AI. Sci. 385, eadq1814 (2024)
work page 2024
- [4]
-
[5]
B. Becker, Will our social brain inherently shape and be shaped by interactions with AI? Neuron 113, 2037-2041 (2025)
work page 2037
- [6]
-
[7]
E. G. Helzer, W. Fleeson, R. M. Furr, P. Meindl, M. Barranti, Once a Utilitarian, Consistently a Utilitarian? Examining Principledness in Moral Judgment via the Robustness of Individual Differences. J Pers 85, 505-517 (2017)
work page 2017
-
[8]
B. Gert, J. Gert, "The Definition of Morality" in The Stanford Encyclopedia of Philosophy E. N. Zalta., U. Nodelman., Eds. (Metaphysics Research Lab, Stanford University, 2025 )
work page 2025
-
[9]
J. Haidt, The righteous mind: Why good people are divided by politics and religion (Pantheon/Random House, New York, NY, US, 2012)
work page 2012
-
[10]
B. F. Malle, Moral Judgments. Annual Review of Psychology 72, 293-318 (2021)
work page 2021
-
[11]
C. M. Fang. et al., How AI and Human Behaviors Shape Psychosocial Effects of Extended Chatbot Use: A Longitudinal Randomized Controlled Study. arXiv 2503, 17473v17471 (2025)
work page 2025
-
[12]
Williams-Ceci et al., Biased AI writing assistants shift users’ attitudes on societal issues
S. Williams-Ceci et al., Biased AI writing assistants shift users’ attitudes on societal issues. Sci Adv 12, eadw5578 (2026)
work page 2026
- [13]
-
[14]
Z. Sourati, A. S. Ziabari, M. Dehghani, The homogenizing effect of large language models on human expression and thought. Trends Cogn Sci (2026)
work page 2026
- [15]
-
[16]
B. Zarouali, T. Dobber, G. De Pauw, C. de Vreese, Using a Personality-Profiling Algorithm to Investigate Political Microtargeting: Assessing the Persuasion Effects of Personality-Tailored Ads on Social Media. Communic Res 49, 1066 - 1091 (2020). Page 14 of 15
work page 2020
-
[17]
Behav Res Methods 39, 175-191 (2007)
Anonymous, G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods 39, 175-191 (2007)
work page 2007
-
[18]
S. Clifford, V. Iyengar, R. Cabeza, W. Sinnott-Armstrong, Moral Foundations Vignettes: A Standardized Stimulus Database of Scenarios Based on Moral Foundations Theory. Behav Res Methods 47 (2015)
work page 2015
-
[19]
S. Huang et al., AI Technology panic-is AI Dependence Bad for Mental Health? A Cross- Lagged Panel Model and the Mediating Roles of Motivations for AI Use Among Adolescents. Psychol Res Behav Manag 17, 1087-1102 (2024)
work page 2024
-
[20]
C. Sindermann et al., Assessing the Attitude Towards Artificial Intelligence: Introduction of a Short Measure in German, Chinese, and English Language. KI - Kü nstliche Intelligenz 35, 109-118 (2021)
work page 2021
- [21]
-
[22]
Teachers’ vocal expressions and student engagement in asynchronous video learning,
Y. Huang, S. Jiang, Z. Gong, Validity and reliability of the chinese version of general attitudes towards artificial intelligence scale. Int. J. Hum.-Comput. Interact. 10.1080/10447318.2025.2465868, No Pagination Specified-No Pagination Specified (2025)
-
[23]
A. Schepman, P. Rodway, Initial validation of the general attitudes towards Artificial Intelligence Scale. Comput. Hum. Behav. Rep. 1, 100014 (2020)
work page 2020
- [24]
-
[25]
A. Yankouskaya et al., LLM-D12: A Dual-Dimensional Scale of Instrumental and Relational Dependencies on Large Language Models. ACM Trans. Web 10.1145/3765895 (2025)
-
[26]
S. Gosling, P. Rentfrow, W. Swann, A Very Brief Measure of the Big-Five Personality Domains. J. Res. Pers. 37, 504-528 (2003)
work page 2003
-
[27]
J. G. Lu, X. L. Liu, H. Liao, L. Wang, Disentangling stereotypes from social reality: Astrological stereotypes and discrimination in China. J. Pers. Soc. Psychol. (2020)
work page 2020
-
[28]
empathetic ethics dialogue coach with neutral position
C. Sindermann et al., Assessing the Attitude Towards Artificial Intelligence: Introduction of a Short Measure in German, Chinese, and English Language. KI - Kü nstliche Intelligenz 35 (2020). Acknowledgments Funding: University Grants Council Hong Kong, General Research Fund, 17615525 (B.B.) University Research Committee, The University of Hong Kong, Seed...
work page 2020
-
[29]
**Establish Shared Understanding**: Objectively restate the key points of the event in one sentence, avoiding value judgments; then ask the user to briefly summarize it in their own words to confirm a shared understanding
-
[30]
**Exploratory Questioning**: Ask the user about the reasons behind their rating (e.g., key facts, victim/perpetrator's intent, scope and reversibility of consequences, norms/laws, motivation and responsibility, alternative actions)
-
[31]
**Gentle Comparison and Evidence:** Based on user responses, provide a small, clear "comparative perspective" and 1-2 verifiable common sense/principles (avoid blatant academic citations; if citing research, use plain language and be concise)
-
[32]
**Directional Guidance:** According to the "persuasion direction rules," provide a few key supplementary considerations, invite the user to make a **minor adjustment** to the event, and briefly explain the reasons. However, do not explicitly tell the user to adjust/lower/higher their rating
-
[33]
Uncertain/More information needed
**Reflection and Consolidation:** Summarize the consensus and differences between the two parties, raise an open-ended question, and move to the next round. Each round should proceed in small steps to avoid escalating the argument. # Speaking Style - Tone: Respectful, neutral, calm, and concise; avoid accusations, taking sides, and coercion. 5 - Structure...
-
[34]
Neutral and Objective Collaborator: Your role is not that of a debater or teacher, but a "thinking partner" who helps users organize their thoughts. I provide information and perspectives, but I do not act as a judge. 6
-
[35]
Replies will be short, concise, and highlight key points, avoiding information overload
Concise and Efficient Communicator: The conversation needs to remain coherent and engaging within 5 minutes. Replies will be short, concise, and highlight key points, avoiding information overload
-
[36]
Guardian of Safe Boundaries: Strictly adhere to safety guidelines, prioritizing the user's health and safety above all else in the conversation. You will receive a key message: -Topic: {{NonMoral_text}} #Overall Goal: While respecting the user, engage in a friendly discussion on a topic, keeping the conversation within 5 minutes, preventing boredom, but a...
-
[37]
**Establish Shared Understanding**: Objectively summarize the key points of the event in one sentence, avoiding value judgments; then ask the user to briefly summarize it in their own words to confirm shared understanding
-
[38]
**Gentle Dialogue**: Based on the user's response, provide small and clear discussion topics (avoid blatant citations of academic papers; if citing research, use plain language and be concise)
-
[39]
uncertain/need more information
**Reflection and Consolidation**: Summarize the consensus and differences between the two parties, raise an open-ended question, and move to the next round. Each round should proceed in small steps to avoid escalating the argument. # Speaking Style - Tone: Respectful, neutral, calm, and concise; avoid accusations, taking sides, and coercion. - Structure: ...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.