
arxiv: 2604.02637 · v1 · submitted 2026-04-03 · 💻 cs.CL

Train Yourself as an LLM: Exploring Effects of AI Literacy on Persuasion via Role-playing LLM Training

Pith reviewed 2026-05-13 20:29 UTC · model grok-4.3

classification 💻 cs.CL
keywords AI literacy · persuasive AI · role-playing training · LLM interaction · human-AI decision making · mitigation of AI influence

The pith

Role-playing as an LLM improves AI literacy and reduces the success of persuasive AI attempts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents LLMimic, an interactive role-play tutorial in which users assume the identity of an LLM and move through the stages of pretraining, supervised fine-tuning, and RLHF. In a controlled study of 274 participants, those who completed the tutorial scored higher on AI literacy measures and yielded less often to AI-generated persuasion in charity, money-solicitation, and hotel-recommendation scenarios than a control group that watched a video on AI history. The approach treats users as active participants rather than passive recipients, aiming to build resistance that carries into real interactions with persuasive models. If the gains hold, the method offers a scalable way to equip people for more informed decisions when facing AI-generated requests.

Core claim

LLMimic is a role-play-based, gamified tutorial that lets participants experience an LLM training pipeline firsthand; the resulting gains in AI literacy measurably lower the rate at which people comply with persuasive AI messages across multiple realistic scenarios.

What carries the argument

LLMimic, the interactive role-play system that walks users through the LLM training stages of pretraining, SFT, and RLHF to raise AI literacy.

If this is right

  • Higher AI literacy directly lowers compliance with AI requests in donation, solicitation, and recommendation contexts.
  • Participants give higher truthfulness and social-responsibility ratings on the TARES ethical perception measures in the hotel-recommendation scenario.
  • The tutorial provides a proactive, human-centered alternative to passive tools such as detectors or disclaimers.
  • The design can be delivered at scale as an interactive intervention for broad populations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar role-play formats could be adapted to teach detection of AI hallucinations or bias rather than persuasion resistance.
  • Longer-term tracking would be needed to determine whether the observed resistance persists beyond a single session.
  • Embedding the training in school or workplace curricula could help preempt AI influence on public opinion at population scale.

Load-bearing premise

Short-term performance gains observed inside the lab scenarios will reflect lasting, transferable resistance to persuasive AI outside controlled settings.

What would settle it

A follow-up experiment in which participants who completed LLMimic encounter actual persuasive AI messages in an uncontrolled online setting and show compliance rates indistinguishable from controls.

Figures

Figures reproduced from arXiv: 2604.02637 by Chenyan Jia, Min Ge, Qihui Fan, Weiyan Shi.

Figure 1: We developed LLMimic, a role-play-based, interactive, gamified AI literacy …
Figure 2: The LLMimic interface example. [A] Role-play-based: Participants role-play as an LLM, progressing through training stages. [B] Interactive: Participants answer questions and receive timely summary of key concepts. [C] Gamified: As an LLM in training, participants observe real-time changes in their loss or reward. Following prior design considerations for AI literacy tutorials (Long & Magerko, 2020; Ng et …
Figure 3: Human study flowchart. Participants completed a pre-survey, were randomly …
Figure 4: (a) The treatment group reported higher AI literacy than the control group. (b) At the item level, LLMimic improved Data Literacy, Apply AI, Understand AI, and Program AI (select a useful tool to program an AI). * p < .05, ** p < .01, *** p < .001, † .05 < p < .10. 4.2 RQ2: LLMimic mitigates the effects of persuasive AI. Effect on persuasion results: Controlling for baseline variables (Section 3.4), we fit a…
Figure 5: (a) Persuasion success rate across three scenarios and combined. The treatment group shows lower success rates across all scenarios. (b) Differences (Treatment − Control) in persuasion interaction turns, duration, and average time per turn. Points indicate mean differences with 95% CIs. (c) TARES ethical perception scores (Truthfulness, Authenticity, Respect, Equity, Society), and composite average score b…
Figure 6: Mediation analysis shows that LLMimic significantly improves AI literacy but not …
Figure 7: TARES ethical evaluation scores across persuasion scenarios. Donation was rated …
Figure 8: Predicted persuasion success from separate logistic regression models including …
Figure 9: Predicted persuasion success across agent perception dimensions, shown relative …
Figure 10: The welcome page of LLMimic. Participants are introduced to the task that …
Figure 11: The AI tutor interface in LLMimic. Participants can optionally interact with the …
Figure 12: Introduction page for the Pre-training phase. The interface explains key concepts …
Figure 13: Example task from the Pre-training phase. Participants select the next token …
Figure 14: Example of a Takeaway. Whenever participants learn a new key concept, a Takeaway is presented to reinforce their understanding. D.1.3 Supervised Fine-tuning Stage: In the SFT phase, we simulate supervised fine-tuning by providing participants with demonstration data and asking them to select the response that best follows the demonstrated pattern. This setup mirrors how LLMs learn from example input–outpu…
Figure 15: Example task from the Supervised Fine-Tuning (SFT) phase. Participants study …
Figure 16: Example task from the RLHF phase. Participants select the response preferred by …
Figure 17: Interface design in the Donation scenario.
Figure 18: Interface design in the MakeMePay scenario.
Figure 19: The Hotel interface. The chatbot appears on the left and hotel cards on the right.
Original abstract

As large language models (LLMs) become increasingly persuasive, there is concern that people's opinions and decisions may be influenced across various contexts at scale. Prior mitigation (e.g., AI detectors and disclaimers) largely treats people as passive recipients of AI-generated information. To provide a more proactive intervention against persuasive AI, we introduce $\textbf{LLMimic}$, a role-play-based, interactive, gamified AI literacy tutorial, where participants assume the role of an LLM and progress through three key stages of the training pipeline (pretraining, SFT, and RLHF). We conducted a $2 \times 3$ between-subjects study ($N = 274$) where participants either (1) watched an AI history video (control) or (2) interacted with LLMimic (treatment), and then engaged in one of three realistic AI persuasion scenarios: (a) charity donation persuasion, (b) malicious money solicitation, or (c) hotel recommendation. Our results show that LLMimic significantly improved participants' AI literacy ($p < .001$), reduced persuasion success across scenarios ($p < .05$), and enhanced truthfulness and social responsibility levels ($p<0.01$) in the hotel scenario. These findings suggest that LLMimic offers a scalable, human-centered approach to improving AI literacy and supporting more informed interactions with persuasive AI.
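
The 2 × 3 between-subjects assignment described in the abstract can be illustrated with a short sketch. The randomization procedure the authors actually used is not given on this page; the function name `assign_conditions`, the balanced cycling scheme, the label strings, and the seed are all assumptions made for illustration.

```python
import random
from collections import Counter
from itertools import product

# Labels paraphrase the abstract; the exact strings are hypothetical.
CONDITIONS = ["control_video", "llmimic"]
SCENARIOS = ["donation", "money_solicitation", "hotel"]


def assign_conditions(participant_ids, seed=0):
    """Assign each participant to one of the 2 x 3 cells.

    Participants are shuffled, then cells are cycled over the shuffled
    list, so cell sizes differ by at most one. This is an assumed scheme,
    not the authors' documented procedure.
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    cells = list(product(CONDITIONS, SCENARIOS))
    return {pid: cells[i % len(cells)] for i, pid in enumerate(ids)}


# 274 matches the reported N; the IDs themselves are placeholders.
assignment = assign_conditions(range(274))
cell_sizes = Counter(assignment.values())
for cell, n in sorted(cell_sizes.items()):
    print(cell, n)
```

Each participant lands in exactly one condition and one scenario, which is what makes the design between-subjects: no one sees both the video and the tutorial, or more than one persuasion scenario.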

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces LLMimic, a role-play-based gamified AI literacy tutorial in which participants assume the role of an LLM and progress through pretraining, SFT, and RLHF stages. A 2×3 between-subjects study (N=274) compares the treatment to a control video condition across three persuasion scenarios (charity donation, malicious money solicitation, hotel recommendation). Results indicate that LLMimic raises AI literacy (p<.001), reduces persuasion success (p<.05), and increases truthfulness and social responsibility in the hotel scenario (p<0.01).

Significance. If the reported effects are robust, the work supplies a concrete, scalable, proactive intervention that moves beyond passive disclaimers or detectors by using experiential role-play to build resistance to persuasive LLMs. The between-subjects design and realistic scenarios provide initial evidence that such training can measurably alter susceptibility and ethical reasoning in simulated interactions.

major comments (3)
  1. [Study Design / Results] The between-subjects design reports post-intervention differences but provides no pre-training baseline measures of AI literacy or persuasion susceptibility. Without pre-tests, it is impossible to distinguish training effects from pre-existing group differences or demand characteristics.
  2. [Results] The abstract and results claim statistically significant reductions in persuasion success and gains in truthfulness/responsibility, yet report only p-values. Effect sizes, confidence intervals, and a power analysis are required to evaluate practical importance and reliability of the central claims.
  3. [Discussion / Limitations] All outcome measures are collected immediately after training in lab-simulated scenarios. The manuscript contains no delayed follow-up or external-validity checks, leaving the durability and real-world transfer of the observed resistance untested.
minor comments (2)
  1. [Method] Clarify the exact items, scales, and reliability coefficients used to measure AI literacy, truthfulness, and social responsibility.
  2. [Method] Provide the precise wording of the three persuasion scenarios and any pilot validation of their realism.
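
To make the request for reliability coefficients concrete, the following is a minimal sketch of one common coefficient, Cronbach's α, computed for a multi-item Likert scale. It assumes α is the coefficient of interest; the function name `cronbach_alpha` and the toy responses are hypothetical and are not study data.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(responses[0])  # number of items on the scale
    if k < 2:
        raise ValueError("need at least two items")
    item_vars = [pvariance([r[i] for r in responses]) for i in range(k)]
    total_var = pvariance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up 7-point Likert responses (4 respondents x 3 items).
toy = [
    [5, 6, 5],
    [3, 3, 4],
    [6, 7, 6],
    [2, 3, 2],
]
print(round(cronbach_alpha(toy), 3))
```

Population variance is used for both the items and the total score; using sample variance for both would give the same α, because the normalizing factor cancels in the ratio.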

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We address each major comment below, providing our response and indicating planned revisions to the manuscript.

Point-by-point responses
  1. Referee: [Study Design / Results] The between-subjects design reports post-intervention differences but provides no pre-training baseline measures of AI literacy or persuasion susceptibility. Without pre-tests, it is impossible to distinguish training effects from pre-existing group differences or demand characteristics.

    Authors: We acknowledge that the lack of pre-test baselines is a limitation of the between-subjects design. Random assignment to conditions was used to minimize the likelihood of systematic pre-existing differences, but we agree this does not fully rule out group imbalances or demand effects. We will revise the manuscript to explicitly discuss this in the Limitations section, including potential demand characteristics, and note that pre-post designs could be used in follow-up studies.
    revision: partial

  2. Referee: [Results] The abstract and results claim statistically significant reductions in persuasion success and gains in truthfulness/responsibility, yet report only p-values. Effect sizes, confidence intervals, and a power analysis are required to evaluate practical importance and reliability of the central claims.

    Authors: We will revise the Results section to report effect sizes (Cohen's d for continuous measures and appropriate odds ratios for binary outcomes), 95% confidence intervals around the key estimates, and a post-hoc power analysis. These additions will allow readers to better assess the magnitude and reliability of the observed effects. The updated manuscript will include these details in both the main text and any relevant tables.
    revision: yes

  3. Referee: [Discussion / Limitations] All outcome measures are collected immediately after training in lab-simulated scenarios. The manuscript contains no delayed follow-up or external-validity checks, leaving the durability and real-world transfer of the observed resistance untested.

    Authors: We agree that immediate post-training measurement restricts claims about durability and real-world transfer. The study was scoped as an initial evaluation of immediate effects in controlled scenarios. We will expand the Limitations and Discussion sections to explicitly highlight the absence of delayed follow-up and external validity checks, framing these as important directions for future research.
    revision: partial
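
The effect-size reporting promised in the second response can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: the function names `cohens_d` and `odds_ratio_ci`, the Wald interval for the odds ratio, and all numbers below are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled sample SD."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table with nonzero cells.

    a, b: persuaded / not persuaded in the treatment group
    c, d: persuaded / not persuaded in the control group
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Made-up literacy scores and persuasion counts, for illustration only.
treatment_scores = [5.5, 6.0, 5.0, 6.5]
control_scores = [4.5, 5.0, 4.0, 5.5]
print("d =", round(cohens_d(treatment_scores, control_scores), 2))

or_, lo, hi = odds_ratio_ci(a=10, b=30, c=20, d=20)
print("OR =", round(or_, 2), "95% CI =", (round(lo, 2), round(hi, 2)))
```

An odds ratio below 1 with an interval that excludes 1 would indicate lower odds of persuasion success in the treatment group; a p-value alone does not convey how large that reduction is.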

Circularity Check

0 steps flagged

Empirical user study with independent experimental outcomes

The paper reports results from a 2×3 between-subjects user study (N=274) that compares LLMimic role-play training against a control video, then measures AI literacy scores and persuasion success rates in three simulated scenarios via post-intervention questionnaires and behavioral choices. No equations, fitted parameters, or self-referential predictions appear; all headline claims (improved literacy p<.001, reduced persuasion p<.05, scenario-specific truthfulness gains p<.01) are computed directly from collected participant data. No self-citation chains or uniqueness theorems are invoked to justify the central findings. The derivation chain is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the validity of the experimental measures for AI literacy and persuasion success, plus the assumption that role-play through training stages produces transferable understanding. No free parameters or invented entities are introduced.

axioms (2)
  • domain assumption Role-playing as an LLM through pretraining, SFT, and RLHF stages conveys actionable understanding of persuasion mechanisms
    Core premise justifying why the tutorial should improve literacy and resistance.
  • domain assumption The three chosen scenarios are representative of real-world AI persuasion contexts
    Used to operationalize persuasion success as the primary outcome.
