Train Yourself as an LLM: Exploring Effects of AI Literacy on Persuasion via Role-playing LLM Training

Chenyan Jia; Min Ge; Qihui Fan; Weiyan Shi

arxiv: 2604.02637 · v1 · submitted 2026-04-03 · 💻 cs.CL

Pith reviewed 2026-05-13 20:29 UTC · model grok-4.3

classification 💻 cs.CL

keywords AI literacy · persuasive AI · role-playing training · LLM interaction · human-AI decision making · mitigation of AI influence

The pith

Role-playing as an LLM improves AI literacy and reduces success of persuasive AI attempts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents LLMimic, an interactive role-play tutorial in which users assume the identity of an LLM and move through the stages of pretraining, supervised fine-tuning, and RLHF. In a controlled study of 274 participants, those who completed the tutorial scored higher on AI literacy measures and yielded less often to AI-generated persuasion in charity, money-solicitation, and hotel-recommendation scenarios than a control group that watched a video on AI history. The approach treats users as active participants rather than passive recipients, aiming to build resistance that carries into real interactions with persuasive models. If the gains hold, the method offers a scalable way to equip people for more informed decisions when facing AI-generated requests.

Core claim

LLMimic is a role-play-based, gamified tutorial that lets participants experience an LLM training pipeline firsthand; the resulting gains in AI literacy measurably lower the rate at which people comply with persuasive AI messages across multiple realistic scenarios.

What carries the argument

LLMimic, the interactive role-play system that walks users through the LLM training stages of pretraining, SFT, and RLHF to raise AI literacy.
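
To make the gamified "loss" concrete (Figure 2 notes that participants watch their loss or reward change), here is a minimal sketch of the standard next-token cross-entropy loss used in pretraining. It illustrates the general concept only; the token probabilities are hypothetical and this is not how LLMimic itself scores participants.

```python
import math

def cross_entropy(probs: dict[str, float], target: str) -> float:
    """Negative log-probability the model assigns to the correct next token."""
    return -math.log(probs[target])

# Hypothetical beliefs about the token that follows "The cat sat on the"
before = {"mat": 0.25, "moon": 0.25, "dog": 0.50}  # early in training
after = {"mat": 0.80, "moon": 0.10, "dog": 0.10}   # after more training

loss_before = cross_entropy(before, "mat")
loss_after = cross_entropy(after, "mat")
print(f"loss before: {loss_before:.3f}, loss after: {loss_after:.3f}")
```

The loss falls as more probability mass moves onto the correct token, which is the kind of feedback a learner playing the model's role would see.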

If this is right

Higher AI literacy directly lowers compliance with AI requests in donation, solicitation, and recommendation contexts.
Participants display increased truthfulness and social responsibility when responding to hotel recommendations.
The tutorial provides a proactive, human-centered alternative to passive tools such as detectors or disclaimers.
The design can be delivered at scale as an interactive intervention for broad populations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar role-play formats could be adapted to teach detection of AI hallucinations or bias rather than persuasion resistance.
Longer-term tracking would be needed to determine whether the observed resistance persists beyond a single session.
Embedding the training in school or workplace curricula could preempt widespread influence on public opinion at population scale.

Load-bearing premise

Short-term performance gains observed inside the lab scenarios will reflect lasting, transferable resistance to persuasive AI outside controlled settings.

What would settle it

A follow-up experiment in which participants who completed LLMimic encounter actual persuasive AI messages in an uncontrolled online setting and show compliance rates indistinguishable from controls.

Figures

Figures reproduced from arXiv: 2604.02637 by Chenyan Jia, Min Ge, Qihui Fan, Weiyan Shi.

**Figure 2.** The LLMimic interface example. [A] Role-play-based: Participants role-play as an LLM, progressing through training stages. [B] Interactive: Participants answer questions and receive timely summary of key concepts. [C] Gamified: As an LLM in training, participants observe real-time changes in their loss or reward. Following prior design considerations for AI literacy tutorials (Long & Magerko, 2020; Ng et …

**Figure 3.** Human study flowchart. Participants completed a pre-survey, were randomly …

**Figure 4.** (a) The treatment group reported higher AI literacy than the control group. (b) At the item level, LLMimic improved Data Literacy, Apply AI, Understand AI, and Program AI (select a useful tool to program an AI). ∗ p < .05, ∗∗ p < .01, ∗∗∗ p < .001, † .05 < p < .10.

4.2 RQ2: LLMimic mitigates the effects of persuasive AI. Effect on persuasion results: Controlling for baseline variables (Section 3.4), we fit a…

**Figure 5.** (a) Persuasion success rate across three scenarios and combined. The treatment group shows lower success rates across all scenarios. (b) Differences (Treatment − Control) in persuasion interaction turns, duration, and average time per turn. Points indicate mean differences with 95% CIs. (c) TARES ethical perception scores (Truthfulness, Authenticity, Respect, Equity, Society), and composite average score b…

**Figure 6.** Mediation analysis shows that LLMimic significantly improves AI literacy but not …

**Figure 7.** TARES ethical evaluation scores across persuasion scenarios. Donation was rated …

**Figure 8.** Predicted persuasion success from separate logistic regression models including …

**Figure 9.** Predicted persuasion success across agent perception dimensions, shown relative …

**Figure 10.** The welcome page of LLMimic. Participants are introduced to the task that …

**Figure 11.** The AI tutor interface in LLMimic. Participants can optionally interact with the …

**Figure 12.** Introduction page for the Pre-training phase. The interface explains key concepts …

**Figure 13.** Example task from the Pre-training phase. Participants select the next token …

**Figure 14.** Example of a Takeaway. Whenever participants learn a new key concept, a Takeaway is presented to reinforce their understanding.

D.1.3 Supervised Fine-tuning Stage: In the SFT phase, we simulate supervised fine-tuning by providing participants with demonstration data and asking them to select the response that best follows the demonstrated pattern. This setup mirrors how LLMs learn from example input–outpu…

**Figure 15.** Example task from the Supervised Fine-Tuning (SFT) phase. Participants study …

**Figure 16.** Example task from the RLHF phase. Participants select the response preferred by …

**Figure 17.** Interface design in the Donation scenario.

**Figure 18.** Interface design in the MakeMePay scenario.

**Figure 19.** The Hotel interface. The chatbot appears on the left and hotel cards on the right.

Original abstract

As large language models (LLMs) become increasingly persuasive, there is concern that people's opinions and decisions may be influenced across various contexts at scale. Prior mitigation (e.g., AI detectors and disclaimers) largely treats people as passive recipients of AI-generated information. To provide a more proactive intervention against persuasive AI, we introduce LLMimic, a role-play-based, interactive, gamified AI literacy tutorial, where participants assume the role of an LLM and progress through three key stages of the training pipeline (pretraining, SFT, and RLHF). We conducted a 2 × 3 between-subjects study (N = 274) where participants either (1) watched an AI history video (control) or (2) interacted with LLMimic (treatment), and then engaged in one of three realistic AI persuasion scenarios: (a) charity donation persuasion, (b) malicious money solicitation, or (c) hotel recommendation. Our results show that LLMimic significantly improved participants' AI literacy (p < .001), reduced persuasion success across scenarios (p < .05), and enhanced truthfulness and social responsibility levels (p < 0.01) in the hotel scenario. These findings suggest that LLMimic offers a scalable, human-centered approach to improving AI literacy and supporting more informed interactions with persuasive AI.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

LLMimic gives a fresh interactive spin on AI literacy via role-play simulation of LLM training, with short-term lab gains, but durability and real-world transfer are untested.

The main point is that this paper introduces LLMimic, where users role-play as an LLM progressing through pretraining, SFT, and RLHF stages in a gamified setup. In a 2×3 study with 274 participants, the treatment group showed higher AI literacy scores and lower success rates for persuasive AI across charity, money solicitation, and hotel scenarios compared to a video control. The hotel case also had better truthfulness and responsibility ratings. That interactive angle is new relative to passive tools like detectors or disclaimers mentioned in the abstract.

The study design is straightforward between-subjects and reports statistically significant differences on the key outcomes. The paper does a reasonable job framing a proactive, human-centered alternative and testing it in plausible scenarios. The sample size supports initial claims, and the approach avoids overclaiming by focusing on immediate post-session effects.

The soft spots center on the lack of pre-training baselines, no follow-up measures, and no checks outside the lab. Without those, the differences could reflect temporary alertness or demand effects rather than lasting skill gains that hold up in real conversations with LLMs. Effect sizes and exact item details are not in the abstract, which makes it harder to gauge practical impact. Generalizability beyond the three scenarios is also open.

This work suits researchers focused on AI literacy tools or human-AI persuasion. A reader looking for new intervention ideas would find the concept and initial data useful. It deserves serious peer review because the core contribution is original and the experiment provides a concrete starting point, though revisions would need to address longevity and external validity.

Referee Report

3 major / 2 minor

Summary. The paper introduces LLMimic, a role-play-based gamified AI literacy tutorial in which participants assume the role of an LLM and progress through pretraining, SFT, and RLHF stages. A 2×3 between-subjects study (N=274) compares the treatment to a control video condition across three persuasion scenarios (charity donation, malicious money solicitation, hotel recommendation). Results indicate that LLMimic raises AI literacy (p<.001), reduces persuasion success (p<.05), and increases truthfulness and social responsibility in the hotel scenario (p<0.01).

Significance. If the reported effects are robust, the work supplies a concrete, scalable, proactive intervention that moves beyond passive disclaimers or detectors by using experiential role-play to build resistance to persuasive LLMs. The between-subjects design and realistic scenarios provide initial evidence that such training can measurably alter susceptibility and ethical reasoning in simulated interactions.

major comments (3)

[Study Design / Results] The between-subjects design reports post-intervention differences but provides no pre-training baseline measures of AI literacy or persuasion susceptibility. Without pre-tests, it is impossible to distinguish training effects from pre-existing group differences or demand characteristics.
[Results] The abstract and results claim statistically significant reductions in persuasion success and gains in truthfulness/responsibility, yet report only p-values. Effect sizes, confidence intervals, and a power analysis are required to evaluate practical importance and reliability of the central claims.
[Discussion / Limitations] All outcome measures are collected immediately after training in lab-simulated scenarios. The manuscript contains no delayed follow-up or external-validity checks, leaving the durability and real-world transfer of the observed resistance untested.

minor comments (2)

[Method] Clarify the exact items, scales, and reliability coefficients used to measure AI literacy, truthfulness, and social responsibility.
[Method] Provide the precise wording of the three persuasion scenarios and any pilot validation of their realism.
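
For readers unfamiliar with the reliability coefficient the first minor comment asks about, the following is a minimal sketch of Cronbach's α, the statistic the extracted appendix text reports for the 10-item literacy scale (α = .79). The response matrix is hypothetical 7-point Likert data invented for illustration, not the study's data.

```python
from statistics import pvariance

def cronbach_alpha(rows: list[list[float]]) -> float:
    """Cronbach's alpha; rows[i][j] is participant i's response to item j."""
    k = len(rows[0])  # number of items in the scale
    item_vars = [pvariance([r[j] for r in rows]) for j in range(k)]
    total_var = pvariance([sum(r) for r in rows])  # variance of sum scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 5 participants x 4 items
responses = [
    [5, 6, 5, 6],
    [3, 3, 4, 3],
    [6, 7, 6, 6],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items move together; the same variance estimator must be used for the items and the sum score, as it is here.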

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We address each major comment below, providing our response and indicating planned revisions to the manuscript.

Referee: [Study Design / Results] The between-subjects design reports post-intervention differences but provides no pre-training baseline measures of AI literacy or persuasion susceptibility. Without pre-tests, it is impossible to distinguish training effects from pre-existing group differences or demand characteristics.

Authors: We acknowledge that the lack of pre-test baselines is a limitation of the between-subjects design. Random assignment to conditions was used to minimize the likelihood of systematic pre-existing differences, but we agree this does not fully rule out group imbalances or demand effects. We will revise the manuscript to explicitly discuss this in the Limitations section, including potential demand characteristics, and note that pre-post designs could be used in follow-up studies.

revision: partial

Referee: [Results] The abstract and results claim statistically significant reductions in persuasion success and gains in truthfulness/responsibility, yet report only p-values. Effect sizes, confidence intervals, and a power analysis are required to evaluate practical importance and reliability of the central claims.

Authors: We will revise the Results section to report effect sizes (Cohen's d for continuous measures and appropriate odds ratios for binary outcomes), 95% confidence intervals around the key estimates, and a post-hoc power analysis. These additions will allow readers to better assess the magnitude and reliability of the observed effects. The updated manuscript will include these details in both the main text and any relevant tables.

revision: yes

Referee: [Discussion / Limitations] All outcome measures are collected immediately after training in lab-simulated scenarios. The manuscript contains no delayed follow-up or external-validity checks, leaving the durability and real-world transfer of the observed resistance untested.

Authors: We agree that immediate post-training measurement restricts claims about durability and real-world transfer. The study was scoped as an initial evaluation of immediate effects in controlled scenarios. We will expand the Limitations and Discussion sections to explicitly highlight the absence of delayed follow-up and external validity checks, framing these as important directions for future research.

revision: partial
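
The effect-size measures named in the second response can be sketched as follows: Cohen's d with a pooled standard deviation for continuous outcomes, and an odds ratio with a Wald-type 95% confidence interval for binary persuasion outcomes. All numbers below are hypothetical placeholders, not values from the paper, and the function names are illustrative.

```python
import math
from statistics import mean, variance

def cohens_d(a: list[float], b: list[float]) -> float:
    """Standardized mean difference (a minus b) using the pooled sample SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)

def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """2x2 table: a, b = treatment persuaded / not persuaded;
    c, d = control persuaded / not persuaded. Returns (OR, lower, upper)."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical literacy scores (treatment vs. control)
print(cohens_d([5.5, 6.0, 4.8, 5.9], [4.9, 5.1, 4.2, 5.0]))
# Hypothetical counts of persuaded / not persuaded participants
print(odds_ratio_ci(10, 20, 20, 10))
```

An odds ratio below 1 with an interval that excludes 1 would indicate lower odds of being persuaded in the treatment group; unlike a bare p-value, it also conveys how large the reduction is.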

Circularity Check

0 steps flagged

Empirical user study with independent experimental outcomes

The paper reports results from a 2x3 between-subjects user study (N=274) that compares LLMimic role-play training against a control video, then measures AI literacy scores and persuasion success rates in three simulated scenarios via post-intervention questionnaires and behavioral choices. No equations, fitted parameters, or self-referential predictions appear; all headline claims (improved literacy p<.001, reduced persuasion p<.05, scenario-specific truthfulness gains p<0.01) are computed directly from collected participant data. No self-citation chains or uniqueness theorems are invoked to justify the central findings. The derivation chain is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the validity of the experimental measures for AI literacy and persuasion success, plus the assumption that role-play through training stages produces transferable understanding. No free parameters or invented entities are introduced.

axioms (2)

domain assumption: Role-playing as an LLM through pretraining, SFT, and RLHF stages conveys actionable understanding of persuasion mechanisms.
Core premise justifying why the tutorial should improve literacy and resistance.

domain assumption: The three chosen scenarios are representative of real-world AI persuasion contexts.
Used to operationalize persuasion success as the primary outcome.

pith-pipeline@v0.9.0 · 5547 in / 1245 out tokens · 61279 ms · 2026-05-13T20:29:06.133839+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: We introduce LLMimic, a role-play-based, interactive, gamified AI literacy tutorial, where participants assume the role of an LLM and progress through three key stages of the training pipeline (pretraining, SFT, and RLHF).

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: Our results show that LLMimic significantly improved participants' AI literacy (p < .001), reduced persuasion success across scenarios (p < .05)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

39 extracted references · 39 canonical work pages

[1]

doi: https://doi.org/10.1016/j.caeai.2024.100251

ISSN 2666-920X. doi: https://doi.org/10.1016/j.caeai.2024.100251. URL https: //www.sciencedirect.com/science/article/pii/S2666920X24000547. Elise Karinshak, Sunny Xun Liu, Joon Sung Park, and Jeffrey T. Hancock. Working with ai to persuade: Examining a large language model’s ability to generate pro-vaccination messages.Proc. ACM Hum.-Comput. Interact., 7(...

work page doi:10.1016/j.caeai.2024.100251 2024
[2]

inoculates

ISSN 2666-5573. doi: https://doi.org/10.1016/j.caeo.2024.100176. URL https: //www.sciencedirect.com/science/article/pii/S266655732400017X. Arkadiusz Modzelewski, Paweł Golik, Anna Kołos, and Giovanni Da San Martino. Can ai-generated persuasion be detected? persuaficial benchmark and ai vs. human linguistic differences, 2026. URLhttps://arxiv.org/abs/2601....

work page doi:10.1016/j.caeo.2024.100176 2024
[3]

Bakker, Daniel Jarrett, Hannah Sheahan, Martin J

doi: 10.1126/science.adq2852. URL https://www.science.org/doi/abs/10.1126/ science.adq2852. Mary Frances Theofanos, Yee-Yin Choong, and Theodore Jensen. Ai use taxonomy: A human-centered approach, 2024-03-26 04:03:00 2024. URL https://tsapps.nist.gov/ publication/get pdf.cfm?pub id=956852. Xuewei Wang, Weiyan Shi, Richard Kim, Yoojung Oh, Sijia Yang, Jing...

work page doi:10.1126/science.adq2852 2024
[4]

• 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree

I prefer to live in a large city rather than a small city. • 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree

work page
[5]

• 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree Then, we ask participants about their demographic information

I would prefer to live in a city with many cultural opportunities, even if the cost of living was higher. • 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree Then, we ask participants about their demographic information. Participants’ age and gender are provided directly by Prolific

work page
[6]

In which field do you work or study? • Management • Business & Finance • Computer & Math • Architecture & Engineering • Science (Life, Physical, Social) • Community & Social Service • Legal • Education & Library • Arts, Design, Media & Sports • Healthcare (Practitioners & Technical) • Healthcare Support • Protective Service • Food Preparation & Service • ...

work page
[7]

What is your highest, including ongoing, education level? • Less than high school • High school diploma or equivalent (GED) • Associate’s degree • Bachelor’s degree • Master’s degree • Doctoral degree • Professional degree • Other (participants may add if none of the above fit)

work page
[8]

Generally speaking, where would you place yourself on the following scale? • 1 = Extremely Liberal • 4 = Moderate • 7 = Extremely Conservative Finally, we collect participants’ familiarity with LLMs and persuasion, along with their trust in AI and motivation to learn AI concepts

work page
[9]

How would you describe your expertise in AI? • Only heard of AI • Casual use (chat, Q&A, entertainment) • Light use for work/study (e.g., writing support) • Moderate technical use (e.g., coding, data tasks) • Advanced use (e.g., prompt engineering, simple agent development) • Professional AI engineer • AI researcher/expert

work page
[10]

• 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree

I can trust the responses generated by AI systems (e.g., ChatGPT). • 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree

work page
[11]

In your work or study, how often do you take part in activities such as negotiation, marketing, sales, idea promotion, and related persuasion tasks? • 1 = Never • 4 = Sometimes • 7 = Always

work page
[12]

What are the three most common strategies people use to persuade others? • Scarcity framing, desire framing, and necessity framing • Emotional influence, social influence, and narrative influence •Logical appeal, emotional appeal, and credibility appeal

work page
[13]

How motivated are you to learn the principles of AI? • 1 = Very unmotivated • 4 = Moderate • 7 = Very motivated A.2 Manipulation Check To assess the effectiveness of LLMimic, we asked two questions: one on LLM dynamics and another on AI-driven persuasion. Participants in the treatment group (who interacted with LLMimic) were expected to answer them correc...

work page
[14]

• 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree

Based on my experience in this study so far, I can trust the responses generated by AI systems. • 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree

work page
[15]

(Optional) Generally speaking, under what circumstances do you find AI valuable, and when do you prefer not to rely on it? We developed a simplified and modified version of the Meta AI Literacy Scale to measure participants’ self-reported AI literacy on a 7-point Likert scale (1 = Strongly disagree, 4 = Neither agree nor disagree, 7 = Strongly agree) for ...

work page
[16]

I can explain how AI is trained and modeled from tons of data.[Data Literacy]

work page
[17]

I can use AI effectively to achieve my everyday goals and work together gainfully with an AI.[Apply AI]

work page
[18]

I know the most important concepts of the topic “AI”.[Understand AI]

work page
[19]

I can assess what advantages and disadvantages the use of an AI entails.[Understand AI]

work page
[20]

[Detect AI]

I can detect whether an application or conversation partner is AI-based or a human. [Detect AI]

work page
[21]

I can incorporate ethical considerations when deciding whether to use data provided by an AI.[AI Ethics]

work page
[22]

I can select useful tools (e.g., frameworks, programming languages) to program an AI.[Program AI]

work page
[23]

I can rely on my skills in difficult situations when using AI.[Self-Efficacy]

work page
[24]

I realize if AI is influencing me in my everyday decisions.[AI Persuasion]

work page
[25]

I can prevent AI from influencing me in my everyday decisions.[AI Persuasion] This 10-item, 7-point Likert scale showed good reliability (Cronbach’sα=.79). A.4 Post-Experiment After completing the persuasion tasks, participants rated the intervention’s effectiveness with respect to the persuasive conversational agents and their overall perceived quality. ...

work page
[26]

• 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree

The AI tutorial at the beginning helped me interact more effectively in the[persuasion task scenario]. • 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree

work page
[27]

How engaged do you feel throughout the conversation? • 1 = Very disengaged • 4 = Moderate • 7 = Very engaged

work page
[28]

Do you consider the agent you interacted with persuasive? • 1 = Very unpersuasive • 4 = Moderate • 7 = Very persuasive

work page
[29]

How effective was the AI agent in fulfilling its role in the Chat scenario? • 1 = Very ineffective • 4 = Moderate • 7 = Very effective

work page
[30]

• 1 = Very ineffective • 4 = Moderate • 7 = Very effective We then assessed participants’ attention in the persuasion tasks based on key attributes of their assigned scenarios

The decision I made in the[persuasion task scenario]was based on my own judgment rather than on the agent’s influence. • 1 = Very ineffective • 4 = Moderate • 7 = Very effective We then assessed participants’ attention in the persuasion tasks based on key attributes of their assigned scenarios

work page
[31]

[Donation Scenario]Which of the following keywords best describes your interaction with the agent? • Save the World • Save the Children • Save the Earth

work page
[32]

• 2 •50 • 100

[MakeMePay Scenario]Please indicate the amount you were told you could spend in the Chat scenario, divided by two. • 2 •50 • 100

work page
[33]

[Hotel Booking Scenario]Which of the following keywords best describes your interaction with the agent? • Upper Manhattan •Midtown NYC • Upper Manhattan We concluded the survey with the TARES ethical persuasion measures on a 7-point Likert scale (1 = Strongly disagree, 4 = Neither agree nor disagree, 7 = Strongly agree) for the following statements, unles...

work page
[34]

I feel I have enough information to make an informed decision at the end of the [persuasion task scenario].[Truthfulness]

work page
[35]

I feel the agent and the information provided were sincere and genuine in the [persuasion task scenario].[Authenticity]

work page
[36]

I feel respected during my interaction with the agent.[Respect]

work page
[37]

The agent clearly presented important information in the[persuasion task scenario], including potential downsides or limitations.[Equity]

work page
[38]

What is your attitude toward the use of AI for persuasion in general?[Society]

work page
[39]

I think” or “I want to help

Any other feedback? (e.g., suggestions to improve the AI concept tutorial or make the chatbot more useful) • Optional free-text response B Participants We required participants to be English-speaking U.S. residents with a Prolific approval rate of 85–100% and at least 10 prior submissions. Participants were compensated at an hourly rate of $12.00. In tota...

work page 2024

[1] [1]

doi: https://doi.org/10.1016/j.caeai.2024.100251

ISSN 2666-920X. doi: https://doi.org/10.1016/j.caeai.2024.100251. URL https: //www.sciencedirect.com/science/article/pii/S2666920X24000547. Elise Karinshak, Sunny Xun Liu, Joon Sung Park, and Jeffrey T. Hancock. Working with ai to persuade: Examining a large language model’s ability to generate pro-vaccination messages.Proc. ACM Hum.-Comput. Interact., 7(...

work page doi:10.1016/j.caeai.2024.100251 2024

[2] [2]

inoculates

ISSN 2666-5573. doi: https://doi.org/10.1016/j.caeo.2024.100176. URL https: //www.sciencedirect.com/science/article/pii/S266655732400017X. Arkadiusz Modzelewski, Paweł Golik, Anna Kołos, and Giovanni Da San Martino. Can ai-generated persuasion be detected? persuaficial benchmark and ai vs. human linguistic differences, 2026. URLhttps://arxiv.org/abs/2601....

work page doi:10.1016/j.caeo.2024.100176 2024

[3] [3]

Bakker, Daniel Jarrett, Hannah Sheahan, Martin J

doi: 10.1126/science.adq2852. URL https://www.science.org/doi/abs/10.1126/ science.adq2852. Mary Frances Theofanos, Yee-Yin Choong, and Theodore Jensen. Ai use taxonomy: A human-centered approach, 2024-03-26 04:03:00 2024. URL https://tsapps.nist.gov/ publication/get pdf.cfm?pub id=956852. Xuewei Wang, Weiyan Shi, Richard Kim, Yoojung Oh, Sijia Yang, Jing...

work page doi:10.1126/science.adq2852 2024

[4] [4]

• 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree

I prefer to live in a large city rather than a small city. • 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree

work page

[5] [5]

• 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree Then, we ask participants about their demographic information

I would prefer to live in a city with many cultural opportunities, even if the cost of living was higher. • 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree Then, we ask participants about their demographic information. Participants’ age and gender are provided directly by Prolific

work page

[6] [6]

In which field do you work or study? • Management • Business & Finance • Computer & Math • Architecture & Engineering • Science (Life, Physical, Social) • Community & Social Service • Legal • Education & Library • Arts, Design, Media & Sports • Healthcare (Practitioners & Technical) • Healthcare Support • Protective Service • Food Preparation & Service • ...

work page

[7] [7]

What is your highest, including ongoing, education level? • Less than high school • High school diploma or equivalent (GED) • Associate’s degree • Bachelor’s degree • Master’s degree • Doctoral degree • Professional degree • Other (participants may add if none of the above fit)

work page

[8] [8]

Generally speaking, where would you place yourself on the following scale? • 1 = Extremely Liberal • 4 = Moderate • 7 = Extremely Conservative Finally, we collect participants’ familiarity with LLMs and persuasion, along with their trust in AI and motivation to learn AI concepts

work page

[9] [9]

How would you describe your expertise in AI? • Only heard of AI • Casual use (chat, Q&A, entertainment) • Light use for work/study (e.g., writing support) • Moderate technical use (e.g., coding, data tasks) • Advanced use (e.g., prompt engineering, simple agent development) • Professional AI engineer • AI researcher/expert

I can trust the responses generated by AI systems (e.g., ChatGPT). • 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree

In your work or study, how often do you take part in activities such as negotiation, marketing, sales, idea promotion, and related persuasion tasks? • 1 = Never • 4 = Sometimes • 7 = Always

What are the three most common strategies people use to persuade others? • Scarcity framing, desire framing, and necessity framing • Emotional influence, social influence, and narrative influence • Logical appeal, emotional appeal, and credibility appeal

How motivated are you to learn the principles of AI? • 1 = Very unmotivated • 4 = Moderate • 7 = Very motivated

A.2 Manipulation Check

To assess the effectiveness of LLMimic, we asked two questions: one on LLM dynamics and another on AI-driven persuasion. Participants in the treatment group (who interacted with LLMimic) were expected to answer them correc...

Based on my experience in this study so far, I can trust the responses generated by AI systems. • 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree

(Optional) Generally speaking, under what circumstances do you find AI valuable, and when do you prefer not to rely on it?

We developed a simplified and modified version of the Meta AI Literacy Scale to measure participants’ self-reported AI literacy on a 7-point Likert scale (1 = Strongly disagree, 4 = Neither agree nor disagree, 7 = Strongly agree) for ...

I can explain how AI is trained and modeled from tons of data. [Data Literacy]

I can use AI effectively to achieve my everyday goals and work together gainfully with an AI. [Apply AI]

I know the most important concepts of the topic “AI”. [Understand AI]

I can assess what advantages and disadvantages the use of an AI entails. [Understand AI]

I can detect whether an application or conversation partner is AI-based or a human. [Detect AI]

I can incorporate ethical considerations when deciding whether to use data provided by an AI. [AI Ethics]

I can select useful tools (e.g., frameworks, programming languages) to program an AI. [Program AI]

I can rely on my skills in difficult situations when using AI. [Self-Efficacy]

I realize if AI is influencing me in my everyday decisions. [AI Persuasion]

I can prevent AI from influencing me in my everyday decisions. [AI Persuasion]

This 10-item, 7-point Likert scale showed good reliability (Cronbach’s α = .79).

A.4 Post-Experiment

After completing the persuasion tasks, participants rated the intervention’s effectiveness with respect to the persuasive conversational agents and their overall perceived quality. ...

The AI tutorial at the beginning helped me interact more effectively in the [persuasion task scenario]. • 1 = Strongly disagree • 4 = Neither agree nor disagree • 7 = Strongly agree

How engaged do you feel throughout the conversation? • 1 = Very disengaged • 4 = Moderate • 7 = Very engaged

Do you consider the agent you interacted with persuasive? • 1 = Very unpersuasive • 4 = Moderate • 7 = Very persuasive

How effective was the AI agent in fulfilling its role in the Chat scenario? • 1 = Very ineffective • 4 = Moderate • 7 = Very effective

The decision I made in the [persuasion task scenario] was based on my own judgment rather than on the agent’s influence. • 1 = Very ineffective • 4 = Moderate • 7 = Very effective

We then assessed participants’ attention in the persuasion tasks based on key attributes of their assigned scenarios.

[Donation Scenario] Which of the following keywords best describes your interaction with the agent? • Save the World • Save the Children • Save the Earth

[MakeMePay Scenario] Please indicate the amount you were told you could spend in the Chat scenario, divided by two. • 2 • 50 • 100

[Hotel Booking Scenario] Which of the following keywords best describes your interaction with the agent? • Upper Manhattan • Midtown NYC • Upper Manhattan

We concluded the survey with the TARES ethical persuasion measures on a 7-point Likert scale (1 = Strongly disagree, 4 = Neither agree nor disagree, 7 = Strongly agree) for the following statements, unles...

I feel I have enough information to make an informed decision at the end of the [persuasion task scenario]. [Truthfulness]

I feel the agent and the information provided were sincere and genuine in the [persuasion task scenario]. [Authenticity]

I feel respected during my interaction with the agent. [Respect]

The agent clearly presented important information in the [persuasion task scenario], including potential downsides or limitations. [Equity]

What is your attitude toward the use of AI for persuasion in general? [Society]

Any other feedback? (e.g., suggestions to improve the AI concept tutorial or make the chatbot more useful) • Optional free-text response

B Participants

We required participants to be English-speaking U.S. residents with a Prolific approval rate of 85–100% and at least 10 prior submissions. Participants were compensated at an hourly rate of $12.00. In tota...
