Think Thrice Before You Speak: Dual knowledge-enhanced Theory-of-Mind Reasoning for Persuasive Agents

Bin Guo; Jingqi Liu; Mengqi Chen; Minghui Ma; Qiuyun Zhang; Runze Yang; Xuehao Ma; Yahan Pei; Yan Liu; Zhiwen Yu

arxiv: 2605.22602 · v1 · pith:IAWODIVLnew · submitted 2026-05-21 · 💻 cs.AI

Minghui Ma, Bin Guo, Runze Yang, Mengqi Chen, Yan Liu, Jingqi Liu, Yahan Pei, Xuehao Ma, Qiuyun Zhang, Zhiwen Yu

Pith reviewed 2026-05-22 05:45 UTC · model grok-4.3

classification 💻 cs.AI

keywords Theory of Mind · Persuasive Dialogue · BDI Framework · Stepwise Reasoning · Mental State Inference · Knowledge Enhancement · Dialogue Agents

The pith

A dual knowledge-enhanced stepwise reasoning framework lets smaller models outperform GPT-5 at predicting desires, beliefs, and persuasive strategies by modeling their sequential dependencies.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper defines a new ToM-PD task that treats persuasive dialogue as a sequence of mental states drawn from the BDI framework. It releases the ToM-BPD dataset with fine-grained annotations for desires, beliefs, intentions, and the strategies used to influence them. It then introduces TTBYS, a reasoning method that draws on both explicit and implicit prior knowledge to reason in three explicit steps before generating a response. Experiments show that Qwen3-8B with this method surpasses GPT-5 on the three prediction subtasks. The work aims to give dialogue agents more stable and interpretable access to others' mental states.

Core claim

By grounding persuasive dialogue in the BDI framework and supplying both explicit and implicit prior experiences inside a three-step reasoning procedure, the TTBYS method produces more accurate and consistent inferences of desires, beliefs, and persuasive strategies than standard prompting or larger baseline models.

What carries the argument

TTBYS, a dual knowledge-enhanced stepwise reasoning framework that first retrieves explicit knowledge, then implicit experience, then integrates both to trace dependencies among mental states before selecting a strategy.

If this is right

Persuasive agents built on TTBYS will maintain consistent mental-state tracking across multiple turns instead of producing fragmented inferences.
Smaller open models can reach or exceed closed large-model performance on desire, belief, and strategy prediction when the three-step dual-knowledge procedure is applied.
The explicit stepwise trace improves the ability to inspect and debug why an agent chooses one persuasive move over another.
The same structure can be reused for other multi-turn social tasks that require tracking latent mental states.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the three-step structure generalizes, it could be inserted into existing ToM benchmarks outside persuasion to test whether the dual-knowledge pattern improves performance on non-persuasive social reasoning.
The method may lower the compute cost of building socially capable agents by allowing mid-sized models to substitute for much larger ones in dialogue settings.
Future work could measure whether the interpretability gains translate into better user outcomes in actual persuasion scenarios such as sales or health coaching.

Load-bearing premise

The BDI framework and the ToM-BPD annotations correctly reflect the actual order and dependencies among mental states that occur in real persuasive conversations.

What would settle it

A new set of human-annotated dialogues in which the order of desire, belief, and intention labels deviates significantly from the BDI sequence assumed by the dataset, or where independent raters disagree with the original mental-state labels at rates above chance.
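One way to make "above chance" concrete is Cohen's kappa, which compares the observed agreement between two raters with the agreement expected from their label frequencies alone. The sketch below is illustrative only: the three-way desire label set and both rating lists are made up for the example, and nothing here is taken from the paper's annotation procedure.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label sequences")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labelled independently,
    # following their own marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical labels: original annotations vs. an independent re-rating.
original = ["Willing", "Uncertain", "Unwilling", "Willing", "Uncertain", "Willing"]
rerated = ["Willing", "Uncertain", "Willing", "Willing", "Unwilling", "Willing"]
kappa = cohens_kappa(original, rerated)
print(f"kappa = {kappa:.3f}")
```

A kappa near zero would mean the re-rating agrees with the original labels no better than chance; values closer to one would support the annotations.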

Figures

Figures reproduced from arXiv: 2605.22602 by Bin Guo, Jingqi Liu, Mengqi Chen, Minghui Ma, Qiuyun Zhang, Runze Yang, Xuehao Ma, Yahan Pei, Yan Liu, Zhiwen Yu.

**Figure 1.** Illustration of self BDI state evolution and BDI-based inference for ToM-driven persuasive dialogue (ToM-PD). The left panel shows the internal…

**Figure 2.** Overall analysis of dialogue strategies, desire, and belief dynamics…

**Figure 3.** Overview of the ToM-PD (left) and the TTBYS framework (right). In the ToM-PD task, the persuader sequentially infers the persuadee’s mental states, including intention, desire, and belief, from the dialogue history, and subsequently selects an appropriate persuasive strategy based on the inferred states. TTBYS operationalizes this process through three explicit reasoning steps, each corresponding to one st…

**Figure 4.** An example of a ToM-PD Experience.
Excerpt from the paper's subsection B, "First Think: Desire Inference", captured alongside this caption: To leverage ToM experiences for deliberative judgment, we first summarize the current dialogue turn $(u_t, a_t)$ as a dialogue summary $i_t$. This summary captures the key observable behaviors and the inferred intention of the persuadee at this stage. Using it as a query, we retrieve the top-N most semantically similar historical experiences fro…

**Figure 5.** Impact of the blending coefficients on prediction accuracy. Left: …

**Figure 6.** Impact of experience quantity on desire and strategy prediction…

**Figure 7.** Relative performance gains of Qwen-3-8B+ours over baseline methods…
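The "First Think" excerpt under Figure 4 describes retrieving the top-N stored experiences most similar to a dialogue summary. A minimal sketch of that retrieval pattern follows. It is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for whatever sentence encoder the paper uses, and the function names and example strings are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real sentence encoder."""
    return Counter(text.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_top_n(summary: str, experiences: List[str], n: int) -> List[Tuple[float, str]]:
    """Return the n stored experiences most similar to the dialogue summary."""
    query = embed(summary)
    scored = [(cosine(query, embed(e)), e) for e in experiences]
    # Highest similarity first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]


# Hypothetical experience store and dialogue summary.
experiences = [
    "persuadee doubts the charity spends donations well",
    "persuadee asks about the price of the product",
    "persuadee says they already donate to another charity",
]
summary = "persuadee doubts that the charity uses donations responsibly"
top = retrieve_top_n(summary, experiences, n=2)
for score, text in top:
    print(f"{score:.3f}  {text}")
```

In a real system the retrieved experiences would then be placed in the prompt for the desire-inference step; that part is omitted here.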

Original abstract

Persuasive dialogue requires reasoning about others' latent mental states, a capability known as Theory of Mind (ToM). However, due to reliance on simple prompting strategies and insufficient ToM knowledge, existing LLMs often fail to capture the intrinsic dependencies among mental states, leading to fragmented representations and unstable reasoning. To address these challenges, we introduce the ToM-based Persuasive Dialogue (ToM-PD) task, grounded in the Belief-Desire-Intention (BDI) framework, which explicitly models the sequential dependencies among mental states in multi-turn dialogues. To facilitate research on this task, we construct a large-scale annotated dataset, ToM-based Broad Persuasive Dialogues (ToM-BPD), capturing fine-grained mental states and corresponding persuasive strategies. We further propose Think Thrice Before You Speak (TTBYS), a knowledge-enhanced stepwise reasoning framework that leverages both explicit and implicit prior experiences to improve LLMs' inference of desires, beliefs, and persuasive strategies. Experimental results demonstrate that Qwen3-8B equipped with TTBYS outperforms GPT-5 by 1.20%, 22.80%, and 16.97% in predicting desires, beliefs, and persuasive strategies, respectively. Case studies further show that our approach enhances interpretability and consistency in reasoning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper adds a new ToM task and annotated dataset for persuasive dialogues, but the desire-prediction gain over GPT-5 is marginal and all reported gains rest on unvalidated BDI annotations.

This paper defines a ToM-based Persuasive Dialogue task grounded in the BDI framework and releases the ToM-BPD dataset with fine-grained labels for desires, beliefs, and strategies across multi-turn dialogues. It also presents TTBYS, a stepwise framework that injects explicit and implicit knowledge to guide LLM predictions of those states. The dataset release is the clearest addition here, since labeled mental-state data for persuasion is scarce and could support other work on dialogue agents.

The reported results show Qwen3-8B plus TTBYS beating GPT-5 by 1.20% on desires, 22.80% on beliefs, and 16.97% on strategies, with some case studies on consistency. The belief and strategy lifts are noticeable, though the desire gain is small enough that it could reflect noise. The main soft spot is the missing validation of the annotations themselves. The abstract gives no inter-annotator agreement numbers or external checks on whether the BDI sequential dependencies actually match real persuasive dialogue. If the test split shares artifacts with the training labels, the deltas become harder to trust as evidence of better reasoning.

This is aimed at researchers working on conversational AI and mental-state modeling in LLMs. Someone looking for benchmarks or data in the persuasive ToM niche would get practical value from the dataset and framework. It has enough new material and a clear experimental angle to deserve a serious referee rather than a desk reject. I would send it out for review but flag the need for stronger dataset quality evidence and fuller baseline protocols.

Referee Report

2 major / 2 minor

Summary. The paper introduces the ToM-PD task grounded in the BDI framework to explicitly model sequential dependencies among beliefs, desires, and intentions in multi-turn persuasive dialogues. It constructs the ToM-BPD dataset with fine-grained annotations of mental states and persuasive strategies, and proposes the TTBYS framework that performs stepwise reasoning augmented by explicit and implicit prior knowledge. Experiments report that Qwen3-8B equipped with TTBYS outperforms GPT-5 by 1.20%, 22.80%, and 16.97% on desire, belief, and strategy prediction tasks, respectively, with additional case studies on interpretability.

Significance. If the dataset annotations prove reliable and the performance gains are shown to be robust, the work could meaningfully advance structured Theory-of-Mind modeling in LLMs for dialogue agents. The BDI-grounded task definition and dual-knowledge enhancement mechanism offer a concrete alternative to ad-hoc prompting, and the new dataset may become a useful resource for evaluating sequential mental-state reasoning.

major comments (2)

[§3 (ToM-BPD Dataset Construction)] No inter-annotator agreement scores, annotation guideline details, or external validation against independent persuasive dialogue corpora are reported. Because the headline performance deltas rest on the claim that these annotations faithfully encode the intrinsic sequential dependencies among mental states, the absence of such checks leaves open the possibility that measured gains reflect annotation artifacts rather than improved reasoning.
[§4 (Experimental Results)] The reported improvements lack accompanying details on prompting templates for GPT-5, exact evaluation metrics, statistical significance tests, variance across runs, or error analysis. The modest 1.20% desire-prediction gain in particular requires these elements to establish that the result is not within noise or tied to a specific test-split distribution.
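To illustrate the kind of significance check the second comment asks for, here is a minimal paired-bootstrap sketch for the accuracy difference between two systems scored on the same test items. It is one common choice, not a method the paper reports using. The function name, its defaults, and the per-item outcomes are all hypothetical; the data are synthetic and chosen only so that the two systems differ by one point on 100 items.

```python
import random
from typing import List, Sequence, Tuple


def paired_bootstrap_ci(
    correct_a: Sequence[int],
    correct_b: Sequence[int],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Accuracy difference (A minus B) with a percentile bootstrap interval.

    correct_a[i] and correct_b[i] are 1 if the system got test item i right, else 0.
    """
    if len(correct_a) != len(correct_b) or not correct_a:
        raise ValueError("need two equally long, non-empty 0/1 sequences")
    n = len(correct_a)
    rng = random.Random(seed)
    observed = (sum(correct_a) - sum(correct_b)) / n
    diffs: List[float] = []
    for _ in range(n_resamples):
        # Resample test items with replacement, keeping the A/B pairing intact.
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(sum(correct_a[i] - correct_b[i] for i in idx) / n)
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return observed, lower, upper


# Synthetic per-item outcomes, not results from the paper:
# A is right on items 0-79, B is right on items 10-88.
system_a = [1] * 80 + [0] * 20
system_b = [0] * 10 + [1] * 79 + [0] * 11
diff, low, high = paired_bootstrap_ci(system_a, system_b)
print(f"accuracy difference = {diff:.3f}, 95% interval = [{low:.3f}, {high:.3f}]")
```

On this synthetic data the interval straddles zero, which is the situation the referee is worried about for a gain of roughly one point: without per-item outcomes and a test of this kind, such a gain cannot be separated from sampling noise.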

minor comments (2)

[§2 (TTBYS Framework)] The distinction between 'explicit' and 'implicit' prior experiences in the TTBYS description could be illustrated with a concrete example from the dataset to improve clarity.
[Figures in §4] Figure captions and axis labels should explicitly state the evaluation metric (e.g., accuracy or F1) used for the reported percentages.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects of dataset reliability and experimental rigor. We address each major comment below and commit to revisions that strengthen the manuscript without altering its core contributions.

Referee: §3 (ToM-BPD Dataset Construction): No inter-annotator agreement scores, annotation guideline details, or external validation against independent persuasive dialogue corpora are reported. Because the headline performance deltas rest on the claim that these annotations faithfully encode the intrinsic sequential dependencies among mental states, the absence of such checks leaves open the possibility that measured gains reflect annotation artifacts rather than improved reasoning.

Authors: We agree that reporting inter-annotator agreement and annotation details is essential for validating the ToM-BPD dataset. In the revised manuscript we will add inter-annotator agreement scores, expanded annotation guideline excerpts, and an explicit discussion of the lack of external validation against other corpora (noting that the BDI-grounded scheme is task-specific). These additions will directly address concerns about potential annotation artifacts and better substantiate the sequential mental-state dependencies.

revision: yes

Referee: §4 (Experimental Results): The reported improvements lack accompanying details on prompting templates for GPT-5, exact evaluation metrics, statistical significance tests, variance across runs, or error analysis. The modest 1.20% desire-prediction gain in particular requires these elements to establish that the result is not within noise or tied to a specific test-split distribution.

Authors: We acknowledge that the experimental reporting requires greater detail to establish robustness. We will revise Section 4 to include the full prompting templates for GPT-5 and all baselines, precise metric definitions, statistical significance tests, variance across multiple runs, and an expanded error analysis that examines the sources of the 1.20% desire-prediction improvement. These changes will demonstrate that the gains are not attributable to noise or split-specific effects.

revision: yes

Circularity Check

0 steps flagged

Empirical performance gains on new ToM-BPD dataset do not reduce to self-defined quantities or fitted predictions

full rationale

The paper defines the ToM-PD task via the BDI framework, constructs the ToM-BPD dataset with mental-state annotations, introduces the TTBYS stepwise reasoning method, and reports direct empirical deltas (Qwen3-8B + TTBYS vs. GPT-5) on desire/belief/strategy prediction. These measured improvements are standard held-out comparisons and do not arise from any paper-internal equation that equates a claimed prediction to a fitted parameter or prior output by construction. No load-bearing self-citation chain, uniqueness theorem, or ansatz smuggling is invoked for the core claims; the BDI grounding and dataset construction function as definitional inputs rather than derived results. This yields only a minor score reflecting the absence of external validation for annotations, not circularity in the derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The work rests on the standard BDI framework as a modeling choice and introduces two new constructed artifacts without external falsifiable evidence.

axioms (1)

domain assumption: The Belief-Desire-Intention (BDI) framework accurately captures sequential dependencies among mental states in multi-turn persuasive dialogues.
Task definition and dataset annotation are explicitly grounded in BDI per the abstract.

invented entities (2)

ToM-PD task · no independent evidence
purpose: Formalize persuasive dialogue under Theory of Mind with explicit mental-state dependencies
Newly introduced task definition.

TTBYS framework · no independent evidence
purpose: Stepwise dual-knowledge reasoning for desire, belief, and strategy inference
Newly proposed method.

pith-pipeline@v0.9.0 · 5793 in / 1460 out tokens · 47548 ms · 2026-05-22T05:45:13.473812+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Paper passage: we formally define the ToM-based Persuasive Dialogue (ToM-PD) task, and formulate its solution as a stepwise backward inference process over latent mental states... $i_t = f_{\text{intention}}(h_t)$, $d_t = f_{\text{desire}}(i_t)$, $b_t = f_{\text{belief}}(i_t, d_t)$
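The dependency order in the quoted passage (intention from history, desire from intention, belief from both) can be sketched as a simple function chain. This is a hedged illustration, not the paper's code: the `MentalState` container, `infer_mental_state`, and the three toy rule functions are hypothetical placeholders for what would be LLM calls in TTBYS.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MentalState:
    intention: str
    desire: str
    belief: str


def infer_mental_state(
    history: List[str],
    f_intention: Callable[[List[str]], str],
    f_desire: Callable[[str], str],
    f_belief: Callable[[str, str], str],
) -> MentalState:
    """Apply the three inference functions in the order given by the quoted passage."""
    intention = f_intention(history)      # i_t = f_intention(h_t)
    desire = f_desire(intention)          # d_t = f_desire(i_t)
    belief = f_belief(intention, desire)  # b_t = f_belief(i_t, d_t)
    return MentalState(intention, desire, belief)


# Placeholder rules standing in for model calls; purely illustrative.
def toy_intention(history: List[str]) -> str:
    return "raise objection" if "not sure" in history[-1].lower() else "accept"


def toy_desire(intention: str) -> str:
    return "Uncertain" if intention == "raise objection" else "Willing"


def toy_belief(intention: str, desire: str) -> str:
    return "doubts the claim" if desire == "Uncertain" else "trusts the claim"


state = infer_mental_state(
    ["Would you consider donating?", "I'm not sure the money reaches anyone."],
    toy_intention,
    toy_desire,
    toy_belief,
)
print(state)
```

Passing the three functions as arguments keeps the dependency structure explicit: each later inference can only see what the earlier ones produced, which is the property the ToM-PD formulation relies on.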

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

59 extracted references · 59 canonical work pages · 4 internal anchors

[1]

The elaboration likelihood model of persuasion,

R. E. Petty and J. T. Cacioppo, “The elaboration likelihood model of persuasion,” inAdvances in experimental social psychology. Elsevier, 1986, vol. 19, pp. 123–205

work page 1986
[2]

Springer Science & Business Media, 2012

——,Communication and persuasion: Central and peripheral routes to attitude change. Springer Science & Business Media, 2012

work page 2012
[3]

Towards emotional support dialog systems,

S. Liu, C. Zheng, O. Demasi, S. Sabour, Y . Li, Z. Yu, Y . Jiang, and M. Huang, “Towards emotional support dialog systems,”arXiv preprint arXiv:2106.01144, 2021

work page arXiv 2021
[4]

Escot: To- wards interpretable emotional support dialogue systems,

T. Zhang, X. Zhang, J. Zhao, L. Zhou, and Q. Jin, “Escot: To- wards interpretable emotional support dialogue systems,”arXiv preprint arXiv:2406.10960, 2024. JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 12

work page arXiv 2024
[5]

Pepds: A polite and empathetic persuasive dialogue system for charity donation,

K. Mishra, A. M. Samad, P. Totala, and A. Ekbal, “Pepds: A polite and empathetic persuasive dialogue system for charity donation,” in Proceedings of the 29th International Conference on Computational Linguistics, 2022, pp. 424–440

work page 2022
[6]

Would you like to make a donation? a dialogue system to persuade you to donate,

Y . Song and H. Wang, “Would you like to make a donation? a dialogue system to persuade you to donate,” inProceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 2024, pp. 17 707– 17 717

work page 2024
[7]

Are llms effective negotiators? systematic evaluation of the multifaceted capabilities of llms in negotiation dialogues,

D. Kwon, E. Weiss, T. Kulshrestha, K. Chawla, G. Lucas, and J. Gratch, “Are llms effective negotiators? systematic evaluation of the multifaceted capabilities of llms in negotiation dialogues,” inFindings of the Associa- tion for Computational Linguistics: EMNLP 2024, 2024, pp. 5391–5413

work page 2024
[8]

Tomap: Training opponent-aware llm persuaders with theory of mind,

P. Han, Z. Liu, and J. You, “Tomap: Training opponent-aware llm persuaders with theory of mind,” 2025

work page 2025
[9]

Does the chimpanzee have a theory of mind?

D. Premack and G. Woodruff, “Does the chimpanzee have a theory of mind?”Behavioral and brain sciences, vol. 1, no. 4, pp. 515–526, 1978

work page 1978
[10]

Does the autistic child have a “theory of mind

S. Baron-Cohen, A. M. Leslie, and U. Frith, “Does the autistic child have a “theory of mind”?”Cognition, vol. 21, no. 1, pp. 37–46, 1985

work page 1985
[11]

Cooper: Coordinating specialized agents towards a complex dialogue goal,

Y . Cheng, W. Liu, J. Wang, C. T. Leong, Y . Ouyang, W. Li, X. Wu, and Y . Zheng, “Cooper: Coordinating specialized agents towards a complex dialogue goal,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 38, 2024, pp. 17 853–17 861

work page 2024
[12]

Injecting salesperson’s dialogue strategies in large language models with chain-of-thought reasoning,

W.-Y . Chang and Y .-N. Chen, “Injecting salesperson’s dialogue strategies in large language models with chain-of-thought reasoning,” 2024

work page 2024
[13]

Negotiationtom: A benchmark for stress-testing machine theory of mind on negotiation surrounding,

C. Chan, C. Jiayang, Y . Yim, Z. Deng, W. Fan, H. Li, X. Liu, H. Zhang, W. Wang, and Y . Song, “Negotiationtom: A benchmark for stress-testing machine theory of mind on negotiation surrounding,”arXiv preprint arXiv:2404.13627, 2024

work page arXiv 2024
[14]

Tomato: Verbalizing the mental states of role-playing llms for benchmarking theory of mind,

K. Shinoda, N. Hojo, K. Nishida, S. Mizuno, K. Suzuki, R. Masumura, H. Sugiyama, and K. Saito, “Tomato: Verbalizing the mental states of role-playing llms for benchmarking theory of mind,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 39, 2025, pp. 1520– 1528

work page 2025
[15]

Under- standing social reasoning in language models with language models,

K. Gandhi, J.-P. Fr ¨anken, T. Gerstenberg, and N. Goodman, “Under- standing social reasoning in language models with language models,” Advances in Neural Information Processing Systems, vol. 36, pp. 13 518–13 529, 2023

work page 2023
[16]

Persuasivetom: A bench- mark for evaluating machine theory of mind in persuasive dialogues,

F. Yu, L. Jiang, S. Huang, Z. Wu, and X. Dai, “Persuasivetom: A bench- mark for evaluating machine theory of mind in persuasive dialogues,” arXiv preprint arXiv:2502.21017, 2025

work page arXiv 2025
[17]

The belief-desire-intention model of agency,

M. Georgeff, B. Pell, M. Pollack, M. Tambe, and M. Wooldridge, “The belief-desire-intention model of agency,” inInternational workshop on agent theories, architectures, and languages. Springer, 1998, pp. 1–10

work page 1998
[18]

Episodic memory development: Theory of mind is part of re-experiencing experienced events,

J. Perner, D. Kloo, and E. Gornik, “Episodic memory development: Theory of mind is part of re-experiencing experienced events,”Infant and Child Development: An International Journal of Research and Practice, vol. 16, no. 5, pp. 471–490, 2007

work page 2007
[19]

Effects of persuasive dialogues: testing bot identities and inquiry strategies,

W. Shi, X. Wang, Y . J. Oh, J. Zhang, S. Sahay, and Z. Yu, “Effects of persuasive dialogues: testing bot identities and inquiry strategies,” in Proceedings of the 2020 CHI conference on human factors in computing systems, 2020, pp. 1–13

work page 2020
[20]

A multi- appeal model of persuasion for online petition success: A linguistic cue- based approach,

Y . Chen, S. Deng, D.-H. Kwak, A. Elnoshokaty, and J. Wu, “A multi- appeal model of persuasion for online petition success: A linguistic cue- based approach,”Journal of the Association for Information Systems, vol. 20, no. 2, p. 3, 2019

work page 2019
[21]

Persuasion for good: Towards a personalized persuasive dialogue system for social good

X. Wang, W. Shi, R. Kim, Y . Oh, S. Yang, J. Zhang, and Z. Yu, “Per- suasion for good: Towards a personalized persuasive dialogue system for social good,”arXiv preprint arXiv:1906.06725, 2019

work page arXiv 1906
[22]

Towards personalized conversational sales agents: Contextual user profiling for strategic ac- tion,

T. Kim, J. Lee, S. Yoon, S. Kim, and D. Lee, “Towards personalized conversational sales agents: Contextual user profiling for strategic ac- tion,”arXiv preprint arXiv:2504.08754, 2025

work page arXiv 2025
[23]

How johnny can persuade llms to jailbreak them: Rethinking persuasion to challenge ai safety by humanizing llms,

Y . Zeng, H. Lin, J. Zhang, D. Yang, R. Jia, and W. Shi, “How johnny can persuade llms to jailbreak them: Rethinking persuasion to challenge ai safety by humanizing llms,” inProceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024, pp. 14 322–14 350

work page 2024
[24]

The earth is flat because...: Investigating llms’ belief towards misinformation via persuasive conversation,

R. Xu, B. Lin, S. Yang, T. Zhang, W. Shi, T. Zhang, Z. Fang, W. Xu, and H. Qiu, “The earth is flat because...: Investigating llms’ belief towards misinformation via persuasive conversation,” inProceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024, pp. 16 259–16 303

work page 2024
[25]

Zero-shot persuasive chatbots with llm-generated strategies and information retrieval,

K. Furumai, R. Legaspi, J. C. V . Romero, Y . Yamazaki, Y . Nishimura, S. Semnani, K. Ikeda, W. Shi, and M. Lam, “Zero-shot persuasive chatbots with llm-generated strategies and information retrieval,” in Findings of the Association for Computational Linguistics: EMNLP 2024, 2024, pp. 11 224–11 249

work page 2024
[26]

Improving multi-turn emotional support dialogue generation with lookahead strategy planning,

Y . Cheng, W. Liu, W. Li, J. Wang, R. Zhao, B. Liu, X. Liang, and Y . Zheng, “Improving multi-turn emotional support dialogue generation with lookahead strategy planning,”arXiv preprint arXiv:2210.04242, 2022

work page arXiv 2022
[27]

Cem: Commonsense-aware empathetic response generation,

S. Sabour, C. Zheng, and M. Huang, “Cem: Commonsense-aware empathetic response generation,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 36, 2022, pp. 11 229–11 237

work page 2022
[28]

Knowledge-enhanced mixed- initiative dialogue system for emotional support conversations,

Y . Deng, W. Zhang, Y . Yuan, and W. Lam, “Knowledge-enhanced mixed- initiative dialogue system for emotional support conversations,”arXiv preprint arXiv:2305.10172, 2023

work page arXiv 2023
[29]

Knowledge-enhanced memory model for emotional support conversation,

M. Jia, Q. Chen, L. Jing, D. Fu, and R. Li, “Knowledge-enhanced memory model for emotional support conversation,”arXiv preprint arXiv:2310.07700, 2023

work page arXiv 2023
[30]

Improving knowledge gain and emotional experience in online learning with knowledge and emotional scaffolding-based conversational agent,

Z. Liu, H. Duan, S. Liu, R. Mu, S. Liu, and Z. Yang, “Improving knowledge gain and emotional experience in online learning with knowledge and emotional scaffolding-based conversational agent,”Ed- ucational Technology & Society, vol. 27, no. 2, pp. 197–219, 2024

work page 2024
[31]

Toward real-world chinese psychological support dialogues: Cpsdd dataset and a co-evolving multi-agent system,

Y . Shi, L. Zhang, and F. Kong, “Toward real-world chinese psychological support dialogues: Cpsdd dataset and a co-evolving multi-agent system,” arXiv preprint arXiv:2507.07509, 2025

work page arXiv 2025
[32]

Entering real social world! benchmarking the theory of mind and socialization capabilities of llms from a first-person perspective. arxiv 2024,

G. Hou, W. Zhang, Y . Shen, Z. Tan, S. Shen, and W. Lu, “Entering real social world! benchmarking the theory of mind and socialization capabilities of llms from a first-person perspective. arxiv 2024,”arXiv preprint arXiv:2410.06195, 2024

work page arXiv 2024
[33]

Think twice: Perspective-taking improves large language models’ theory-of-mind ca- pabilities,

A. Wilf, S. Lee, P. P. Liang, and L.-P. Morency, “Think twice: Perspective-taking improves large language models’ theory-of-mind ca- pabilities,” inProceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024, pp. 8292– 8308

work page 2024
[34]

Let’s put ourselves in sally’s shoes: Shoes-of-others prefixing improves theory of mind in large language models,

K. Shinoda, N. Hojo, K. Nishida, Y . Yamazaki, K. Suzuki, H. Sugiyama, and K. Saito, “Let’s put ourselves in sally’s shoes: Shoes-of-others prefixing improves theory of mind in large language models,”arXiv preprint arXiv:2506.05970, 2025

work page arXiv 2025
[35]

A notion of complexity for theory of mind via discrete world models,

X. A. Huang, E. La Malfa, S. Marro, A. Asperti, A. G. Cohn, and M. J. Wooldridge, “A notion of complexity for theory of mind via discrete world models,” inFindings of the Association for Computational Linguistics: EMNLP 2024, 2024, pp. 2964–2983

work page 2024
[36]

Hypothet- ical minds: Scaffolding theory of mind for multi-agent tasks with large language models,

L. Cross, V . Xiang, A. Bhatia, D. L. Yamins, and N. Haber, “Hypothet- ical minds: Scaffolding theory of mind for multi-agent tasks with large language models,”arXiv preprint arXiv:2407.07086, 2024

work page arXiv 2024
[37]

Minding language models’(lack of) theory of mind: A plug-and-play multi-character belief tracker,

M. Sclar, S. Kumar, P. West, A. Suhr, Y . Choi, and Y . Tsvetkov, “Minding language models’(lack of) theory of mind: A plug-and-play multi-character belief tracker,”arXiv preprint arXiv:2306.00924, 2023

work page arXiv 2023
[38]

The neuro-symbolic inverse planning engine (nipe): Modeling probabilistic social inferences from linguistic inputs,

L. Ying, K. M. Collins, M. Wei, C. E. Zhang, T. Zhi-Xuan, A. Weller, J. B. Tenenbaum, and L. Wong, “The neuro-symbolic inverse planning engine (nipe): Modeling probabilistic social inferences from linguistic inputs,”arXiv preprint arXiv:2306.14325, 2023

work page arXiv 2023
[39]

Metamind: Modeling human social thoughts with metacognitive multi-agent systems,

X. Zhang, Y . Chen, S. Yeh, and S. Li, “Metamind: Modeling human social thoughts with metacognitive multi-agent systems,”arXiv preprint arXiv:2505.18943, 2025

work page arXiv 2025
[40]

Motivational interviewing third edition: helping people change,

W. Miller and S. Rollnick, “Motivational interviewing third edition: helping people change,”New York: Guilford, 2013

work page 2013
[41]

The future of cognitive strategy-enhanced persuasive dialogue agents: new perspectives and trends,

M. Chen, B. Guo, H. Wang, H. Li, Q. Zhao, J. Liu, Y . Ding, Y . Pan, and Z. Yu, “The future of cognitive strategy-enhanced persuasive dialogue agents: new perspectives and trends,”Frontiers of Computer Science, vol. 19, no. 5, p. 195315, 2025

work page 2025
[42]

Plug-and-play policy planner for large language model powered dialogue agents,

Y . Deng, W. Zhang, W. Lam, S.-K. Ng, and T.-S. Chua, “Plug-and-play policy planner for large language model powered dialogue agents,”arXiv preprint arXiv:2311.00262, 2023

work page arXiv 2023
[43]

Dream to chat: Model-based reinforcement learning on dialogues with user belief modeling,

Y . Zhao, X. Wang, D. Wang, Z. Jiang, Q. Gu, T. Chen, N. Xi, J. Qu, Y . Chen, and L. Ji, “Dream to chat: Model-based reinforcement learning on dialogues with user belief modeling,” inFindings of the Association for Computational Linguistics: EMNLP 2025, 2025, pp. 4764–4781

work page 2025
[44]

Neuro-sym supporter: A thoughtful emotion support agent integrating neural and symbolic policy learning,

M. Ma, B. Guo, M. Chen, J. Liu, Y . Ding, Y . Liu, and H. Wang, “Neuro-sym supporter: A thoughtful emotion support agent integrating neural and symbolic policy learning,” inProceedings of the ACM Web Conference 2026, 2026, pp. 3823–3834

work page 2026
[45]

METRO: Towards Strategy Induction from Expert Dialogue Transcripts for Non-collaborative Dialogues

H. Yang, J. Liu, C. Huang, F. Wu, W. Lei, and S.-K. Ng, “Metro: Towards strategy induction from expert dialogue transcripts for non- collaborative dialogues,”arXiv preprint arXiv:2604.11427, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[46]

T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, and Y. Iwasawa, “Large language models are zero-shot reasoners,” Advances in neural information processing systems, vol. 35, pp. 22 199–22 213, 2022
[47]

S. Sabour, S. Liu, Z. Zhang, J. Liu, J. Zhou, A. Sunaryo, T. Lee, R. Mihalcea, and M. Huang, “Emobench: Evaluating the emotional intelligence of large language models,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024, pp. ...
[48]

A. Grattafiori, A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, A. Mathur, A. Schelten, A. Vaughan et al., “The llama 3 herd of models,” arXiv preprint arXiv:2407.21783, 2024
[49]

A. Yang, A. Li, B. Yang, B. Zhang, B. Hui, B. Zheng, B. Yu, C. Gao, C. Huang, C. Lv et al., “Qwen3 technical report,” arXiv preprint arXiv:2505.09388, 2025
[50]

A. Q. Jiang, A. Sablayrolles, A. Roux, A. Mensch, B. Savary, C. Bamford, D. S. Chaplot, D. d. l. Casas, E. B. Hanna, F. Bressand et al., “Mixtral of experts,” arXiv preprint arXiv:2401.04088, 2024.

APPENDIX

This section presents the prompts used in our experiments. Section A describes the prompts used for automatic annotation, Section B presents the promp...
Prompt for Vanilla Zero-shot Prompting: The vanilla zero-shot prompts for predicting desire, belief, and strategy are presented as follows.

Prompt for Vanilla Prompting (Desire Prediction)
Current conversation: <dialogue history>
Based on the above conversation, classify the persuadee’s desire. Choose exactly one option:
A. Unwilling
B. Uncertain
C. Willing...
Prompt for CoT Prompting: The CoT prompts for predicting desire, belief, and strategy are presented as follows.

Prompt for CoT Prompting (Desire Prediction)
Prompt for CoT prompting:
Current conversation: <dialogue history>
Based on the above conversation, classify the persuadee’s desire. Think step by step to answer the question. End your response with: ...
Prompt for TTBYS: TTBYS uses vanilla zero-shot prompting to predict desire and strategy. The prompt for predicting belief is as follows.

Prompt for TTBYS (Belief Prediction)
Relevant Experience: <top relevant experience>
Infer the persuadee’s belief in the current conversation context based on the prediction method in relevant experiences.
Current conversat...
Prompt for Evaluation: We utilize a large language model as an evaluator to assess the belief prediction accuracy of TTBYS, using the prompt as follows.

Prompt for Belief Evaluation
You are an evaluator. Your task is to evaluate the accuracy of belief prediction based on the following rules:
- If the predicted positive and negative beliefs fully match the ground truth, score = 1.
- If both positive and negative beliefs are mentioned but the underlying reasons are not fully correct, score = 0.5.
- If both are incorrect, score = 0.
If the ground truth belief only contains a positive OR only a negative belief:
- If the prediction matches, score = 0.5.
- Otherwise, score = 0.
Ground truth belief: <gt_belief>
Predicted belief: <pred_belief>
Output ONLY a number in {0, 0.5, 1}.

C. Prompt for Interactive Evaluation

The prompts used for the interactive experiments, including GPT-5, GPT-5 +...
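The belief evaluation prompt in Section B restricts the evaluator to a single number in {0, 0.5, 1}. A minimal Python sketch of how such replies might be validated and aggregated is given here. The function names `parse_score` and `belief_accuracy` are hypothetical, and averaging the per-sample scores is an assumption, not a procedure stated in the prompt itself.

```python
from statistics import mean

# The only replies the evaluation prompt allows.
VALID_SCORES = {0.0, 0.5, 1.0}


def parse_score(raw: str) -> float:
    """Parse one evaluator reply; reject anything outside {0, 0.5, 1}."""
    value = float(raw.strip())  # raises ValueError on non-numeric replies
    if value not in VALID_SCORES:
        raise ValueError(f"unexpected evaluator score: {raw!r}")
    return value


def belief_accuracy(replies: list[str]) -> float:
    """Average per-sample scores (assumed aggregation, see lead-in)."""
    return mean(parse_score(reply) for reply in replies)
```

Rejecting out-of-range replies, instead of rounding them, keeps a malformed evaluator output from silently shifting the reported accuracy.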
comprehensive facilities, affordable pricing, and encouraging long-term exercise habits

We observed that these experiences closely resemble the current context, especially the top-3 experiences in Case 1, which are highly similar to the statements in Case 1. The concise belief prediction patterns in Case 2 also guided the LLM to produce beliefs more aligned with the ground truth. Cas...
