Dual-Agent Co-Training for Health Coaching via Implicit Adversarial Preference Optimization

Da Long; Diya Michelle Rao; Jasmine Ruales Carrera; Lingyi Fu; Shandian Zhe; Yang Bai

arxiv: 2605.07011 · v1 · submitted 2026-05-07 · 💻 cs.LG

Pith reviewed 2026-05-11 01:16 UTC · model grok-4.3

classification 💻 cs.LG

keywords dual-agent co-training · health coaching · motivational interviewing · direct preference optimization · adversarial training · client simulator · stochastic game

The pith

Dual-agent co-training with implicit adversarial preference optimization improves AI health coach performance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes training an AI health coach and a simulated client together in a continuous interactive loop instead of fixing one side. The coach is updated through direct preference optimization on response pairs that a multi-dimensional LLM judge ranks as Pareto-dominant improvements. The client simulator is then updated by reversing those same preferences, which creates an ongoing adversarial pressure. A reader would care because one-sided training often limits how far either agent can develop, while this mutual push could produce coaches that handle real motivational interviewing conversations more effectively. The setup is also shown to correspond to a stochastic game between the two agents.

Core claim

We propose a dual-agent framework that interactively co-trains both the health coach agent and the client simulator. The coach is optimized with DPO using Pareto-dominant response pairs identified by a multi-dimensional LLM judge. In turn, the client is trained adversarially by reversing these preferences, inducing an implicit adversarial training dynamic. We further show that this co-training process admits a natural stochastic-game interpretation. Extensive experiments demonstrate that our method effectively improves coaching quality across several important dimensions.

What carries the argument

Dual-agent co-training loop in which an LLM judge selects Pareto-dominant response pairs for coach DPO updates while the client simulator is trained on the reversed preferences to induce adversarial improvement.
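
The pair-selection step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the example score values, and the use of the CCT, SST, and Empathy dimension labels from the figure captions as dictionary keys are all assumptions. How the paper attaches judge scores to client turns is not spelled out in this summary, so the reversal is shown as a plain swap of chosen and rejected.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical representation: dimension name -> judge score (e.g. on a 1-5 scale).
Scores = Dict[str, float]


def dominates(a: Scores, b: Scores) -> bool:
    """True if `a` is at least as good as `b` on every dimension
    and strictly better on at least one (Pareto dominance)."""
    return all(a[d] >= b[d] for d in a) and any(a[d] > b[d] for d in a)


def pareto_pairs(candidates: List[Tuple[str, Scores]]) -> List[Tuple[str, str]]:
    """Return (chosen, rejected) pairs where chosen Pareto-dominates rejected.
    Pairs with a trade-off between dimensions are skipped."""
    pairs = []
    for (text_a, s_a), (text_b, s_b) in combinations(candidates, 2):
        if dominates(s_a, s_b):
            pairs.append((text_a, text_b))
        elif dominates(s_b, s_a):
            pairs.append((text_b, text_a))
    return pairs


def reverse_pairs(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Swap chosen and rejected, as an assumed form of the client-side reversal."""
    return [(rejected, chosen) for chosen, rejected in pairs]


# Illustrative scores only; not taken from the paper.
example = [
    ("A", {"CCT": 4, "SST": 4, "Empathy": 5}),
    ("B", {"CCT": 3, "SST": 4, "Empathy": 4}),
    ("C", {"CCT": 5, "SST": 2, "Empathy": 4}),
]
coach_pairs = pareto_pairs(example)        # only A vs B is a dominance pair
client_pairs = reverse_pairs(coach_pairs)  # same pair with the preference flipped
```

Under this reading, candidates that trade one dimension against another (A versus C here) produce no training pair, which is one way a Pareto filter avoids rewarding gains on one dimension at the expense of another.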

If this is right

The coach agent develops stronger responses by learning directly from the evolving client simulator.
The client simulator generates increasingly challenging interactions that expand the coach's explored behavior space.
Coaching performance advances simultaneously on multiple dimensions rather than trading off one for another.
The overall training process corresponds to a stochastic game whose equilibrium can be approximated through the alternating updates.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same co-training pattern could be applied to other open-ended dialogue tasks such as tutoring or customer support where both sides of the conversation benefit from mutual adaptation.
Testing the final coach against held-out human users would reveal whether the LLM judge's Pareto preferences align with actual human outcomes.
The adversarial client could serve as a built-in stress tester for measuring coach robustness before real-world deployment.

Load-bearing premise

The multi-dimensional LLM judge can reliably identify Pareto-dominant pairs that reflect genuine gains in coaching effectiveness rather than judge-specific artifacts.

What would settle it

A controlled comparison in which the dual-trained coach interacts with real human clients and shows no measurable gain in client-reported motivation, behavior change, or session quality versus a coach trained only against a fixed simulator.

Figures

Figures reproduced from arXiv: 2605.07011 by Da Long, Diya Michelle Rao, Jasmine Ruales Carrera, Lingyi Fu, Shandian Zhe, Yang Bai.

**Figure 1.** Overview of DACT. Left: at each co-evolution round, the coach adapter π_C and client adapter π_U engage in multi-turn dialogue. Middle: the coach branches into candidate utterances {c_t^(i)}; each branch is followed by candidate client responses {u_{t+1}^(ij)}. A frozen LLM judge scores every coach node on a 1–5 scale, and the resulting scores are backed up into per-node Q-vectors Q(c) and Q(u). Right: the co…

**Figure 2.** Performance trajectory over co-evolution iterations, evaluated against the fixed client π_U^8 on the same 20 held-out personas. We denote training iteration or round k by Rk. Left: mean3. DACT improves near-monotonically from 2.80 at R1 to 4.25 at R12 (+1.45, +51.8%); Client-Frozen plateaus near 3.10 from R5 onward; SFT remains at 2.57. Right: sentence-level anti% on a logarithmic axis. DACT drops from 13.73% at R1 t…

**Figure 3.** Per-dimension trajectory for the DACT, Client-Frozen, and SFT conditions across K=13 co-evolution iterations. Left (CCT): DACT stays within ±0.3 of Client-Frozen through R8, then jumps from 2.66 at R9 to 3.18 at R10 and reaches 3.84 at R12. Middle (SST): DACT rises substantially from 2.29 at R1 to 4.43 at R12 while Client-Frozen plateaus near 2.7. Right (Empathy): DACT and Client-Frozen are nearly matched throu…

read the original abstract

Motivational-interviewing-based health coaching is an effective approach for improving mental health and promoting healthy behavior change. However, the scarcity of trained human coaches and the high cost of coaching services make such support inaccessible to many people who could benefit from it. This motivates the development of AI health coaches that can provide scalable and affordable support. Existing methods typically optimize only one side of the interaction: they either train a dialogue agent against a fixed client environment or train a client simulator against a fixed assistant. This one-sided setup can limit exploration of the interaction space and may be inefficient at developing the capabilities required by the target agent and pushing its performance boundaries. In this paper, we propose a dual-agent framework that interactively co-trains both the health coach agent and the client simulator. The coach is optimized with DPO using Pareto-dominant response pairs identified by a multi-dimensional LLM judge. In turn, the client is trained adversarially by reversing these preferences, inducing an implicit adversarial training dynamic. We further show that this co-training process admits a natural stochastic-game interpretation. Extensive experiments demonstrate that our method effectively improves coaching quality across several important dimensions.
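
For readers unfamiliar with DPO, the standard per-pair loss can be written out as a short sketch. It uses the usual published form of the objective, the negative log-sigmoid of a scaled difference between policy and reference log-probability margins. The function name, the scalar-log-probability interface, and the default `beta=0.1` are illustrative assumptions, not details taken from this paper.

```python
import math


def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log sigmoid(beta * margin), where the margin is
    how much more the policy prefers the chosen response over the rejected
    one, relative to a frozen reference model.

    All inputs are summed token log-probabilities of a full response.
    `beta` is an assumed default, not a value reported by the paper.
    """
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    z = beta * margin
    # Numerically stable form of -log(sigmoid(z)) = log(1 + exp(-z)).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# When policy and reference agree, the loss is log(2).
baseline = dpo_loss(-2.0, -2.0, -2.0, -2.0)

# Policy favors the chosen response more than the reference does: lower loss.
coach_view = dpo_loss(-1.0, -3.0, -2.0, -2.0)

# Same log-probabilities with chosen and rejected swapped: higher loss.
reversed_view = dpo_loss(-3.0, -1.0, -2.0, -2.0)
```

Swapping which response counts as chosen flips the sign of the margin, so a pair that lowers the loss for one agent raises it under the reversed labeling. That sign flip is the sense in which reversed preferences could act as an opposing training signal.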

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The dual-agent co-training with LLM judge Pareto pairs and preference reversal gives a workable way to move past one-sided training, but the abstract leaves the actual gains unverified.

The main thing to know is that this paper sets up interactive co-training between a health coach agent and a client simulator. The coach gets DPO updates from response pairs that a multi-dimensional LLM judge marks as Pareto-dominant. The client then trains on the reversed preferences, which creates an implicit adversarial loop, and the authors frame the process as a stochastic game.

This directly targets the problem of training only one side of the dialogue at a time, which can trap the system in limited interaction patterns. The preference-reversal trick is a clean way to induce the adversarial dynamic without adding separate loss terms, and the LLM judge handles the multi-objective coaching goals without needing explicit reward functions. The motivation, real access barriers to motivational-interviewing coaching, is straightforward and relevant. The paper does a decent job sketching how the co-training expands the space both agents can explore.

The soft spots are mostly in the evidence. The abstract states that extensive experiments show improvements across dimensions, yet it gives no baselines, no dataset details, no metrics, and no ablation or statistical results. From the abstract alone, it is impossible to tell whether the claimed gains are real. A second live concern is that the LLM judge may select pairs based on its own artifacts rather than genuine coaching quality, such as empathy or support for behavior change. The whole loop feeds back on those judgments, so if the judge is noisy or biased, both agents could reinforce the same flaws instead of improving. The stochastic-game view is mentioned, but in the provided text it stays a high-level interpretation without equations or checks.

This work is for researchers building dialogue agents for health behavior change, or anyone thinking about co-training and preference optimization in multi-agent settings. A reader looking for concrete ideas on handling multi-faceted coaching objectives would find the framework worth considering. It deserves peer review because the core problem is practical and the proposed loop has enough structure to be tested properly, even if the current description needs more experimental grounding to stand up.

Referee Report

3 major / 2 minor

Summary. The paper proposes a dual-agent co-training framework for motivational-interviewing health coaching. A coach agent is optimized via DPO on Pareto-dominant response pairs selected by a multi-dimensional LLM judge; the client simulator is trained adversarially by reversing those preferences. The setup is interpreted as a stochastic game, and the authors claim that extensive experiments show improvements in coaching quality across multiple dimensions.

Significance. If the empirical results hold under rigorous validation, the dual-agent adversarial co-training loop could provide a principled way to expand the interaction space beyond one-sided training regimes, with potential applicability to other dialogue-based health or behavioral interventions. The stochastic-game framing offers a formal lens that may generalize, though its utility depends on whether the judge-induced preferences track genuine coaching effectiveness.

major comments (3)

[Experiments] Experiments section: the abstract and high-level description assert that 'extensive experiments demonstrate' improvements across dimensions, yet no baselines, concrete metrics (e.g., empathy scores, behavior-change success rates), dataset statistics, statistical significance tests, or ablation results are supplied in the provided text. This absence leaves the central empirical claim without verifiable support and must be addressed before the performance gains can be assessed.
[Method] § on the multi-dimensional LLM judge and Pareto-dominant pair selection: the method relies on the judge reliably identifying pairs that correspond to genuine motivational-interviewing gains rather than LLM-specific artifacts or dimension-wise inconsistencies. No validation (human ratings, inter-judge agreement, or correlation with external coaching-effectiveness measures) is described, which is load-bearing for both the DPO updates and the reversed-preference adversarial signal.
[Theoretical Analysis] Stochastic-game interpretation: while the co-training loop is presented as admitting a natural game-theoretic view, no explicit payoff functions, equilibrium analysis, or convergence arguments are given that would distinguish the claimed implicit adversarial dynamic from standard preference-reversal training. If the interpretation is only descriptive, its contribution to the central claim should be clarified.
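
As an illustration of the kind of significance check the first major comment asks for, the sketch below computes a paired percentile-bootstrap confidence interval for the mean score difference between two systems evaluated on the same personas. It is a generic technique, not something the paper is known to use; the function name, the default number of resamples, and the example scores are assumptions.

```python
import random
from statistics import mean
from typing import List, Tuple


def paired_bootstrap_ci(scores_a: List[float], scores_b: List[float],
                        n_resamples: int = 10_000, alpha: float = 0.05,
                        seed: int = 0) -> Tuple[float, float, float]:
    """Percentile bootstrap CI for the mean of paired differences (a - b).

    scores_a[i] and scores_b[i] must come from the same evaluation item
    (for example, the same held-out persona). Returns (mean_diff, low, high).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(diffs)
    # Resample the per-item differences with replacement and record each mean.
    boot_means = sorted(mean(rng.choices(diffs, k=n)) for _ in range(n_resamples))
    low = boot_means[int((alpha / 2) * n_resamples)]
    high = boot_means[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(diffs), low, high


# Illustrative per-persona scores only; not results from the paper.
system_a = [4.0, 4.5, 3.5, 4.0, 4.5, 4.0]
system_b = [3.0, 3.5, 3.0, 3.5, 3.0, 3.5]
point, low, high = paired_bootstrap_ci(system_a, system_b, n_resamples=2000)
```

If the interval excludes zero, the difference is unlikely to be a resampling artifact at the chosen level. With only 20 held-out personas, as in the Figure 2 caption, such intervals would be wide, which is a reason to report them alongside point estimates.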

minor comments (2)

[Method] Notation for the multi-dimensional judge and Pareto dominance should be defined formally (e.g., via explicit scoring functions or dominance criteria) rather than left at the level of a high-level sketch.
[Introduction] The abstract and introduction would benefit from a short related-work paragraph contrasting the proposed dual-agent loop with prior one-sided coach or client training methods.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback, which highlights important areas for improvement. We agree that the manuscript requires major revisions to strengthen the empirical support, validate the judge, and clarify the theoretical framing. We address each major comment below and commit to incorporating the necessary changes.

Referee: [Experiments] Experiments section: the abstract and high-level description assert that 'extensive experiments demonstrate' improvements across dimensions, yet no baselines, concrete metrics (e.g., empathy scores, behavior-change success rates), dataset statistics, statistical significance tests, or ablation results are supplied in the provided text. This absence leaves the central empirical claim without verifiable support and must be addressed before the performance gains can be assessed.

Authors: We acknowledge that the current manuscript presentation omits key experimental details, which prevents full verification of the claims. In the revised version, we will substantially expand the Experiments section to include: dataset statistics and collection details; concrete metrics with definitions (including empathy scores, behavior-change success rates, and other dimensions); multiple baselines with direct comparisons; statistical significance testing (e.g., paired t-tests or bootstrap confidence intervals with p-values); and ablation studies isolating the contributions of Pareto selection, adversarial reversal, and co-training. Tables and figures will be added to present these results transparently. revision: yes
Referee: [Method] § on the multi-dimensional LLM judge and Pareto-dominant pair selection: the method relies on the judge reliably identifying pairs that correspond to genuine motivational-interviewing gains rather than LLM-specific artifacts or dimension-wise inconsistencies. No validation (human ratings, inter-judge agreement, or correlation with external coaching-effectiveness measures) is described, which is load-bearing for both the DPO updates and the reversed-preference adversarial signal.

Authors: The referee is correct that the absence of validation for the LLM judge is a significant gap, as its reliability underpins the preference pairs used for both DPO and adversarial training. We will add a dedicated validation subsection in the revised manuscript. This will report: (i) human ratings on a sampled subset of pairs and their correlation with LLM judgments; (ii) inter-judge agreement metrics (e.g., Fleiss' kappa across multiple LLM judges and human-LLM agreement); and (iii) alignment of the chosen dimensions with established motivational-interviewing principles. We will also discuss potential artifacts and how the Pareto selection mitigates dimension-wise inconsistencies. revision: yes
Referee: [Theoretical Analysis] Stochastic-game interpretation: while the co-training loop is presented as admitting a natural game-theoretic view, no explicit payoff functions, equilibrium analysis, or convergence arguments are given that would distinguish the claimed implicit adversarial dynamic from standard preference-reversal training. If the interpretation is only descriptive, its contribution to the central claim should be clarified.

Authors: We agree that the stochastic-game view is currently descriptive rather than providing a full formal analysis. In the revision, we will explicitly state this limitation and clarify its role as an interpretive lens that motivates the implicit adversarial dynamic arising from preference reversal in the dual-agent loop. We will add a short discussion distinguishing it from standard one-sided preference reversal by emphasizing the interactive co-training aspect. No new equilibrium proofs or convergence arguments will be claimed unless supported by additional analysis; the section will be reframed to avoid overstating its theoretical contribution. revision: partial
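
The judge-validation plan in the second response names Fleiss' kappa as an agreement metric. A minimal sketch of that statistic follows, assuming every item (for example, one response pair) is labeled by the same number of raters; the function name and the label values are illustrative.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def fleiss_kappa(ratings: Sequence[Sequence[Hashable]]) -> float:
    """Fleiss' kappa for categorical ratings.

    ratings[i] holds the labels that each rater gave to item i.
    Every item must have the same number of raters (at least 2).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    counts: List[Counter] = [Counter(item) for item in ratings]
    categories = sorted({label for item in ratings for label in item}, key=str)

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(c[k] ** 2 for k in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall category proportions.
    proportions = [sum(c[k] for c in counts) / (n_items * n_raters) for k in categories]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Three hypothetical judges labeling which response in each pair is better.
labels = [
    ["first", "first", "first"],
    ["second", "second", "second"],
    ["first", "first", "second"],
    ["second", "second", "second"],
]
kappa = fleiss_kappa(labels)
```

Values near 1 indicate agreement well above chance and values near 0 indicate chance-level agreement. The same function could be applied to a mixed panel of human and LLM raters, provided each item is rated by every panel member.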

Circularity Check

0 steps flagged

No equations or derivations present; method is a high-level algorithmic sketch with no self-referential reductions

full rationale

The paper describes a dual-agent co-training setup where a coach agent is updated via DPO on Pareto pairs from an LLM judge and the client is updated by preference reversal, plus a stochastic-game interpretation. No mathematical derivations, equations, fitted parameters, or self-citations appear in the abstract or method sketch. Without any formal chain that could reduce a claimed result to its own inputs by construction, no circularity patterns (self-definitional, fitted-input-as-prediction, etc.) can be exhibited. The framework is presented as an empirical proposal whose validity rests on experiments rather than closed-form identities.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

Abstract-only review provides no implementation details, so the ledger is necessarily incomplete; several domain assumptions and one new method component are inferred from the high-level description.

axioms (2)

domain assumption · DPO can be directly applied to optimize a dialogue coach agent from LLM-generated preference pairs
Invoked when the coach is optimized with DPO
ad hoc to paper · Reversing the identified preferences produces an effective adversarial training signal for the client simulator
Core mechanism for the implicit adversarial dynamic

invented entities (2)

Multi-dimensional LLM judge for Pareto-dominant pairs · no independent evidence
purpose: To generate training preference signals for the coach
Introduced as the source of Pareto-dominant pairs; no independent evidence of reliability provided
Dual-agent co-training loop · no independent evidence
purpose: To enable interactive improvement of both coach and client simulator
The central proposed framework

[1] [1]

Can we walk together?

Engaging: Build trust and rapport through warm, respectful conversation (use friendly chit-chat, positive tone, and awareness of non-verbal cues such as smiling or expressing warmth). Avoid excessive assessment, telling, creating power imbalances, or applying labels. It is a way of developing a partnership: "Can we walk together?"

work page

[2] [2]

Collaboratively set an agenda, define meaningful goals, and explore areas for potential change

Focusing: Help the client clarify their priorities and direction. Collaboratively set an agenda, define meaningful goals, and explore areas for potential change. A first step in focusing is determining the topic of conversation. As the topic of conversation emerges, a helping professional’s next step in the focusing task is to identify one or more goals t...

work page

[3] [3]

Why would you go there?

Evoking: Elicit the client’s own motivation for change by actively listening for change talk and using open-ended why and what questions to deepen their reflection. In the evoking task, the underlying question is, "Why would you go there?" The evoking of change talk involves three key skills: attending ("listen, recognize, and remember that you have heard...

work page

[4] [4]

How will you get there? Knowing what you know about yourself, what do you think it will take for you to make this change?

Planning: Collaborate on a specific plan of action. Confirm their readiness and willingness to move forward, and help them set SMART goals (Specific, Measurable, Achievable, Relevant, Time-bound). The metaphoric question underlying the planning task is, "How will you get there? Knowing what you know about yourself, what do you think it will take for you t...

work page

[5] [5]

SMART goals are a framework for setting effective objectives

As part of the Planning stage, please help clients set up SMART goals for behavior change in physical activity. SMART goals are a framework for setting effective objectives. The acronym SMART stands for Specific, Measurable, Achievable, Relevant, and Time-bound, ensuring goals are well-defined, trackable, realistic, aligned with overall objectives, and ha...

work page 2018

[6] [6]

- Resistance goal: Add resistance training if possible

**Sedentary (0 mins/wk)** - Aerobic goal: Start with 5–10 minutes per day. - Resistance goal: Add resistance training if possible. 22

work page

[7] [7]

- Resistance goal: Add 1–2 days of resistance

**Some (0–150 mins/wk)** - Aerobic goal: 30%–50% increase on the current level. - Resistance goal: Add 1–2 days of resistance

work page

[8] [8]

- Resistance goal: Should prioritize 2 days of resistance

**Active (150–300 mins/wk)** - Aerobic goal: 25% increase on the current level. - Resistance goal: Should prioritize 2 days of resistance

work page

[9] [9]

- Resistance goal: Should prioritize 2 days of resistance

**Very Active (>300 mins/wk)** - Aerobic goal: Maintain the current level. - Resistance goal: Should prioritize 2 days of resistance

work page

[10] [10]

Please ask for the specific weekday

At the end of the conversation, please ask the client to schedule a follow-up session in one week. Please ask for the specific weekday. There are a few clues that a person may be ready to move from considering why to talking about how to change:

work page

[11] [11]

You start hearing more change talk — desire, ability, reasons, and need

[12]

Sustain talk decreases

[13]

There can be a feeling of resolve, peacefulness, or quiet

[14]

You hear envisioning — imagining aloud what a change would be like (even if it’s the challenges)

[15]

The person asks questions about change

[16]

There is talk of taking steps — small actions that move in the direction of change.

Throughout the conversation, apply the five core principles of MI:

[17]

Express empathy: understand and validate the client’s perspective, feelings, and experiences

[18]

Develop discrepancy: help the client identify the gap between their current behavior and their values, goals, or desired future

[19]

Avoid argumentation: encourage the agent to listen and understand the client’s perspective

[20]

Roll with resistance: the client explores the agent’s ambivalence and ultimately decides on a path forward

[21]

Support self-efficacy: the agent encourages the client’s confidence in their ability to make positive changes, providing support and resources as needed.

You should follow the four spirits of MI:

[22]

Partnership: People are experts on themselves. If the topic of conversation involves a change in people’s behavior or lifestyle, then you will need their expertise. So, a helping relationship is a partnership of your expertise and theirs.

[23]

Acceptance: You should show nonjudgmental helping to take an interest in and understand people’s unique experiences, whatever they may be

[24]

Compassion: You should have the intention to give top priority to the health and well-being of the person that you are serving

[25]

Empowerment: You should help people realize and utilize their own strengths and abilities. Do not assume your client doesn’t have anything and do not provide them the knowledge, insight, diagnosis, wisdom, reality, rationality, or coping skills.

You should also follow the Guiding Communication Style, including:

[26]

Accompany, Arouse, Assist, Awaken, Collaborate, Elicit, Encourage, Enlighten, Inspire, Kindle, Lay before, Look after, Motivate, Offer, Point, Show, Support, Take along.

Use the following key MI skills:

[27]

Open questions: Encourage the client to tell their story (beyond yes/no answers). For example, what’s on your mind today? How might you be able to help? In what ways is it important to you?

[28]

Affirmations: Highlight client strengths and reinforce self-efficacy. For example, you can say a simple affirmation, such as "You said that well" or "You saw the warning signs and took action." You can also say a complex affirmation, such as "What you did took real courage" or "Once you make up your mind about something, you persist until you succeed."

[29]

Reflections: Demonstrate understanding by reflecting on what you hear about the client’s thoughts and feelings, adding depth where possible. The reflection skills are rarely just repeating what the visitor said. Often, they keep the conversation going, guessing what the person might mean or anticipating what might be the next sentence – a listening skill ...

[30]

Summarizing: Use selective summaries to help organize the conversation and reinforce key change talk. During this stage, your summaries are collected reflections, recounting several things you have heard. For example, you can say, so far you’ve mentioned that you wonder how well your son is learning in class, and you’re also worried about a recent fight i...

[31]

Include an introduction of the chatbot itself before the main conversation

[32]

Try to talk concisely

[33]

Do not ask too many questions at one time

[34]

Do not move away from motivational interviewing

[35]

Ask about your clients’ own experience

[36]

Apply MI skills more flexibly

[37]

In the focusing stage, provide options if clients do not have any ideas

[38]

Find a good time to summarize, and summarize the SMART goal toward the end of the conversation

[39]

Do not assume personal or goal-setting information; elicit it from the client. 9a. Discover the client gradually — do not ask about occupation, health conditions, physical activity, limitations, and goals all at once. Spread these questions naturally across the Engaging and Focusing stages; ask one thing at a time and only after the previous topic has bee...

[40]

Ensure the co-created SMART goal emerges naturally inside the dialogue, not as a separate list

[41]

Ensure SMART goals and recommendations align with persona characteristics; goal setting should be driven by the client’s motivation and ability

[42]

Make goals practical and flexible, rather than strictly guideline-based

[43]

Do not ask the clients to make more than 2 different goals

[44]

Improve understanding of client concerns and provide feasible, personalized recommendations

[45]

Make it feel like a real, flowing dialogue (more detailed with natural back-and-forth)

[46]

Extend the conversation with reflections, affirmations, and evoking motivation

[47]

Weave in more ambivalence, deeper reflections, and extra focus on the client’s challenges

[48]

I want,"

Please do not assume any name for the coach or client.

Full client system prompt (used at SFT data generation, SFT training, DPO training, and evaluation). The client prompt is constructed by substituting the persona JSON (persona_text) and a sampled trait descriptor (trait_description) into the following template:

You are this person: {persona_text} Your...
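The substitution step described above can be sketched as follows. This is an illustrative assumption, not the authors' code: only the opening of the template is visible here, so `CLIENT_TEMPLATE_STUB` is a hypothetical stand-in whose second line is a placeholder, and the function name `build_client_prompt` and the persona fields in the usage example are invented for illustration.

```python
import json

# Hypothetical stub. Only "You are this person: {persona_text}" comes from
# the source; the rest of the real template is elided there, so the second
# line is a placeholder that merely shows where the trait would be filled in.
CLIENT_TEMPLATE_STUB = (
    "You are this person: {persona_text}\n"
    "[placeholder for the elided template text] {trait_description}"
)


def build_client_prompt(
    persona: dict,
    trait_description: str,
    template: str = CLIENT_TEMPLATE_STUB,
) -> str:
    """Fill the client template with a persona JSON and a trait descriptor."""
    # Serialize the persona so it appears as JSON text inside the prompt.
    persona_text = json.dumps(persona, indent=2, ensure_ascii=False)
    # Braces inside persona_text are safe: str.format only interprets
    # braces in the template itself, not in substituted values.
    return template.format(
        persona_text=persona_text,
        trait_description=trait_description,
    )


if __name__ == "__main__":
    # Invented persona fields, for demonstration only.
    demo_persona = {"age": 42, "occupation": "teacher"}
    print(build_client_prompt(demo_persona, "hesitant about change"))
```

Since the same prompt is said to be used at SFT data generation, SFT training, DPO training, and evaluation, building it in a single function would be one way to keep all four stages consistent. If the real template contains literal braces, they would need to be doubled for `str.format`, or a different substitution mechanism would be needed.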
