pith. machine review for the scientific record.

arxiv: 2604.20382 · v1 · submitted 2026-04-22 · 💻 cs.CL

Recognition: unknown

Graph2Counsel: Clinically Grounded Synthetic Counseling Dialogue Generation from Client Psychological Graphs

Pith reviewed 2026-05-10 00:19 UTC · model grok-4.3

classification 💻 cs.CL
keywords synthetic data · counseling dialogues · psychological graphs · LLM fine-tuning · mental health · dialogue generation · safety evaluation

The pith

Structuring client psychological states into graphs produces more realistic and safer synthetic counseling dialogues.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework that builds graphs connecting a client's thoughts, emotions, and behaviors to guide the creation of synthetic counseling sessions. These graphs direct a sequence of prompts that incorporate professional counseling strategies, resulting in 760 dialogues from 76 client graphs spanning diverse client profiles. Mental health experts judged the sessions superior to earlier synthetic collections in specificity, authenticity, counselor competence, conversational flow, and safety, with substantial agreement among raters. Training an open language model on this collection raises its performance on two counseling benchmarks. The work addresses the shortage of usable real counseling data caused by privacy constraints by supplying structured synthetic alternatives.

Core claim

Graph2Counsel generates synthetic counseling sessions from Client Psychological Graphs encoding relationships among thoughts, emotions, and behaviors, employing structured prompting with counselor strategies to create 760 sessions. Expert evaluation demonstrates outperformance over prior datasets in specificity, counselor competence, authenticity, conversational flow, and safety with Krippendorff's α of 0.70. Fine-tuning an open-source model on the dataset yields improvements on CounselingBench and CounselBench.

What carries the argument

Client Psychological Graphs (CPGs) that encode relationships among clients' thoughts, emotions, and behaviors, directing a structured prompting pipeline for dialogue generation.
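
As a rough illustration of what such a graph could look like in code, the sketch below stores labeled nodes and edges and serializes them as a node and edge list for a prompt. Everything here is an assumption for illustration: the class name `ClientGraph`, the three node kinds (taken from the summary's wording), the choice of directed edges, and the sample client content are hypothetical and are not the paper's schema or data.

```python
from dataclasses import dataclass, field

# Hypothetical node kinds, borrowed from the phrase "thoughts, emotions,
# and behaviors"; the paper's actual schema may differ.
NODE_KINDS = {"thought", "emotion", "behavior"}


@dataclass
class ClientGraph:
    """Toy Client Psychological Graph: labeled nodes plus directed edges."""

    nodes: dict[str, str] = field(default_factory=dict)  # label -> kind
    edges: list[tuple[str, str]] = field(default_factory=list)  # (source, target)

    def add_node(self, label: str, kind: str) -> None:
        if kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.nodes[label] = kind

    def add_edge(self, source: str, target: str) -> None:
        # Both endpoints must already exist, so the graph stays consistent.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((source, target))

    def to_prompt_text(self) -> str:
        """Serialize as a plain list of nodes and edges for use in a prompt."""
        node_lines = [f"- {label} ({kind})" for label, kind in self.nodes.items()]
        edge_lines = [f"- {s} -> {t}" for s, t in self.edges]
        return "Nodes:\n" + "\n".join(node_lines) + "\nEdges:\n" + "\n".join(edge_lines)


# Invented example content, not drawn from the dataset.
graph = ClientGraph()
graph.add_node("I will fail the presentation", "thought")
graph.add_node("anxiety", "emotion")
graph.add_node("avoids preparing", "behavior")
graph.add_edge("I will fail the presentation", "anxiety")
graph.add_edge("anxiety", "avoids preparing")
```

Calling `graph.to_prompt_text()` yields a text block that could be pasted into a generation prompt; how the authors actually encode graphs for the model is not described in this summary.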

If this is right

  • The dataset provides higher quality training material for adapting LLMs to counseling tasks.
  • Fine-tuned models show enhanced results on established counseling benchmarks.
  • Generated dialogues exhibit better psychological consistency and safety.
  • The approach supports generation across diverse client profiles using varied graphs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This graph-based method may help overcome data scarcity in other privacy-sensitive conversational AI applications.
  • The graphs could potentially be used to simulate sessions for training new human counselors.
  • If the quality gains hold, it might accelerate the creation of reliable AI-assisted mental health tools.

Load-bearing premise

Expert evaluators can correctly judge the psychological consistency, safety, and authenticity of the generated dialogues, and the Client Psychological Graphs accurately represent real client states without adding artifacts.

What would settle it

Two results would indicate that the method does not deliver the claimed benefits: an expert re-evaluation that finds no advantage, or a disadvantage, for the new dataset on the rated dimensions; or fine-tuned models that fail to improve, or get worse, on the benchmarks.

Figures

Figures reproduced from arXiv: 2604.20382 by Aishik Mandal, Clarissa W. Ong, Hiba Arnaout, Iryna Gurevych, Juliet Bockhorst, Kate Sheehan, Rachael Moldow, Tanmoy Chakraborty.

Figure 1: From a real therapy transcript, we extract a Client Psychological Graph (CPG) (
Figure 2: Structured knowledge inferred from real counseling sessions: CPGs, CPG-derived client profiles, and
Figure 3: Prompt used to generate a single CPG-grounded client profile.
Figure 4: System Prompt used to generate diverse CPG-grounded client profiles.
Figure 5: User prompt used to generate diverse CPG-grounded client profiles.
Figure 6: Global constraints for counseling dialogue generation.
Figure 7: Counselor guidelines (designed with direct input from clinicians) for counseling dialogue generation.
Figure 8: Client guidelines (designed with direct input from clinicians) for counseling dialogue generation.
Figure 9: Common Pitfalls (designed with direct input from clinicians) for counseling dialogue generation to avoid.
Figure 10: System Prompt used to generate synthetic counseling sessions using base prompting technique.
Figure 11: User prompt used to generate synthetic counseling sessions with CPG input and base prompting
Figure 12: User prompt used to generate synthetic counseling sessions with CPG-grounded client profile input and
Figure 13: User prompt used to generate synthetic counseling sessions with both CPG and CPG-grounded client
Figure 14: System prompt used to generate synthetic counseling sessions with Guided Counseling prompting
Figure 15: User prompt used to generate synthetic counseling sessions with CPG as input and Guided Counseling
Figure 16: User prompt used to generate synthetic counseling sessions with CPG-grounded client profile as input
Figure 17: User prompt used to generate synthetic counseling sessions with CPG and CPG-grounded client profile
Figure 18: System prompt used to generate synthetic counseling sessions with GC+CoT prompting technique.
Figure 19: User prompt used to generate synthetic counseling sessions with CPG as input and GC+CoT prompting
Figure 20: User prompt used to generate synthetic counseling sessions with CPG-grounded client profile as input
Figure 21: User prompt used to generate synthetic counseling sessions with CPG and CPG-grounded client profile
Figure 22: System prompt used to generate feedback for sessions generated with GC+MA prompting technique.
Figure 23: System prompt used to regenerate sessions with GC+MA prompting technique.
Figure 24: User prompt used to generate feedback for sessions generated with CPG as input with GC+MA prompting
Figure 25: User prompt used to generate revised sessions with CPG as input with GC+MA prompting technique.
Figure 26: User prompt used to generate feedback for sessions generated with CPG-grounded client profile as input
Figure 27: User prompt used to generate revised sessions with CPG-grounded client profile as input with GC+MA
Figure 28: User prompt used to generate feedback for sessions generated with CPG and CPG-grounded client profile
Figure 29: User prompt used to generate revised sessions with CPG and CPG-grounded client profile as input with
Figure 30: Prompt used to extract counselor strategies used from real counseling session.
Figure 31: Prompt used to QLoRA fine-tune a Llama3-8B-Instruct model using data from SQPsychConv dataset.
Figure 32: Prompt used to QLoRA fine-tune a Llama3-8B-Instruct model using data from MAGneT dataset.
Figure 33: Prompt used to QLoRA fine-tune a Llama3-8B-Instruct model using data from Graph2Counsel dataset.
Figure 34: Prompt used to evaluate the generated counseling sessions on CTRS.
Figure 35: Prompt used to evaluate the generated counseling sessions on WAI.
Figure 36: Prompt used to extract client issues from counseling session dialogues in SQPsychConv dataset.
Figure 37: A sample dialogue excerpt from Graph2Counsel: Client Alex, with feedback from experts
Figure 38: A sample dialogue excerpt from Graph2Counsel: Client David.
Figure 39: Prompt used to evaluate the faithfulness of the generated counseling sessions to the input CPG.
Figure 40: Prompt used to evaluate the faithfulness of the generated counseling sessions to the input CPG-grounded
Figure 41: Prompt used to generate model responses to questions in CounselBench-Eval and CounselBench-Adv.
Figure 42: Prompt used to evaluate model responses on CounselBench-Eval.
Figure 43: Prompt used to evaluate model responses on CounselBench-Adv.
Figure 44: Prompt used to generate model responses to questions in CounselingBench using Zero-Shot (ZS)
Figure 45: Prompt used to generate model responses to questions in CounselingBench using Few-Shot (FS)
Figure 46: Prompt used to generate model responses to questions in CounselingBench using Few-Shot Chain-of
Original abstract

Rising demand for mental health support has increased interest in using Large Language Models (LLMs) for counseling. However, adapting LLMs to this high-risk safety-critical domain is hindered by the scarcity of real-world counseling data due to privacy constraints. Synthetic datasets provide a promising alternative, but existing approaches often rely on unstructured or semi-structured text inputs and overlook structural dependencies between a client's cognitive, emotional, and behavioral states, often producing psychologically inconsistent interactions and reducing data realism and quality. We introduce Graph2Counsel, a framework for generating synthetic counseling sessions grounded in Client Psychological Graphs (CPGs) that encode relationships among clients' thoughts, emotions, and behaviors. Graph2Counsel employs a structured prompting pipeline guided by counselor strategies and CPG, and explores prompting strategies including CoT (Wei et al., 2022) and Multi-Agent Feedback (Li et al., 2025a). Graph2Counsel produces 760 sessions from 76 CPGs across diverse client profiles. In expert evaluation, our dataset outperforms prior datasets on specificity, counselor competence, authenticity, conversational flow, and safety, with substantial inter-annotator agreement (Krippendorff's $\alpha$ = 0.70). Fine-tuning an open-source model on this dataset improves performance on CounselingBench (Nguyen et al., 2025) and CounselBench (Li et al., 2025b), showing downstream utility. We also make our code and data public.
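
For readers weighing the reported agreement figure, the standard definition of Krippendorff's α may help. This is the textbook form, not a formula taken from the paper, and the summary does not say which distance metric (nominal, ordinal, or interval) the authors used:

$$
\alpha = 1 - \frac{D_o}{D_e}
$$

Here $D_o$ is the disagreement observed among raters and $D_e$ is the disagreement expected if ratings were assigned by chance. Perfect agreement gives $\alpha = 1$ and chance-level agreement gives $\alpha = 0$. Under this definition, the reported $\alpha = 0.70$ corresponds to $D_o / D_e = 0.30$, meaning the observed disagreement is 30% of what chance alone would produce.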

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces Graph2Counsel, a framework for generating synthetic counseling dialogues grounded in Client Psychological Graphs (CPGs) that encode relationships among clients' cognitive, emotional, and behavioral states. Using structured prompting with counselor strategies, Chain-of-Thought, and multi-agent feedback, it produces 760 sessions from 76 CPGs across diverse profiles. Expert evaluation indicates superiority over prior datasets in specificity, counselor competence, authenticity, conversational flow, and safety, with Krippendorff's α = 0.70. Fine-tuning an open-source LLM on the dataset yields improvements on CounselingBench and CounselBench, and the authors make code and data publicly available.

Significance. Should the expert evaluations and downstream improvements prove robust, this approach could significantly advance the creation of high-quality, privacy-preserving synthetic data for LLM training in counseling, addressing key challenges in safety-critical mental health applications. The emphasis on structural dependencies in client states and the public release of resources are notable strengths for reproducibility and further research.

major comments (3)
  1. [Expert Evaluation] The claim that the generated dataset outperforms prior ones on specificity, counselor competence, authenticity, conversational flow, and safety rests entirely on expert human ratings (Krippendorff’s α = 0.70). The manuscript provides no information on whether raters are licensed clinicians, whether evaluation was blinded to dataset origin, or whether ratings correlate with external criteria such as real-session transcripts or client outcomes. This is load-bearing for the central claim of improved psychological consistency and safety.
  2. [CPG Construction and Dataset Generation] The construction of the Client Psychological Graphs (CPGs) is described at a high level but lacks details on whether it is manual, LLM-assisted, or hybrid, and no validation against actual client data or assessment of introduced artifacts is reported. Any such artifacts could be reproduced in the generated dialogues without detection by raters, undermining the grounding and realism claims.
  3. [Downstream Evaluation] The abstract and results report that fine-tuning improves performance on CounselingBench and CounselBench but supply no quantitative details, effect sizes, baseline comparisons, or statistical significance tests. This prevents assessment of the practical magnitude of the downstream utility.
minor comments (2)
  1. [Abstract] The abstract could briefly note the number of expert annotators and the exact rating scales used, in addition to the reported α value, to strengthen the inter-annotator agreement claim.
  2. [Method] Even with the public code release, including one or two example prompt templates from the structured prompting pipeline in the main text or appendix would improve immediate readability and reproducibility.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their detailed and constructive review of our manuscript on Graph2Counsel. Their comments are valuable for improving the clarity and robustness of our claims. We address each major comment below, indicating the revisions we intend to make to the manuscript.

Point-by-point responses
  1. Referee: [Expert Evaluation] The claim that the generated dataset outperforms prior ones on specificity, counselor competence, authenticity, conversational flow, and safety rests entirely on expert human ratings (Krippendorff’s α = 0.70). The manuscript provides no information on whether raters are licensed clinicians, whether evaluation was blinded to dataset origin, or whether ratings correlate with external criteria such as real-session transcripts or client outcomes. This is load-bearing for the central claim of improved psychological consistency and safety.

    Authors: We agree that more details on the expert evaluation are necessary to support our claims. The manuscript will be revised to include information on the raters' qualifications and expertise in counseling psychology, the blinding procedure employed during rating, and any efforts or limitations in correlating ratings with external criteria like real transcripts or outcomes. We will also highlight the role of the structured prompting and multi-agent feedback in promoting psychological consistency and safety. These additions will address the load-bearing nature of the evaluation for our central claims.
    Revision: yes

  2. Referee: [CPG Construction and Dataset Generation] The construction of the Client Psychological Graphs (CPGs) is described at a high level but lacks details on whether it is manual, LLM-assisted, or hybrid, and no validation against actual client data or assessment of introduced artifacts is reported. Any such artifacts could be reproduced in the generated dialogues without detection by raters, undermining the grounding and realism claims.

    Authors: We acknowledge the high-level description of CPG construction in the current manuscript. We will expand this section to detail the construction methodology, including the extent to which it is manual, LLM-assisted, or hybrid, and describe the validation processes used, such as expert review for consistency with psychological principles. We will also include an analysis of potential artifacts and how the generation pipeline, including Chain-of-Thought and multi-agent feedback, helps to detect and mitigate them. While direct validation against real client data is precluded by privacy considerations, we will discuss this as a limitation and explain the grounding mechanisms employed.
    Revision: yes

  3. Referee: [Downstream Evaluation] The abstract and results report that fine-tuning improves performance on CounselingBench and CounselBench but supply no quantitative details, effect sizes, baseline comparisons, or statistical significance tests. This prevents assessment of the practical magnitude of the downstream utility.

    Authors: We concur that quantitative details are essential for evaluating the downstream utility. The revised manuscript will include specific performance numbers, effect sizes, baseline comparisons (e.g., against models fine-tuned on other synthetic datasets), and statistical tests for significance on both CounselingBench and CounselBench. These will be added to the results section to demonstrate the practical improvements achieved by fine-tuning on the Graph2Counsel dataset.
    Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical generation and evaluation pipeline

full rationale

The paper presents an empirical framework that constructs Client Psychological Graphs, applies structured prompting (including external techniques like Chain-of-Thought) to generate dialogues, evaluates them via expert ratings, and tests downstream utility on separate benchmarks. No mathematical derivations, equations, fitted parameters, or predictions appear in the provided text. All load-bearing claims rest on external expert judgments and benchmark results rather than any self-referential reduction, self-citation chain, or ansatz smuggled through prior work by the same authors. The derivation chain is therefore self-contained and non-circular.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The framework rests on domain assumptions about LLM controllability and evaluator reliability rather than new mathematical axioms or fitted parameters.

axioms (2)
  • domain assumption: Structured graphs plus counselor strategies can guide LLMs to produce psychologically consistent counseling dialogues
    Core premise of the prompting pipeline described in the abstract.
  • domain assumption: Human experts can reliably assess psychological consistency, safety, and authenticity in generated dialogues
    Required for the quality and safety claims.
invented entities (1)
  • Client Psychological Graphs (CPGs): no independent evidence
    purpose: To encode relationships among clients' thoughts, emotions, and behaviors for grounding dialogue generation
    New representational structure introduced by the paper.

pith-pipeline@v0.9.0 · 5594 in / 1447 out tokens · 45691 ms · 2026-05-10T00:19:07.374615+00:00 · methodology


Reference graph

Works this paper leans on

58 extracted references · 6 canonical work pages · 1 internal anchor

  1. [1]

    Julian Burger, Christina Ralph-Nearman, and Cheri A

    A novel approach for constructing personalized networks from longitudinal perceived causal relations. Behaviour Research and Therapy, 173:104456. Julian Burger, Christina Ralph-Nearman, and Cheri A. Levinson. 2022. Integrating clinician and patient case conceptualization with momentary assessment data to construct idiographic networks: Moving toward pe...

  2. [2]

    In The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024

    Talk like a graph: Encoding graphs for large language models. In The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net. Aaron J. Fisher, Hannah G. Bosley, Katya C. Fernandez, Jonathan W. Reeves, Peter D. Soyster, Allison E. Diamond, and Jonathan Barkin. 2019. Open trial of a personali...

  3. [3]

    Zhijun Guo, Alvina Lai, Johan H Thygesen, Joseph Farrington, Thomas Keen, Kezhi Li, and 1 others

    OpenReview.net. Zhijun Guo, Alvina Lai, Johan H Thygesen, Joseph Farrington, Thomas Keen, Kezhi Li, and 1 others

  4. [4]

    Stephen N

    Large language models for mental health applications: systematic review. JMIR mental health, 11(1):e57400. Stephen N. Haynes, William H. O’Brien, and Antonio Godoy. 2020. A proposed model for the psychometric evaluation of clinical case formulations with quantified causal diagrams. Psychological Assessment, 32(6):541–552. Place: US Publisher: American...

  5. [5]

    arXiv preprint arXiv:2506.08584 (2025)

    Personalizing eating disorder treatment using idiographic models: An open series trial. Journal of Consulting and Clinical Psychology, 91(1):14–28. Place: US Publisher: American Psychological Association. Xiang Li, Duyi Pan, Hongru Xiao, Jiale Han, Jing Tang, Jiabao Ma, Wei Wang, and Bo Cheng. 2025a. Dialogueagents: A hybrid agent-based speech synthe...

  6. [6]

    In Findings of the Association for Computational Linguistics: ACL 2025, pages 13750–13770, Vienna, Austria

    Eeyore: Realistic depression simulation via expert-in-the-loop supervised and preference optimization. In Findings of the Association for Computational Linguistics: ACL 2025, pages 13750–13770, Vienna, Austria. Association for Computational Linguistics. Yang Liu, Dan Iter, Yichong Xu, Shuohang Wang, Ruochen Xu, and Chenguang Zhu. 2023b. G-eval: NLG e...

  7. [7]

    Clarissa W

    Association for Computational Linguistics. Clarissa W. Ong, Hiba Arnaout, Kate Sheehan, Estella Fox, Eugen Owtscharow, and Iryna Gurevych. 2025a. Using large language models to create personalized networks from therapy sessions. CoRR, abs/2512.05836. Clarissa W. Ong, Kate Sheehan, Adam J.D. Mann, and Estella Fox. 2025b. Examining the effects of process...

  8. [8]

    In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2438–2459, Abu Dhabi, United Arab Emirates

    D4: a Chinese dialogue dataset for depression-diagnosis-oriented chat. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2438–2459, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics. Congchi Yin, Feng Li, Shu Zhang, Zike Wang, Jun Shao, Piji Li, Jianhua Chen, and Xun Jiang. 2025. ...

  9. [9]

    I am not sure

    and the Working Alliance Inventory (WAI) (Horvath and Greenberg, 1989). Both metrics are scored by GPT-4o (OpenAI, 2024) in an LLM-as-a-judge setup evaluating the generated counseling sessions. 8DeepSpeed 9Hugging Face 15 Strategy Evidence Therapy Modality Alternative Perspective Counselor: So, now that you have a bit this stance, what are some other a...

  10. [11]

    Presenting Problem - What issue/symptoms do you want to discuss? (If there are multiple issues, discuss with the counselor to determine the most important or first issue to address) - When did the problem/symptoms start? - What was the stress level when the problem/symptoms first occurred? (What do you think might be the cause?) - How has the problem/symp...

  11. [15]

    24 CPG-grounded diverse Client Profile Generation System Prompt Your task is to generate diverse synthetic client intake forms for mental health counseling sessions

    Is there anyone you can talk to or get help from when you encounter difficulties or problems? ##Example output 1: {example_output_1} ##Example output 2: {example_output_2} Client Intake Form: Figure 3: Prompt used to generate a single CPG-grounded client profile. 24 CPG-grounded diverse Client Profile Generation System Prompt Your task is to generate dive...

  12. [16]

    You must generate exactly 10 distinct client intake forms

  13. [17]

    Each profile must be unique in: -name, age, gender, background -symptom expression and wording -life history, stressors, and coping attempts

  14. [18]

    DO NOT copy, paraphrase, or structurally reuse the example profiles

  15. [19]

    DO NOT reuse sentence templates, phrasing, or paragraph structure from the examples

  16. [20]

    DO NOT repeat any example content, even partially

  17. [21]

    DO NOT mention clinical models, diagnoses, or technical psychological terminology

  18. [22]

    Write strictly from the client’s perspective, using everyday language

  19. [23]

    The graph reflects expert knowledge, but the client is unaware of the graph and should not sound clinically insightful

  20. [24]

    Infer content from the graph implicitly, not by naming nodes or edges

  21. [25]

    No profile may resemble another in tone, life stage, or narrative arc

  22. [26]

    Output must be valid JSON only, with no surrounding text or commentary

  23. [27]

    ## Task Infer 10 diverse client intake forms based on a client graph

    Any violation of formatting or repetition invalidates the output. ## Task Infer 10 diverse client intake forms based on a client graph. The **client graph** is given as a list of nodes representing recurring psychological and behavioral patterns of the client, and edges representing connections between them. **Each client intake form must include the foll...

  24. [28]

    Basic Information - name, age, gender, occupation, education, marital status, family details

  25. [29]

    Presenting Problem - What symptoms do you want to discuss? - When did the problem/symptoms start? - What was the stress level when the problem/symptoms first occurred? (What do you think might be the cause?) - How has the problem/symptoms progressed? (Changes over time, aggravating factors, alleviating factors, etc.) - Currently, in what situations, how o...

  26. [30]

    Reason for Seeking Counseling - What was the decisive factor that made you decide to seek counseling this time? (If the problem has been long standing, what made you decide to seek counseling now?)

  27. [31]

    Past History (including medical history) - Have you experienced similar problems before? Under what circumstances or stress did the problem occur, and what were the patterns? How did you cope? - Have you received treatment/counseling for other psychological problems/symptoms? (When, for how long, any medication use, reasons for stopping - improved? stoppe...

  28. [32]

    Academic/occupational functioning level (attendance, grades/job performance, etc.) - Interpersonal relationships - Daily life (including sleep, eating, self-care, etc.) - Social Support System

  29. [33]

    profile":

    Is there anyone you can talk to or get help from when you encounter difficulties or problems? ## Output Format - Output only valid JSON - Do not include any explanation or comments. Just output the profiles. ## Example Output The following is an example output. Do not copy any profiles directly. [ {{"profile": "Example profile number 1"}}, {{"profile": "E...

  30. [34]

    {Not used in Profile}

    The dialogue must be consistent with the client intake form {not used in CPG} and client graph. {Not used in Profile}

  31. [35]

    {Not used in Profile}

    Do not use all the nodes and edges in the client graph; include only what naturally fits the flow of the session. {Not used in Profile}

  32. [36]

    mm-hm",

    Use natural conversational signals whenever appropriate (e.g., "mm-hm", "um", "yeah","right","...")

  33. [37]

    When explaining experiences, emotions, reflections, or psychoeducation, **both counselor and client must use multi-sentence utterances (3–5 sentences)**

  34. [38]

    Do **not** advance to new topics or conclusions in consecutive turns

    The session should progress through ideas gradually. Do **not** advance to new topics or conclusions in consecutive turns. Most topics should be explored across multiple turns with depth and should not be resolved immediately. Figure 6: Global constraints for counseling dialogue generation. 26 Counselor guidelines for counseling dialogue generation

  35. [39]

    For counselor turns, encourage natural elaboration rather than brevity. In each counselor utterance, explicitly use at least one counseling technique, such as reflection, open-ended questioning, summarizing, or gentle reframing, without sounding mechanical or repetitive

  36. [40]

    Do not dismiss the client’s experience

    Maintain a nonjudgmental, collaborative stance; avoid jumping to conclusions or positioning yourself as the authority. Do not dismiss the client’s experience

  37. [41]

    take your time

    The counselor should support the client in examining, questioning, and reshaping their own thoughts and experiences at their own pace through acknowledging pauses, hesitation, or silence (e.g., “take your time”, “we can sit with that for a moment”)

  38. [42]

    The dialogue should not feel like an interview**

    **The counselor must not end every utterance with a question. The dialogue should not feel like an interview**

  39. [43]

    Rather they should build towards the information they introduce

    The counselor should not introduce new information randomly. Rather they should build towards the information they introduce

  40. [44]

    The counselor should encourage the client to apply concepts to their real life, specific scenarios and/or review past week and upcoming week assignments, focusing on specific ways to connect session content with real-life applications

  41. [45]

    does this sound useful to you?

    The counselor should prioritize understanding, emotional safety, and rapport before offering interventions or insights. When appropriate, the counselor should check in to ensure shared understanding (e.g., “does this sound useful to you?”, “does this make sense?”, “sounds like you’re going through [client’s issue] — is that right?”). **These check-ins s...

  42. [46]

    I want to check in about something, just to make sure I understand how you’re doing

    The counselor should do assessment/follow-up on client comments that could be indicative of a larger issue (e.g., hopelessness = assess for suicidality, weight loss = assess for eating disorder/appetite changes, difficult relationship = assess for safety at home, etc). The counselor should frame these questions as curiosity and care, not assumptions (e....

  43. [47]

    Psychoeducation should be preceded by a brief reflection or summary that connects it directly to what the client just shared

    The counselor should offer psychoeducation when it directly supports the client’s understanding or client expresses misunderstanding of treatment concepts. Psychoeducation should be preceded by a brief reflection or summary that connects it directly to what the client just shared. The counselor should use clear, everyday language for psychoeducation and c...

  44. [48]

    The counselor must respect pacing and readiness; invite exploration without rushing

  45. [49]

    **Repeating exact phrasing is disallowed**; repeating therapeutic functions (e.g., reflection, validation) using varied language is expected

  46. [50]

    It shouldn’t be the other way around where the counselor asks the client if some techniques comes to mind

    **The counselor should be the one to suggest psychological techniques to the client**. It shouldn’t be the other way around where the counselor asks the client if some techniques comes to mind. It is fine for the counselor to ask the client if they have tried anything already.

    Figure 7: Counselor guidelines (designed with direct input from clinicians) for...
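Two of the counselor guidelines above are mechanical enough to screen for automatically: the counselor must not end every utterance with a question, and exact repeated phrasing is disallowed. The sketch below is only an illustration under stated assumptions. The function name `check_counselor_turns`, the `(speaker, text)` tuple representation, and the speaker label `"counselor"` are hypothetical and not taken from the paper, and the repetition check compares whole utterances only, which is a much cruder test than phrase-level repetition.

```python
from collections import Counter


def check_counselor_turns(turns: list[tuple[str, str]]) -> list[str]:
    """Return human-readable warnings for a dialogue given as (speaker, text) pairs.

    Hypothetical heuristic screen; an empty list means no warning was raised.
    """
    counselor = [text.strip() for speaker, text in turns if speaker == "counselor"]
    problems = []

    # Guideline: the counselor must not end every utterance with a question.
    if counselor and all(text.endswith("?") for text in counselor):
        problems.append("every counselor utterance ends with a question")

    # Guideline: repeating exact phrasing is disallowed (whole-utterance check only).
    counts = Counter(text.lower() for text in counselor)
    repeated = [text for text, n in counts.items() if n > 1]
    if repeated:
        problems.append(f"{len(repeated)} counselor utterance(s) repeated verbatim")

    return problems


if __name__ == "__main__":
    dialogue = [
        ("counselor", "What brings you in today?"),
        ("client", "Work has been a lot, I guess."),
        ("counselor", "That sounds heavy. Take your time."),
    ]
    print(check_counselor_turns(dialogue))
```

A screen like this can only flag surface violations; guidelines about pacing, rapport, or nonjudgmental stance still need human or model-based review.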

  47. [51]

    {Not used in Profile}

    The client graph influences responses implicitly and must not be named directly. {Not used in Profile}

  48. [52]

    attachment issues,

    The client should express their experiences in everyday, non-clinical language and should not self-diagnose or use professional terminology (e.g., “attachment issues,” “cognitive distortions”) as a trained counselor would. The client may, however, use informal or popular mental health terms commonly encountered on social media, as well as terms that have ...

  49. [53]

    The client should not provide detailed concrete descriptions or mini-narratives unless the counselor explicitly asks for elaboration or invites reflection

    Early in the session, the client should respond with **brief, surface-level descriptions** of emotions and experiences. The client should not provide detailed concrete descriptions or mini-narratives unless the counselor explicitly asks for elaboration or invites reflection. Detailed emotional descriptions and mini-narratives (what happened, what was not...

  50. [54]

    I hear what you’re saying, but at the same time it doesn’t really feel true for me

    **The client can express ambivalence, confusion, or difficulty naming emotions when appropriate**. Ambivalence should be expressed as simultaneous pull in opposing directions, not passive agreement (e.g., “I hear what you’re saying, but at the same time it doesn’t really feel true for me.”)

  51. [55]

    I’m not really sure what that means

    When the counselor offers an interpretation, suggestion, coping strategy, or reframing, the client must first respond with at least one of the following before any agreement: Confusion (“I’m not really sure what that means...”), Skepticism (“I don’t see how that would help...”), Partial resistance (“I get what you’re saying, but...”), Difficulty ...

  52. [56]

    speaker":

    A structured rationale explaining why that utterance was generated at that moment, grounded in: - The current dialogue history, - Appropriate counseling techniques for counselor turns, - The client intake form {Not used in CPG} and client graph {Not used in Profile} for client turns. The rationale should justify intent and alignment, not reveal hidden interna...

  53. [59]

    If there is any deficiency, no matter how minor, assign a score of 4 or lower

    Assign a score based on the criteria, grading very strictly and uptight. If there is any deficiency, no matter how minor, assign a score of 4 or lower

  54. [60]

    Do not add any prefix

    Output the score and the explanation, separated by a comma. Do not add any prefix.
    Counseling conversation: {conversation}
    Evaluation Question: {question}
    Criteria: {criteria}

    Figure 34: Prompt used to evaluate the generated counseling sessions on CTRS.

    WAI LLM-as-a-judge Evaluation Prompt

    The following is a psychological counseling session between a cou...

  55. [61]

    Read the counseling session transcript carefully

  56. [62]

    Review the evaluation questions and criteria provided below

  57. [63]

    Assign a score based on the criteria, grading very strictly

  58. [64]

    utterance 1

    Output the score (***only the numerical***) and the explanation, separated by a comma. ***Do not add any prefix.***
    Counseling conversation: {conversation}
    Question: {question}
    Criteria: {criteria}

    Figure 35: Prompt used to evaluate the generated counseling sessions on WAI.

    Client Issue Extraction Prompt for SQPsychConv

    Extract the presenting problem ...
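Both judge prompts (CTRS, Figure 34, and WAI, Figure 35) end with the same three placeholders and ask for a numeric score and an explanation separated by a comma. A minimal sketch of filling such a template and parsing the reply is given below. It is not the authors' evaluation code: `PROMPT_TAIL` reproduces only the closing lines of the CTRS prompt quoted above, and the names `build_prompt` and `parse_judgement` are hypothetical. No score range is enforced because the excerpt does not state one.

```python
# Closing lines of the CTRS judge prompt as quoted above (the full prompt is longer).
PROMPT_TAIL = (
    "Output the score and the explanation, separated by a comma. "
    "Do not add any prefix.\n"
    "Counseling conversation: {conversation}\n"
    "Evaluation Question: {question}\n"
    "Criteria: {criteria}"
)


def build_prompt(conversation: str, question: str, criteria: str) -> str:
    """Fill the three placeholders of the prompt tail (hypothetical helper)."""
    return PROMPT_TAIL.format(
        conversation=conversation, question=question, criteria=criteria
    )


def parse_judgement(reply: str) -> tuple[float, str]:
    """Split a '<score>, <explanation>' reply at the first comma.

    Raises ValueError if there is no comma or the score is not numeric,
    for example when the judge adds a prefix such as 'Score: 4'.
    """
    score_text, sep, explanation = reply.strip().partition(",")
    if not sep:
        raise ValueError("expected '<score>, <explanation>'")
    return float(score_text.strip()), explanation.strip()


if __name__ == "__main__":
    print(build_prompt("Counselor: ... Client: ...", "Was an agenda set?", "See rubric"))
    print(parse_judgement("4, The agenda was set but not collaboratively."))
```

Splitting at the first comma only keeps any commas inside the explanation intact, and letting `float` fail on a prefixed score surfaces replies that ignore the "do not add any prefix" instruction instead of silently mis-scoring them.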