
arxiv: 2605.30273 · v1 · pith:UWPPTDQAnew · submitted 2026-05-28 · 💻 cs.HC · cs.AI · cs.CL · cs.CY · cs.SI

LLUMI: Improving LLM Writing Assistance for Mental Health Support with Online Community Feedback

Pith reviewed 2026-06-29 05:22 UTC · model grok-4.3

classification 💻 cs.HC · cs.AI · cs.CL · cs.CY · cs.SI
keywords LLM fine-tuning · mental health support · preference optimization · community feedback · open-source models · privacy preservation · SFT · DPO

The pith

Small open-source models perform comparably to GPT models on mental health responses when trained on preference pairs built from Reddit community upvotes and downvotes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces LLUMI, a two-part system with a generation model that drafts replies to mental health queries and an improvement model that revises them. Feedback from Reddit mental health communities supplies the training signal: upvotes and downvotes create chosen-rejected pairs that drive supervised fine-tuning and direct preference optimization on smaller open-source models. The resulting system stays runnable inside protected environments rather than relying on external cloud APIs. Across automated linguistic checks and human ratings on readability, empathy, connection, actionability, and safety, LLUMI performs at levels comparable to larger proprietary models.

Core claim

LLUMI consists of a generation model that drafts supportive responses to mental health queries and an improvement model that revises an initial human-crafted response; both are aligned by constructing chosen-rejected pairs from community endorsement patterns such as upvotes and downvotes for SFT and DPO, yielding performance comparable to GPT models across linguistic analyses and human evaluations while remaining deployable in-house.
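As a rough illustration of the preference-optimization step named above, the standard DPO objective can be computed for a single chosen-rejected pair from sequence log-probabilities. This is a minimal sketch, not the paper's training code: the function name, the argument names, and the value `beta=0.1` are assumptions, since the abstract does not report hyperparameters.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are summed token log-probabilities of each response under the
    trainable policy and under the frozen reference model. `beta` is an
    assumed value, not one taken from the paper.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the reference model.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy equals the reference model the margin is zero and the loss is log 2; the loss falls as the policy shifts probability toward the community-preferred response.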

What carries the argument

Construction of chosen-rejected response pairs from Reddit upvotes and downvotes to drive SFT and DPO on open-source generation and improvement models.
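One plausible way to build such pairs is sketched below. It is an assumption-laden illustration rather than the authors' procedure: the dictionary keys (`text`, `score`), the function name, and the `min_gap` threshold are all hypothetical, and the example comments are invented.

```python
from itertools import combinations

def build_preference_pairs(post, comments, min_gap=5):
    """Turn vote-scored comments on one post into chosen-rejected pairs.

    `comments` is a list of dicts with hypothetical keys "text" and "score"
    (net upvotes minus downvotes). `min_gap` is an assumed threshold that
    skips pairs whose scores are too close to express a clear preference.
    """
    pairs = []
    for a, b in combinations(comments, 2):
        if abs(a["score"] - b["score"]) < min_gap:
            continue
        # The higher-scored comment becomes "chosen", the other "rejected".
        chosen, rejected = (a, b) if a["score"] > b["score"] else (b, a)
        pairs.append({"prompt": post,
                      "chosen": chosen["text"],
                      "rejected": rejected["text"]})
    return pairs

# Invented example data, for illustration only.
example_comments = [
    {"text": "That sounds exhausting. Small routines helped me.", "score": 42},
    {"text": "Just get over it.", "score": -7},
    {"text": "Have you tried talking to someone you trust?", "score": 40},
]
pairs = build_preference_pairs("I can't sleep and feel alone.", example_comments)
```

The gap threshold reflects a design choice: two comments with nearly equal scores carry little preference signal, so pairing them would mostly add noise.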

If this is right

  • Smaller open-source models become practical for sensitive mental health assistance without sending user data to external providers.
  • Community endorsement patterns can replace expert-labeled data for aligning models on empathy, actionability, and safety.
  • The dual-model design allows an initial draft to be refined using the same preference signals.
  • Human evaluations on five explicit dimensions confirm the alignment holds under direct scrutiny.
  • The entire pipeline can run locally, reducing privacy and governance risks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same community-signal method could transfer to other sensitive domains such as crisis text lines or peer support for chronic illness.
  • Local retraining on fresh community votes could let the system track shifting norms faster than static expert datasets.
  • Organizations with strict data residency rules could adopt the approach without new infrastructure for cloud compliance.
  • Extending the improvement model to incorporate live community threads might create ongoing, low-cost adaptation loops.

Load-bearing premise

Upvotes and downvotes in Reddit mental health communities serve as reliable proxies for response quality, empathy, and safety.

What would settle it

A side-by-side rating by licensed mental health clinicians showing no reliable difference in empathy or safety between community-chosen and community-rejected responses.
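A comparison of this kind could be analyzed with a paired sign-flip permutation test on clinician ratings. The sketch below is one possible analysis, not something the paper describes; the function name, the number of resamples, and the fixed seed are assumptions.

```python
import random

def paired_permutation_test(chosen_scores, rejected_scores,
                            n_resamples=10000, seed=0):
    """Two-sided sign-flip permutation test on paired rating differences.

    Each position holds one clinician rating for the community-chosen
    response and one for the community-rejected response to the same post.
    """
    diffs = [c - r for c, r in zip(chosen_scores, rejected_scores)]
    observed = abs(sum(diffs) / len(diffs))
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the two labels are exchangeable, so each
        # paired difference is equally likely to carry either sign.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_resamples + 1)
```

A large p-value on empathy or safety ratings would be the "no reliable difference" outcome described above; a small one would support the vote-based proxy.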


Large language models (LLMs) show promise in generating supportive responses for mental health queries, but improving their usefulness, empathy, and safety often requires substantial compute, expert input, and labeled data. At the same time, deploying proprietary, cloud-based models for mental health-related interactions raises important privacy and data-governance concerns, given the sensitivity of these interactions. To address this challenge, we introduce LLUMI, a setup that can be hosted in-house within protected environments. LLUMI consists of two complementary components: a generation model (GM), which drafts supportive responses to mental health queries, and an improvement model (IM), which revises an initial human-crafted response. We leverage feedback signals from Reddit mental health communities, using community endorsement patterns such as upvotes and downvotes to construct chosen-rejected response pairs for Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO). We further align LLUMI using human evaluation across five dimensions: readability, empathy, connection, actionability, and safety. Our results show that, despite relying on smaller open-source models rather than proprietary cloud-based GPT models, LLUMI achieves comparable performance across linguistic analyses and human evaluations. These findings suggest that open-source models, when trained with community-derived preference signals, can support high-quality mental health support assistance while offering a more privacy-preserving alternative for sensitive support contexts.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces LLUMI, a two-component system (generation model GM and improvement model IM) for drafting and revising supportive responses to mental health queries. It constructs chosen-rejected pairs from Reddit mental-health subreddit upvotes/downvotes for SFT and DPO training of smaller open-source LLMs, then aligns further via human evaluation on five dimensions (readability, empathy, connection, actionability, safety). The central claim is that these models achieve performance comparable to proprietary GPT-scale systems on linguistic analyses and human evaluations while enabling in-house, privacy-preserving deployment.

Significance. If the results hold, the work shows that community-derived preference signals can substitute for expert-labeled data and proprietary models in a high-stakes domain, offering a reproducible path to effective, locally hosted mental-health assistance tools. The use of open-source backbones plus explicit human evaluation on safety and empathy dimensions is a concrete strength.

major comments (3)
  1. [Methods (data construction)] Methods / data-construction section: chosen-rejected pairs are formed directly from subreddit upvotes/downvotes without any reported correlation analysis or pilot study showing that these signals align with the five human-evaluation dimensions (empathy, safety, etc.). Because both the DPO objective and the downstream comparability claim rest on this proxy, the absence of validation is load-bearing.
  2. [Results] Results section: the abstract asserts 'comparable performance' to GPT models, yet the provided text supplies no quantitative tables, exact metric values, confidence intervals, or statistical tests comparing LLUMI variants against GPT baselines on the linguistic or human-evaluation axes. Without these numbers the central claim cannot be assessed.
  3. [Human evaluation] Human-evaluation protocol: the five dimensions are used both for final alignment and for claiming parity, but no inter-rater reliability statistics, rater qualifications, or blinding procedure are described. This directly affects the credibility of the safety and empathy scores that underwrite the main result.
minor comments (2)
  1. [Abstract] Abstract states outcomes but omits all dataset sizes, model parameter counts, and ablation details; these should appear in the abstract or a dedicated table for reproducibility.
  2. [Overview] Notation for GM and IM is introduced without an explicit diagram or pseudocode; a figure showing the two-stage inference flow would improve clarity.
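The inference flow that the second minor comment asks about might look roughly like the sketch below, which is based only on the abstract's description of the two components. The prompt templates, function names, and the `Model` type are hypothetical; a real deployment would wrap a locally hosted open-source LLM rather than a plain callable.

```python
from typing import Callable

# A "model" here is any callable mapping a prompt string to generated text.
# This type alias is an assumption made for illustration.
Model = Callable[[str], str]

def draft_reply(gm: Model, post: str) -> str:
    """GM path: draft a supportive reply directly from the help-seeking post."""
    return gm(f"Post: {post}\nReply:")

def improve_reply(im: Model, post: str, human_draft: str) -> str:
    """IM path: revise a human-written draft in the context of the post."""
    prompt = f"Post: {post}\nOriginal comment: {human_draft}\nImproved comment:"
    return im(prompt)

# Stub models standing in for fine-tuned checkpoints.
def echo_gm(prompt: str) -> str:
    return "[GM draft for] " + prompt

def echo_im(prompt: str) -> str:
    return "[IM revision for] " + prompt

draft = draft_reply(echo_gm, "I feel overwhelmed at work.")
revised = improve_reply(echo_im, "I feel overwhelmed at work.", "Hang in there.")
```

Keeping the two paths as separate functions mirrors the abstract, where the GM starts from the query alone and the IM starts from an existing human-crafted response.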

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments highlight important areas for improving methodological transparency, results presentation, and evaluation rigor. We address each point below and commit to revisions that strengthen the manuscript without altering its core claims.

  1. Referee: Methods / data-construction section: chosen-rejected pairs are formed directly from subreddit upvotes/downvotes without any reported correlation analysis or pilot study showing that these signals align with the five human-evaluation dimensions (empathy, safety, etc.). Because both the DPO objective and the downstream comparability claim rest on this proxy, the absence of validation is load-bearing.

    Authors: We agree that explicit validation of the proxy would add rigor. Community votes are used as an implicit preference signal following established practices in preference optimization; the subsequent human evaluation on the five dimensions serves as the primary validation. In revision we will add a pilot analysis correlating vote polarity with dimension scores on a held-out sample and expand the methods discussion of the proxy's rationale. revision: yes

  2. Referee: Results section: the abstract asserts 'comparable performance' to GPT models, yet the provided text supplies no quantitative tables, exact metric values, confidence intervals, or statistical tests comparing LLUMI variants against GPT baselines on the linguistic or human-evaluation axes. Without these numbers the central claim cannot be assessed.

    Authors: The full results section contains comparative figures; however, we accept that tabular presentation with exact values, confidence intervals, and statistical tests (e.g., paired tests with p-values) is needed for clarity. We will insert comprehensive tables and report the requested statistics in the revised manuscript. revision: yes

  3. Referee: Human-evaluation protocol: the five dimensions are used both for final alignment and for claiming parity, but no inter-rater reliability statistics, rater qualifications, or blinding procedure are described. This directly affects the credibility of the safety and empathy scores that underwrite the main result.

    Authors: We will expand the human-evaluation subsection to report inter-rater reliability (Krippendorff's alpha), rater qualifications and training, and explicit blinding procedures. These details were collected during the study and will be added to the revision. revision: yes
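The inter-rater statistic named in the third response can be computed as sketched below. This is a generic implementation of Krippendorff's alpha, not the authors' analysis code; treating the ratings as interval data (squared-difference distance) is an assumption, and an ordinal or nominal distance could be passed in instead.

```python
from itertools import permutations

def krippendorff_alpha(units, distance=lambda a, b: (a - b) ** 2):
    """Krippendorff's alpha for `units`, a list of per-item rating lists.

    Each inner list holds the ratings that different raters gave one item.
    Items with fewer than two ratings are not pairable and are skipped.
    The default squared-difference distance assumes interval-scaled ratings.
    """
    units = [u for u in units if len(u) >= 2]
    values = [v for u in units for v in u]
    n = len(values)
    # Observed disagreement: ordered within-item pairs, weighted by 1/(m_u - 1).
    d_obs = sum(
        sum(distance(a, b) for a, b in permutations(u, 2)) / (len(u) - 1)
        for u in units
    ) / n
    # Expected disagreement: ordered pairs of all values, ignoring items.
    d_exp = sum(distance(a, b) for a, b in permutations(values, 2)) / (n * (n - 1))
    if d_exp == 0:
        raise ValueError("alpha is undefined when all ratings are identical")
    return 1.0 - d_obs / d_exp
```

Values near 1 indicate strong agreement, values near 0 indicate agreement no better than chance, and negative values indicate systematic disagreement between raters.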

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external signals and separate evaluation


The paper constructs training pairs from Reddit upvotes/downvotes and reports performance via separate human evaluation on five dimensions plus linguistic analyses. No equations, fitted parameters, or self-citations are shown that reduce the comparability claim to the inputs by construction. The central result is an empirical outcome measured against external benchmarks rather than a self-referential definition or renamed fit. This matches the default expectation of non-circularity for papers whose claims rest on observable external data.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Review performed on abstract only; ledger entries are therefore provisional and limited to elements explicitly named in the abstract.

axioms (1)
  • domain assumption: Community upvotes and downvotes reliably indicate higher-quality supportive responses suitable for preference learning.
    Used to construct chosen-rejected pairs for SFT and DPO.

pith-pipeline@v0.9.1-grok · 5806 in / 1045 out tokens · 21216 ms · 2026-06-29T05:22:13.245764+00:00 · methodology

