arxiv: 2604.17871 · v1 · submitted 2026-04-20 · 💻 cs.HC

Design and Evaluation of a Culturally Adapted Multimodal Virtual Agent for PTSD Screening

Pith reviewed 2026-05-10 04:30 UTC · model grok-4.3

classification 💻 cs.HC
keywords PTSD screening · virtual agent · multimodal interaction · conversational AI · cultural adaptation · military healthcare · PCL-5 · human-AI collaboration

The pith

Molhim is a feasible culturally adapted multimodal AI platform for PTSD screening in military healthcare contexts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Molhim, a conversational AI system in which a high-fidelity virtual avatar conducts structured dialogues for PTSD screening. It integrates a large language model, speech recognition, visual input processing, and text-to-speech to administer the PCL-5 questionnaire in real time. The work examines this configuration in a military healthcare context and concludes that such systems can support clinical screening, while highlighting design considerations for socially cooperative human-AI interaction. This matters because PTSD is underreported among combat-exposed personnel, and AI platforms could offer accessible, private screening options.

Core claim

The authors present Molhim as a configurable platform that supports PTSD screening through a pipeline of session setup, real-time multimodal dialogue with a virtual avatar, and automated post-session analysis. They demonstrate its use in administering the PTSD Checklist for DSM-5 and suggest its feasibility for use in military healthcare environments.
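To make the PCL-5 step concrete, here is a minimal scoring sketch. It is not taken from the paper. It relies on the published structure of the PCL-5 (20 items rated 0 to 4, grouped into DSM-5 clusters B, C, D, and E) and the commonly described provisional rule that counts an item rated 2 or higher as endorsed. The function name `score_pcl5`, the `Pcl5Result` type, and the default `cutoff=33` are illustrative assumptions; the appropriate cutoff depends on the population and the purpose of screening.

```python
from dataclasses import dataclass

# DSM-5 symptom clusters as 1-indexed PCL-5 item ranges.
CLUSTERS = {"B": range(1, 6), "C": range(6, 8), "D": range(8, 15), "E": range(15, 21)}
# Minimum number of endorsed items (rated 2 or higher) per cluster.
REQUIRED = {"B": 1, "C": 1, "D": 2, "E": 2}


@dataclass
class Pcl5Result:
    total: int                   # sum of all 20 items, 0-80
    cluster_scores: dict         # per-cluster severity sums
    meets_cluster_pattern: bool  # provisional DSM-5 symptom pattern
    above_cutoff: bool           # total >= cutoff (cutoff is an assumption)


def score_pcl5(responses, cutoff=33):
    """Score 20 PCL-5 item ratings, each an integer from 0 to 4."""
    if len(responses) != 20:
        raise ValueError("PCL-5 has 20 items")
    if any(r not in (0, 1, 2, 3, 4) for r in responses):
        raise ValueError("each item is rated 0-4")
    cluster_scores = {
        name: sum(responses[i - 1] for i in items) for name, items in CLUSTERS.items()
    }
    endorsed = {
        name: sum(1 for i in items if responses[i - 1] >= 2)
        for name, items in CLUSTERS.items()
    }
    pattern = all(endorsed[name] >= REQUIRED[name] for name in CLUSTERS)
    total = sum(responses)
    return Pcl5Result(total, cluster_scores, pattern, total >= cutoff)


result = score_pcl5([2] * 20)
print(result.total, result.cluster_scores, result.meets_cluster_pattern)
```

A screening result of this kind is a flag for clinical follow-up, not a diagnosis.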

What carries the argument

Molhim's conversational pipeline, which combines a large language model-driven avatar with real-time speech recognition, visual understanding, text-to-speech, and post-session feedback for structured PTSD assessment.
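The three-stage pipeline can be sketched as a small session loop. This is an illustrative outline only, not the authors' implementation: every name here (`Stage`, `Session`, `run_session`, and the `transcribe`, `describe_frame`, `generate_reply`, `speak`, and `analyze` callables) is hypothetical, and the speech, vision, language-model, and text-to-speech components are replaced by plain Python stubs.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List


class Stage(Enum):
    SETUP = auto()
    DIALOGUE = auto()
    ANALYSIS = auto()
    DONE = auto()


@dataclass
class Session:
    questions: List[str]
    stage: Stage = Stage.SETUP
    transcript: List[tuple] = field(default_factory=list)
    summary: str = ""


def run_session(session, transcribe, describe_frame, generate_reply, speak, analyze):
    """Drive one session through setup, dialogue, and post-session analysis."""
    session.stage = Stage.DIALOGUE
    for question in session.questions:
        speak(question)                    # avatar asks via text-to-speech
        user_text = transcribe()           # speech recognition result
        visual_note = describe_frame()     # visual cue from the camera frame
        speak(generate_reply(question, user_text, visual_note))
        session.transcript.append((question, user_text, visual_note))
    session.stage = Stage.ANALYSIS
    session.summary = analyze(session.transcript)
    session.stage = Stage.DONE
    return session


# Stubbed components stand in for real ASR, VLM, LLM, and TTS services.
answers = iter(["not at all", "a little bit"])
spoken = []
session = run_session(
    Session(questions=["Question 1", "Question 2"]),
    transcribe=lambda: next(answers),
    describe_frame=lambda: "neutral expression",
    generate_reply=lambda q, a, v: "Thank you.",
    speak=spoken.append,
    analyze=lambda transcript: f"{len(transcript)} turns recorded",
)
print(session.stage, session.summary)
```

Passing the components in as callables keeps the loop testable without any live speech or model service, which is one plausible way to make such a pipeline configurable.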

If this is right

  • Supports real-time multi-turn conversations for accurate administration of standardized screening tools like the PCL-5.
  • Provides automated analysis and feedback after sessions to aid clinical decision-making.
  • Highlights the importance of cultural adaptations in virtual agents for clinical applications.
  • Offers a model for socially cooperative human-AI systems in sensitive healthcare settings.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Such systems could lower barriers to mental health screening by offering anonymity and reducing stigma.
  • Future work might test the accuracy of AI-generated assessments against clinician judgments in larger trials.
  • Adaptations of the platform could extend to screening for other conditions or in different cultural groups.

Load-bearing premise

The multimodal features and cultural adaptations will enable the AI to reliably detect or screen for PTSD without significant errors in interpretation or user engagement.

What would settle it

A study that compares the PTSD screening results and user experience from Molhim sessions against gold-standard clinical interviews in a sample of combat-exposed military personnel.
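The core analysis of such a study would reduce to a confusion matrix between the screening outcome and the reference-standard diagnosis. The sketch below shows that calculation with standard definitions of sensitivity, specificity, and accuracy. The function name `screening_metrics` and the sample data are illustrative assumptions, not results from the paper.

```python
def screening_metrics(screen_positive, reference_positive):
    """Compare binary screening outcomes with a reference-standard diagnosis."""
    if len(screen_positive) != len(reference_positive):
        raise ValueError("both lists must cover the same participants")
    pairs = list(zip(screen_positive, reference_positive))
    tp = sum(1 for s, r in pairs if s and r)          # true positives
    fp = sum(1 for s, r in pairs if s and not r)      # false positives
    fn = sum(1 for s, r in pairs if not s and r)      # false negatives
    tn = sum(1 for s, r in pairs if not s and not r)  # true negatives

    def ratio(numerator, denominator):
        # None signals an undefined metric (empty denominator).
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


# Made-up example data for five participants.
screen = [True, True, False, False, True]
reference = [True, False, False, False, True]
metrics = screening_metrics(screen, reference)
print(metrics)
```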

Figures

Figures reproduced from arXiv: 2604.17871 by Abdulrhman Aljouie, Bdour Alwuqaysi, Cengiz Ozel, Ehsan Hoque, Rahaf Fahad Alnufaie, Rakan Altasan, Sabri Boughorbel, Samuel Potter, Waleed Nadeem, Wejdan Alotaibi, Yahya Bokhari.

Figure 1: Molhim conversational interface. Users interact ...
Figure 2: System architecture of Molhim. User speech is transcribed by ASR and visual cues are extracted by a VLM. A flow ...
Figure 3: Example post-session results page. The interface ...
Figure 4: State-machine structure of the Molhim PTSD screen...

Original abstract

Post-traumatic stress disorder (PTSD) is highly prevalent yet chronically underreported among combat-exposed military personnel. This paper presents Molhim, a culturally adapted multimodal conversational AI platform that supports purpose-specific interactions through a configurable conversational pipeline consisting of session setup, real-time dialogue with a high-fidelity virtual avatar, and post-session analysis and feedback. In this work, we examine the PTSD screening configuration of the Molhim platform in a military healthcare context. The system employs a conversational avatar driven by a large language model, integrating real-time speech recognition, visual understanding of user input, text-to-speech synthesis, and a high-fidelity human avatar to support structured multi-turn dialogue and automated post-session analysis, including administration of the PTSD Checklist for DSM-5 (PCL-5). These findings suggest the feasibility of Molhim as a conversational platform for PTSD screening and highlight design considerations for socially cooperative human-AI systems in clinical environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript presents Molhim, a culturally adapted multimodal conversational AI platform for PTSD screening in military healthcare contexts. It describes a configurable pipeline including session setup, real-time dialogue via an LLM-driven high-fidelity virtual avatar with speech recognition, visual understanding, text-to-speech synthesis, structured administration of the PCL-5, and automated post-session analysis and feedback. The authors conclude that the described system suggests feasibility for PTSD screening and highlights design considerations for socially cooperative human-AI systems in clinical environments.

Significance. If supported by empirical validation, the work could advance accessible, culturally sensitive AI tools for mental health screening in high-stigma settings such as military healthcare by combining multimodal interaction with validated clinical instruments. The architectural details of the configurable pipeline offer a potentially reusable template for clinical conversational agents. However, the complete absence of any evaluation data means the claimed feasibility remains untested and the practical significance is not yet established.

major comments (2)
  1. [Abstract] The statement 'These findings suggest the feasibility of Molhim as a conversational platform for PTSD screening' is unsupported. The manuscript supplies no participant cohort details, quantitative metrics (accuracy, sensitivity, specificity, inter-rater agreement), qualitative user-study outcomes, or comparison baselines to substantiate feasibility or reliability of the PTSD screening configuration.
  2. [Manuscript body] The manuscript as a whole: No tables, figures, or dedicated evaluation section report any performance data, error analysis, or validation results against the PCL-5 or other instruments, leaving the central feasibility claim without evidentiary basis.
minor comments (1)
  1. [Abstract] The phrase 'these findings' is used without any preceding or subsequent reference to specific results or data within the paper.
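As an illustration of the inter-rater agreement metric named in the first major comment, agreement between two sets of categorical ratings (for example, avatar-administered versus clinician-administered screening outcomes) is often summarized with Cohen's kappa. The sketch below implements the standard formula. The function name `cohens_kappa` and the sample ratings are assumptions for illustration and do not come from the manuscript.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("ratings must be non-empty and of equal length")
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Made-up binary screening outcomes from two administrations.
kappa = cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])
print(kappa)
```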

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful review and for identifying the mismatch between our feasibility claims and the evidentiary content of the manuscript. We agree that the current version is a system-description paper and does not contain participant data, performance metrics, or a validation study. We will revise the text to remove unsupported claims, clarify the scope, and add an explicit limitations and future-work section.

point-by-point responses
  1. Referee: [Abstract] The statement 'These findings suggest the feasibility of Molhim as a conversational platform for PTSD screening' is unsupported. The manuscript supplies no participant cohort details, quantitative metrics (accuracy, sensitivity, specificity, inter-rater agreement), qualitative user-study outcomes, or comparison baselines to substantiate feasibility or reliability of the PTSD screening configuration.

    Authors: We agree. The phrase 'these findings' was imprecise and implied empirical results that are not present. The manuscript reports only the successful technical integration of the configurable pipeline, avatar, speech components, and PCL-5 administration logic. We will replace the sentence with a statement that the implemented system demonstrates technical and architectural feasibility for structured PTSD screening dialogues, and we will add a sentence noting that clinical validation remains future work. revision: yes

  2. Referee: [Manuscript body] The manuscript as a whole: No tables, figures, or dedicated evaluation section report any performance data, error analysis, or validation results against the PCL-5 or other instruments, leaving the central feasibility claim without evidentiary basis.

    Authors: This assessment is correct. The paper focuses on the design of the multimodal pipeline, cultural adaptation choices, and integration of the virtual avatar rather than on an empirical evaluation. We will insert a new 'Limitations and Future Directions' section that (1) states the absence of user studies or quantitative validation in the present work, (2) describes the planned evaluation protocol (participant cohort, metrics, and comparison with standard PCL-5 administration), and (3) removes any language that suggests current feasibility has been empirically demonstrated. revision: yes

Circularity Check

0 steps flagged

No significant circularity; paper is a descriptive system design without derivations or predictions.

full rationale

The manuscript describes the Molhim platform's architecture, including conversational pipeline, avatar, and PCL-5 administration, but contains no mathematical equations, fitted parameters, or predictive models. The central claim of feasibility is presented as a suggestion based on the design rather than derived from data or prior self-citations in a circular manner. No load-bearing steps reduce to inputs by construction, making the derivation chain self-contained as a non-mathematical design paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The feasibility suggestion rests on the assumption that the described multimodal pipeline functions as intended in clinical use, with the new platform as the main addition.

axioms (1)
  • [domain assumption] The PTSD Checklist for DSM-5 (PCL-5) can be validly administered and interpreted through automated conversational analysis
    Invoked in the post-session analysis description.
invented entities (1)
  • Molhim platform [no independent evidence]
    purpose: Configurable multimodal conversational AI with high-fidelity avatar for culturally adapted PTSD screening
    New system introduced and examined in the paper.

pith-pipeline@v0.9.0 · 5514 in / 1257 out tokens · 49428 ms · 2026-05-10T04:30:50.473330+00:00 · methodology

