SocialLM: Social Signal Processing of Patient-Provider Communication using LLMs and Contextual Aggregation
Pith reviewed 2026-05-22 16:57 UTC · model grok-4.3
The pith
Large language models can detect social signals in clinical transcripts, and an agreement-weighted ensemble using cross-model patterns improves accuracy and stability despite variations by race and visit segment.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Across three model families and multiple prompting strategies, LLMs reliably detect social signals from clinical transcripts without fine-tuning, though performance varies by patient race and visit segment. An agreement-weighted ensemble that draws on group-level agreement patterns among the models improves both accuracy and stability over the best individual model while remaining compatible with query-only API constraints.
What carries the argument
Agreement-weighted ensemble that aggregates LLM outputs by weighting each model according to observed group-level agreement patterns across transcripts.
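One way to picture the mechanism is the sketch below. It is not the paper's formula, which this page does not reproduce. It assumes binary labels, at least two models, and a weight equal to each model's mean agreement with the other models inside a group (for example a visit segment). The function names `group_agreement_weights` and `weighted_vote` are hypothetical.

```python
from collections import defaultdict


def group_agreement_weights(preds, groups):
    """Assumed weighting rule: per group, weight each model by its mean
    agreement with the other models on that group's items.

    preds:  {model_name: [0/1 label per item]}
    groups: [group id per item], same length as each label list
    """
    models = list(preds)
    if len(models) < 2:
        raise ValueError("agreement needs at least two models")
    agree = defaultdict(lambda: defaultdict(float))
    count = defaultdict(int)
    for i, g in enumerate(groups):
        count[g] += 1
        for m in models:
            others = [o for o in models if o != m]
            same = sum(preds[m][i] == preds[o][i] for o in others)
            agree[g][m] += same / len(others)
    return {g: {m: agree[g][m] / count[g] for m in models} for g in count}


def weighted_vote(preds, groups, weights):
    """Weighted majority vote per item; ties resolve to label 0."""
    out = []
    for i, g in enumerate(groups):
        w = weights[g]
        total = sum(w.values())
        score = sum(w[m] * preds[m][i] for m in preds)
        out.append(1 if 2 * score > total else 0)
    return out


# Toy data, invented for illustration only.
preds = {"a": [1, 1, 0, 0], "b": [1, 1, 0, 1], "c": [0, 1, 0, 1]}
groups = ["open", "open", "close", "close"]
weights = group_agreement_weights(preds, groups)
labels = weighted_vote(preds, groups, weights)
```

Because the weights come only from model outputs, a scheme like this needs no gradients or logits, which is consistent with the query-only API constraint described above.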
If this is right
- Communication quality in clinical encounters can be tracked continuously across large numbers of visits using only existing LLM APIs.
- Detection becomes less sensitive to demographic differences in patients or changes across stages of a visit.
- No custom training data or model fine-tuning is required, lowering the barrier to deployment in health-care settings.
- Stability of social-signal measurements increases, supporting more trustworthy downstream uses such as quality monitoring or training feedback.
Where Pith is reading between the lines
- The same agreement-weighting idea could be tested on conversational data outside medicine, such as customer-service calls or classroom discussions, to see whether demographic or contextual variability persists.
- Combining the ensemble with a small number of human checks on high-disagreement cases might further tighten performance without losing scalability.
- If agreement patterns turn out to be stable across institutions, the method could serve as a lightweight calibration layer for other LLM applications that process dialogue.
Load-bearing premise
That group-level agreement patterns observed across multiple LLMs under query-only API constraints provide a reliable and generalizable way to correct for performance variability tied to patient race and visit segment.
What would settle it
Apply the same ensemble procedure to a fresh set of clinical transcripts stratified by patient race and visit segment and check whether accuracy and stability gains disappear or reverse compared with the best single model.
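A minimal sketch of that check, assuming binary labels and balanced accuracy as the metric (the page quotes a balanced-accuracy figure for the ensemble). The helper names and the toy data are hypothetical. A negative gain in any stratum would be the kind of reversal the test is looking for.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def gain_by_stratum(y_true, ensemble, single, strata):
    """Ensemble minus single-model balanced accuracy, per stratum."""
    gains = {}
    for s in sorted(set(strata)):
        idx = [i for i, v in enumerate(strata) if v == s]
        truth = [y_true[i] for i in idx]
        ens = balanced_accuracy(truth, [ensemble[i] for i in idx])
        one = balanced_accuracy(truth, [single[i] for i in idx])
        gains[s] = ens - one
    return gains


# Toy data, invented for illustration only.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
strata = ["A"] * 4 + ["B"] * 4
ensemble = [1, 0, 1, 0, 1, 0, 0, 0]
single = [1, 0, 0, 0, 1, 0, 1, 0]
gains = gain_by_stratum(y_true, ensemble, single, strata)
```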
Original abstract
Effective patient-provider communication is difficult to assess at scale. We examine whether large language models (LLMs) can track 20 social behaviors from clinical transcripts without fine-tuning. Across three model families and multiple prompting strategies, LLMs reliably detect social signals, though performance varies by patient race and visit segment. To address this variability under query-only API constraints, we introduce an agreement-weighted ensemble using group-level agreement patterns. This approach improves both accuracy and stability over the best individual model, demonstrating a practical pathway for scalable social signal tracking in clinical conversations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper examines whether LLMs can detect 20 social behaviors in clinical transcripts without fine-tuning. Across three model families and prompting strategies, detection is reported as reliable but with performance varying by patient race and visit segment. To handle variability under query-only API constraints, the authors introduce an agreement-weighted ensemble derived from group-level agreement patterns, claiming this improves both accuracy and stability over the best single model.
Significance. If validated, the work provides a practical, no-fine-tuning method for large-scale social signal processing in healthcare conversations. The ensemble approach under API constraints is a useful engineering contribution for reproducibility in clinical NLP. Credit is due for the multi-model evaluation and explicit handling of demographic variability in results.
major comments (2)
- [§4.2] §4.2 (Ensemble Construction): The agreement-weighted ensemble is defined using observed cross-model label agreement as a proxy for reliability, but the manuscript provides no analysis showing that high agreement correlates with ground-truth accuracy rather than shared demographic biases across the three model families. This is load-bearing for the central claim that the ensemble corrects race- and segment-linked variability.
- [Results section] Results section, performance tables: Improvements from the ensemble over the best individual model are reported without statistical significance tests, confidence intervals, or ablation on agreement thresholds; given the noted variability by race, it is unclear whether the stability gains are robust or merely reflect correlated errors.
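The first major comment asks whether agreement tracks correctness. A rough sketch of such a check follows, assuming binary labels and an unweighted majority vote as the prediction being scored. The function names and toy data are hypothetical, and this is not the analysis reported by the authors. If unanimous items are not more accurate than split items within a subgroup, agreement is a weak reliability proxy there.

```python
from itertools import combinations


def pairwise_agreement(labels):
    """Fraction of model pairs that give the same label for one item."""
    pairs = list(combinations(labels, 2))
    return sum(a == b for a, b in pairs) / len(pairs)


def accuracy_by_agreement(preds, y_true, groups):
    """Majority-vote accuracy per subgroup, split into items where all
    models agree ("unanimous") and items where they do not ("split")."""
    models = list(preds)
    hits = {}  # (group, bucket) -> [number correct, number of items]
    for i, g in enumerate(groups):
        labels = [preds[m][i] for m in models]
        bucket = "unanimous" if pairwise_agreement(labels) == 1.0 else "split"
        majority = 1 if 2 * sum(labels) > len(labels) else 0
        cell = hits.setdefault((g, bucket), [0, 0])
        cell[0] += int(majority == y_true[i])
        cell[1] += 1
    out = {}
    for (g, bucket), (correct, total) in hits.items():
        out.setdefault(g, {})[bucket] = correct / total
    return out


# Toy data, invented for illustration only.
preds = {"a": [1, 1, 0, 0], "b": [1, 0, 0, 1], "c": [1, 1, 0, 1]}
y_true = [1, 1, 0, 0]
groups = ["g1", "g1", "g2", "g2"]
table = accuracy_by_agreement(preds, y_true, groups)
```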
minor comments (2)
- [§3.1] The prompting strategy descriptions in §3.1 use inconsistent terminology for 'query-only' vs. 'contextual' variants; standardize notation for reproducibility.
- [Figure 2] Figure 2 (agreement heatmaps) lacks axis labels for visit segments; add explicit segment identifiers to improve clarity.
Simulated Author's Rebuttal
We thank the referee for their constructive comments. We address the major concerns regarding the ensemble construction and the statistical analysis of results below. We have incorporated revisions to strengthen the manuscript.
Point-by-point responses
- Referee: [§4.2] §4.2 (Ensemble Construction): The agreement-weighted ensemble is defined using observed cross-model label agreement as a proxy for reliability, but the manuscript provides no analysis showing that high agreement correlates with ground-truth accuracy rather than shared demographic biases across the three model families. This is load-bearing for the central claim that the ensemble corrects race- and segment-linked variability.
  Authors: We appreciate this observation. The original manuscript did not include an explicit correlation analysis between agreement levels and ground-truth accuracy. To address this, we have added an analysis in the revised §4.2 that computes the correlation between agreement scores and accuracy across demographic subgroups. The results show a positive correlation, suggesting that agreement serves as a reasonable proxy for reliability rather than solely reflecting shared biases. We also include a discussion of potential demographic biases in the models and how the ensemble approach helps mitigate variability observed by race and segment.
  Revision: yes
- Referee: [Results section] Results section, performance tables: Improvements from the ensemble over the best individual model are reported without statistical significance tests, confidence intervals, or ablation on agreement thresholds; given the noted variability by race, it is unclear whether the stability gains are robust or merely reflect correlated errors.
  Authors: We agree that the presentation of results would benefit from statistical rigor. In the revised manuscript, we have added statistical significance tests (using McNemar's test for paired comparisons) between the ensemble and individual models, along with 95% confidence intervals for all reported metrics in the performance tables. Furthermore, we conducted an ablation study varying the agreement threshold and report the impact on performance and stability in a new supplementary figure. These additions confirm that the observed improvements are statistically significant and robust across different thresholds, rather than arising from correlated errors.
  Revision: yes
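For readers unfamiliar with the paired test named in the rebuttal, here is a minimal sketch of an exact (binomial) McNemar test using only the standard library. The function name `mcnemar_exact` and the toy data are hypothetical. The rebuttal does not say which variant of the test was used, so the exact form is an assumption.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on paired predictions.

    Only discordant items matter: b counts items where A is right and
    B is wrong, c counts the reverse. Under the null hypothesis each
    discordant item is equally likely to fall in either cell.
    Returns (b, c, p_value).
    """
    b = c = 0
    for y, a_hat, b_hat in zip(y_true, pred_a, pred_b):
        a_ok, b_ok = a_hat == y, b_hat == y
        if a_ok and not b_ok:
            b += 1
        elif b_ok and not a_ok:
            c += 1
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, no evidence of a difference
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) * 0.5 ** n
    return b, c, min(1.0, 2 * tail)


# Toy data, invented for illustration only.
y_true = [1] * 10
ensemble = [1] * 8 + [0, 1]
single = [0] * 8 + [1, 1]
b, c, p = mcnemar_exact(y_true, ensemble, single)
```

In this toy case the ensemble wins 8 of the 9 discordant items. Note that a significant result shows the two systems differ on paired items; it does not by itself rule out errors that are correlated across the ensemble's member models.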
Circularity Check
No circularity in empirical LLM prompting and agreement-based ensemble
Full rationale
The paper reports experimental results from prompting three LLM families on clinical transcripts to detect 20 social signals, notes observed performance variation by patient race and visit segment, and constructs an agreement-weighted ensemble from cross-model label agreement patterns. No equations, derivations, or predictions are present that reduce to inputs by construction. The ensemble is a post-hoc aggregation rule computed from observed data rather than a fitted parameter or self-referential definition. No load-bearing self-citations or uniqueness theorems are invoked for the core claims. The work is self-contained empirical evaluation against direct accuracy and stability metrics.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: LLMs can classify social behaviors in clinical text from instructions alone without domain-specific fine-tuning.
Lean theorems connected to this paper
- IndisputableMonolith.Foundation.RealityFromDistinction · reality_from_one_distinction
  unclear: Relation between the paper passage and the cited Recognition theorem.
  We introduce Social-LM, our pipeline for modeling and evaluating SSP from clinical transcripts... agreement-weighted ensemble using group-level agreement patterns
- IndisputableMonolith.Cost.FunctionalEquation · washburn_uniqueness_aczel
  unclear: Relation between the paper passage and the cited Recognition theorem.
  Table 2... ensemble model... 0.606 balanced accuracy
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- Depression Detection at the Point of Care: Automated Analysis of Linguistic Signals from Routine Primary Care Encounters
  Zero-shot GPT-OSS detects depression from 1,108 primary care encounter transcripts with AUPRC 0.51 and AUROC 0.77, with meaningful signals in the first 128 patient tokens and added value from dyadic mirroring.