arxiv: 2604.21933 · v1 · submitted 2026-03-23 · 💻 cs.HC


Not Another EHR: Reimagining Physician Information Needs with Generative AI Technology

Ruican Zhong, Jiachen Li, Gary Hsieh, David W. McDonald, Selin S. Everett, Alyssa Unell, Jonathan Carlson, Katie Claveau, Noel Codella, Khalil Malik, Scott Mackie, Eduardo Olvera, Scott Saponas, Eric Horvitz, David Rhew, Jim Weinstein, Jacob Gross, Amanda K. Hall


Pith reviewed 2026-05-15 00:02 UTC · model grok-4.3

classification 💻 cs.HC

keywords: electronic health records, generative AI, physician workflows, adaptive interfaces, clinical information needs, user interface design, cognitive load


The pith

Generative AI can power adaptive interfaces that help physicians navigate and synthesize complex patient data in real time.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Electronic health records have made patient information more accessible but have also created significant cognitive burdens due to data volume and complexity. This position paper proposes that advances in large language models open the way for generative interfaces that dynamically adapt to a physician's specific questions and workflow needs. Interviews with physicians reveal key challenges in data navigation and synthesis during diagnosis, along with how clinicians expect AI to assist. The authors derive design considerations for interfaces that align with these mental models to build appropriate trust and interaction patterns. Such approaches could shift from static record displays to responsive systems that reduce information overload.

Core claim

Drawing on interviews with physicians about their information needs and how they conceptualize AI, the paper argues that generative AI can support clinician-centered workflows through dynamic interactions with patient data, moving beyond the limitations of traditional EHR systems.

What carries the argument

Generative user interfaces that use large language models to enable adaptive, query-driven synthesis of patient data based on physicians' diagnostic workflows and trust expectations.

If this is right

Clinicians could query patient records in natural language to receive synthesized summaries tailored to the current diagnostic step.
Interfaces would adapt dynamically as the physician's focus shifts during a case.
Designs informed by mental models could increase appropriate reliance on AI outputs.
Overall cognitive load from data review would decrease, allowing more time for direct patient care.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

These generative interfaces might be generalized across different healthcare systems if they rely on common data standards.
Integration with existing EHR platforms could face technical hurdles related to data privacy and model accuracy.
Further studies in varied clinical environments would be needed to validate the interview findings.

Load-bearing premise

Findings from interviews with a small number of internal Microsoft physicians will generalize to other clinical settings and translate into practical, effective generative interface designs.

What would settle it

A controlled study in which physicians using generative AI interfaces show no reduction in time spent on data review or no improvement in diagnostic accuracy compared to using conventional EHRs.

Figures

Figures reproduced from arXiv: 2604.21933 by Alyssa Unell, Amanda K. Hall, David Rhew, David W. McDonald, Eduardo Olvera, Eric Horvitz, Gary Hsieh, Jacob Gross, Jiachen Li, Jim Weinstein, Jonathan Carlson, Katie Claveau, Khalil Malik, Noel Codella, Ruican Zhong, Scott Mackie, Scott Saponas, Selin S. Everett.

**Figure 1.** This figure presents physicians’ current workflow to collect and analyze patient information throughout patient visits.

**Figure 2.** This figure illustrates the informational needs that physicians identified, and the corresponding roles (scribe/intern, colleague, …

Original abstract

Electronic health records (EHRs) have improved data accessibility but have also introduced cognitive burden for physicians, given the sheer volume and complexity of the data involved. Advances in large language models (LLMs) create new opportunities to rethink how clinicians interact with medical data through dynamic, adaptive interfaces. In this position paper, we explore how generative AI can support physicians' information needs by enabling more dynamic interactions with patient data. Through semi-structured interviews with internal physicians at Microsoft, we identify key challenges in data navigation and synthesis, and characterize clinicians' information needs during diagnostic workflows. We further examine how physicians conceptualize the ways AI can help their work process and how these mental models shape expectations for interaction and trust. Based on these insights, we discuss design considerations for generative user interfaces that support clinician-centered workflows.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Useful interview-based design considerations for AI in EHRs, but the internal Microsoft sample raises generalizability flags.


The core of this paper is a set of design considerations for generative interfaces that could ease physicians' cognitive load in EHRs. It draws from semi-structured interviews to map out information needs, navigation issues, and how doctors imagine AI fitting into their diagnostic process. That's the punchline: it's not presenting new tech or data, but a thoughtful synthesis aimed at guiding future interface work.

What the paper does well is connect LLM capabilities directly to specific workflow pain points, like handling the volume and complexity of data. The discussion of mental models and trust expectations adds depth: it shows the authors thought about how clinicians might actually use and rely on these systems, rather than just listing features. The interview insights seem to come from real conversations, which gives the proposals some grounding even if they're exploratory.

On the soft spots, the biggest one is the participant pool. All interviews were with internal Microsoft physicians, who probably bring a level of tech comfort and exposure that isn't typical across the broader medical field. The abstract doesn't spell out the number or selection process, which makes it tougher to assess how representative the findings are. If the full paper has more details on saturation or diversity, that would help, but as it stands, the risk of bias toward a tech-savvy group is there. Since it's a position paper, there's also no empirical test of the suggested designs, so readers have to take the translation to practice on faith.

This kind of work is aimed at researchers in human-computer interaction and medical informatics who are thinking about AI applications in clinical settings. It could spark ideas for prototypes or further studies. I think it deserves peer review: the topic is relevant, the approach is honest about its exploratory nature, and the concerns it raises are worth discussing in the community, even with the limitations on scope.

Referee Report

2 major / 1 minor

Summary. This position paper argues that advances in large language models enable a rethinking of physician interactions with electronic health records through dynamic, adaptive generative AI interfaces. Based on semi-structured interviews with internal Microsoft physicians, it identifies challenges in data navigation and synthesis during diagnostic workflows, characterizes clinicians' information needs and mental models of AI assistance, and proposes design considerations for generative user interfaces that support clinician-centered processes.

Significance. If the interview-derived insights prove robust and generalizable, the work could meaningfully inform HCI and health informatics research by shifting focus from incremental EHR improvements to generative AI-driven interfaces that reduce cognitive load and better align with clinical mental models. The position-paper format usefully surfaces forward-looking design considerations and trust-related expectations, providing a foundation for future empirical studies on AI-augmented clinical tools.

major comments (2)

[Abstract/Methods] The semi-structured interviews with internal Microsoft physicians are the sole empirical basis for identifying information needs, mental models, and design considerations, yet no sample size, recruitment criteria, interview protocol, saturation details, or analysis method are reported. This absence directly affects the ability to evaluate the validity and scope of the central claims.

[Discussion] The paper does not explicitly address potential biases or limits to generalizability arising from sampling only internal Microsoft physicians, who may differ systematically from academic, community, or international clinicians in technology exposure, workflow constraints, and institutional context. This is load-bearing for translating the insights into broadly applicable generative UI proposals.

minor comments (1)

[Abstract] The abstract would benefit from explicitly labeling the work as a position paper and briefly noting the qualitative synthesis approach to set reader expectations.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our position paper. The comments highlight important issues of methodological transparency and scope that we will address through revision. We respond to each major comment below.


Referee: [Abstract/Methods] The semi-structured interviews with internal Microsoft physicians are the sole empirical basis for identifying information needs, mental models, and design considerations, yet no sample size, recruitment criteria, interview protocol, saturation details, or analysis method are reported. This absence directly affects the ability to evaluate the validity and scope of the central claims.

Authors: We agree that the current manuscript lacks sufficient detail on the interview methodology. Although the work is framed as a position paper in which the interviews primarily serve to inform forward-looking design considerations rather than to generate generalizable empirical results, we will revise the manuscript to add a dedicated Methods subsection. This subsection will report the sample size, recruitment criteria, interview protocol, thematic analysis procedure, and any steps taken toward saturation.

revision: yes

Referee: [Discussion] The paper does not explicitly address potential biases or limits to generalizability arising from sampling only internal Microsoft physicians, who may differ systematically from academic, community, or international clinicians in technology exposure, workflow constraints, and institutional context. This is load-bearing for translating the insights into broadly applicable generative UI proposals.

Authors: We agree that the sampling frame introduces potential biases and constrains generalizability, and that these issues should be stated explicitly. In the revised manuscript we will expand the Discussion to include a dedicated limitations subsection that addresses differences in technology exposure, institutional context, and workflow constraints relative to academic, community, and international settings. We will also note the implications for the proposed design considerations and suggest avenues for future validation with broader clinician populations.

revision: yes

Circularity Check

0 steps flagged

No significant circularity in interview-driven position paper


The paper is a position paper that draws insights from semi-structured interviews with internal Microsoft physicians to characterize information needs and discuss design considerations for generative AI interfaces. There are no equations, derivations, fitted parameters, or self-citations that form a load-bearing chain reducing claims to inputs by construction. The central claims rest on external interview data rather than self-referential modeling, so no step of the argument assumes its own conclusion.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities; this is a qualitative position paper without mathematical or formal modeling components.


