AI-Care: A Conversational Agentic System for Task Coordination in Alzheimer's Disease Care
Pith reviewed 2026-05-12 02:30 UTC · model grok-4.3
The pith
AI-Care is a LangGraph-based conversational agent that coordinates daily tasks for Alzheimer's patients through natural language with caregiver-grounded safety controls and multi-turn clarification.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A preliminary pilot with four individuals with mild-to-moderate AD/ADRD showed that users found the system trustworthy, competent, and likable, and were able to complete the evaluated coordination tasks through conversation.
Load-bearing premise
That qualitative feedback from four pilot users in an unspecified setting is sufficient to indicate the system reduces cognitive load and maintains safety for the broader population of people with AD/ADRD.
Original abstract
Individuals with Alzheimer's disease (AD) and Alzheimer's disease-related dementia (ADRD) experience memory and thinking changes that impact their ability to use digital daily management tools. For example, adding an event to a digital calendar requires multiple steps that may act as barriers to independent use for individuals with AD/ADRD. This paper presents AI-Care, a conversational agentic artificial intelligence (AI) layer built on top of a remote caregiving platform co-designed with people with AD/ADRD. AI-Care is designed to reduce the cognitive load on individuals with AD/ADRD when managing everyday tasks such as setting calendar reminders and organizing to-do lists through natural-language interaction with a voice-first chatbot. The system uses a LangGraph-based stateful orchestration approach in which each request passes through sanitization, intent classification, context loading, safety checks, deterministic slot collection, tool execution, and response composition. Safety-critical responses, particularly around medications and allergies, are grounded in caregiver-verified records rather than free-form model generation. The system does not make autonomous medical or treatment decisions. Incomplete or ambiguous requests are handled through controlled multi-turn clarification rather than silent failure or guessing. The system supports both typed and spoken input, with voice output through ElevenLabs text-to-speech. Longer responses are chunked before synthesis to avoid rushed playback. A preliminary pilot with four individuals with mild-to-moderate AD/ADRD showed that users found the system trustworthy, competent, and likable, and were able to complete the evaluated coordination tasks through conversation. We describe the design goals, system architecture, safety controls, and findings from this formative evaluation.
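The staged flow the abstract describes (sanitization, intent classification, context loading, safety checks, deterministic slot collection, tool execution, response composition) can be illustrated with a minimal sketch. This is plain Python standing in for the LangGraph orchestration, not the authors' implementation: every function name, the keyword rules, the time regex, and the reply strings are hypothetical placeholders chosen only to show how one request might pass through fixed stages and how a missing slot leads to a clarification question instead of a guess.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TurnState:
    """Shared state handed from stage to stage (hypothetical fields)."""
    text: str
    intent: Optional[str] = None
    context: dict = field(default_factory=dict)
    blocked: bool = False
    slots: dict = field(default_factory=dict)
    missing: list = field(default_factory=list)
    tool_result: Optional[str] = None
    reply: str = ""
    trace: list = field(default_factory=list)


def sanitize(s):
    s.text = " ".join(s.text.split())  # collapse stray whitespace
    return s


def classify_intent(s):
    # Toy keyword rule; a real classifier is assumed to be richer.
    s.intent = "add_event" if "remind" in s.text.lower() else "unknown"
    return s


def load_context(s):
    s.context = {"source": "caregiver-verified records"}  # placeholder
    return s


def safety_check(s):
    # Toy rule: never act on dosage requests.
    s.blocked = "dose" in s.text.lower()
    return s


def collect_slots(s):
    # Deterministic extraction: a regex, not free-form generation.
    if s.intent == "add_event":
        m = re.search(r"\bat\s+(\d{1,2}(?::\d{2})?\s?(?:am|pm))", s.text, re.I)
        if m:
            s.slots["time"] = m.group(1)
        else:
            s.missing.append("time")
    return s


def execute_tool(s):
    # Act only when the request is safe, understood, and complete.
    if not (s.blocked or s.missing or s.intent == "unknown"):
        s.tool_result = f"{s.intent} saved for {s.slots['time']}"
    return s


def compose_response(s):
    if s.blocked:
        s.reply = "Please check with your caregiver about that."
    elif s.missing:
        s.reply = f"What {s.missing[0]} should I use?"  # clarification turn
    elif s.tool_result:
        s.reply = f"Done: {s.tool_result}."
    else:
        s.reply = "Could you say that another way?"
    return s


STAGES = [sanitize, classify_intent, load_context, safety_check,
          collect_slots, execute_tool, compose_response]


def run_turn(text):
    state = TurnState(text=text)
    for stage in STAGES:
        state = stage(state)
        state.trace.append(stage.__name__)
    return state
```

Calling `run_turn("Remind me to call Sam")` in this sketch leaves `tool_result` unset and returns a question about the missing time, which mirrors the controlled clarification behavior described in the abstract.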
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces AI-Care, a LangGraph-based conversational agentic system layered on a remote caregiving platform for individuals with mild-to-moderate Alzheimer's disease and related dementias (AD/ADRD). It details a pipeline for natural-language task coordination (e.g., calendar reminders, to-do lists) that includes sanitization, intent classification, context loading, safety checks grounded in caregiver-verified records, deterministic slot filling, tool execution, and multi-turn clarification. The system supports voice and text input with chunked TTS output and explicitly avoids autonomous medical decisions. A formative pilot with four participants reported that users perceived the system as trustworthy, competent, and likable and successfully completed the evaluated coordination tasks via conversation.
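The chunked TTS output mentioned in the summary can be pictured with a short sketch. The paper does not state how responses are split, so the sentence-boundary rule, the function name `chunk_for_tts`, and the `max_chars` limit below are all assumptions; the sketch only prepares text chunks and does not call any ElevenLabs API.

```python
import re


def chunk_for_tts(text, max_chars=200):
    """Pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than
    being cut mid-sentence (an assumption of this sketch).
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)   # close the current chunk
            current = sentence       # start a new one
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be synthesized separately, so playback can pause between chunks instead of delivering one long utterance.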
Significance. If the usability and safety claims can be substantiated with quantitative metrics and larger samples, the work would address a genuine barrier in digital self-management tools for AD/ADRD by providing a voice-first, stateful interface with explicit safety grounding. The explicit separation of verified caregiver data from model generation and the controlled clarification strategy are constructive design choices that could inform other assistive agents in high-variability cognitive-impairment domains.
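The separation of verified caregiver data from model generation can be sketched as a lookup that answers only from stored records and defers to the caregiver when none exists. The record schema, field names, and reply wording below are hypothetical; the manuscript as summarized here does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerifiedRecord:
    kind: str         # "medication" or "allergy" (assumed categories)
    name: str
    detail: str
    verified_by: str  # who confirmed the entry


FALLBACK = ("I don't have a verified record for that. "
            "Please check with your caregiver.")


def answer_safety_question(kind, records):
    """Compose a reply strictly from verified records of the given kind."""
    matches = [r for r in records if r.kind == kind]
    if not matches:
        return FALLBACK  # never improvise a safety-critical answer
    facts = "; ".join(f"{r.name}: {r.detail}" for r in matches)
    return f"According to your caregiver's records, {facts}."
```

The design point is that the reply text is assembled from stored fields, so a language model is never asked to recall or invent medication or allergy details.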
Major comments (2)
- [§5] §5 (Formative Evaluation / Pilot Results): The central claim that the system reduces cognitive load and supports safe task completion rests on qualitative impressions from n=4 participants; no task-completion rates, error counts, cognitive-load instruments, baseline comparisons, or safety-incident logs are reported, so the extrapolation to the broader mild-to-moderate AD/ADRD population lacks empirical warrant.
- [§4] §4 (System Architecture and Safety Controls): While the pipeline description states that safety-critical responses are grounded in caregiver-verified records, the manuscript supplies no quantitative assessment of how often safety checks are invoked, their precision/recall in the pilot, or failure modes when intent classification is uncertain, leaving the safety claim untested at the level required for the target population.
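For concreteness, the precision/recall assessment requested in the second major comment could be computed from annotated turns along the following lines. This is a sketch under assumptions: the labeling scheme (whether the safety check fired, and whether an annotator judged the turn safety-critical) is hypothetical, and any values passed in are the caller's own annotations, not data from the pilot.

```python
def safety_check_metrics(turns):
    """Compute precision and recall of safety-check invocation.

    turns: iterable of (check_fired, was_safety_critical) boolean pairs,
    one per annotated conversational turn.
    Returns (precision, recall); a value is None when undefined.
    """
    tp = sum(1 for fired, critical in turns if fired and critical)
    fp = sum(1 for fired, critical in turns if fired and not critical)
    fn = sum(1 for fired, critical in turns if not fired and critical)
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall
```

Recall is the more safety-relevant figure here, since a missed safety-critical turn (a false negative) is the failure mode the referee is concerned about.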
Minor comments (2)
- [Abstract / §3] The abstract and §3 mention ElevenLabs TTS and LangGraph without providing version numbers, configuration parameters, or rationale for these choices relative to alternatives.
- [§5] The pilot description does not specify the exact tasks, number of turns allowed, or criteria used to judge “successful completion,” making the reported positive outcomes difficult to replicate or compare.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. We value the recognition of AI-Care's design choices for safety grounding and multi-turn clarification in the AD/ADRD context. The comments correctly identify that the current evaluation is limited in scope and quantitative depth. Below we respond point by point, indicating where we will revise the manuscript to better contextualize the formative pilot while remaining faithful to the data collected.
Point-by-point responses
- Referee: [§5] §5 (Formative Evaluation / Pilot Results): The central claim that the system reduces cognitive load and supports safe task completion rests on qualitative impressions from n=4 participants; no task-completion rates, error counts, cognitive-load instruments, baseline comparisons, or safety-incident logs are reported, so the extrapolation to the broader mild-to-moderate AD/ADRD population lacks empirical warrant.
  Authors: We agree that the pilot provides only qualitative impressions from n=4 and does not include standardized instruments, baselines, or quantitative performance logs. The manuscript already labels the study as formative and preliminary; its purpose was to assess initial feasibility, user perceptions of trustworthiness and likability, and the viability of the LangGraph orchestration for natural-language task coordination. All four participants successfully completed the evaluated tasks through conversation. In the revised manuscript we will expand §5 with an explicit limitations subsection that states the small sample, qualitative focus, absence of cognitive-load scales or error-rate logging, and the preliminary nature of any claims about load reduction. We will also add more detail on the specific tasks and observed interaction patterns. We cannot add numerical task-completion rates, error counts, or instrument scores because the approved protocol collected only post-session interviews and observations.
  Revision: partial
- Referee: [§4] §4 (System Architecture and Safety Controls): While the pipeline description states that safety-critical responses are grounded in caregiver-verified records, the manuscript supplies no quantitative assessment of how often safety checks are invoked, their precision/recall in the pilot, or failure modes when intent classification is uncertain, leaving the safety claim untested at the level required for the target population.
  Authors: The manuscript describes the deterministic grounding of safety-critical actions in caregiver-verified records and the use of multi-turn clarification for uncertain intents, but it does not report invocation counts, precision/recall, or a systematic failure-mode analysis from the pilot. During the four sessions no safety incidents occurred. In the revision we will augment §4 with a clearer enumeration of the safety pipeline steps, examples of clarification triggers, and an explicit statement that quantitative assessment of the safety layer (invocation frequency, classification performance, failure modes) was not instrumented in this formative study and is planned for subsequent work. These points will also be referenced in the new limitations subsection of §5.
  Revision: partial
- Not addressed in the revision: quantitative safety metrics (invocation frequency, precision/recall) and standardized performance data (task-completion rates, cognitive-load scores, error counts) from the n=4 pilot, because these were not collected under the formative evaluation protocol.
Circularity Check
No circularity: system description and pilot report with no derivations or fitted quantities
Full rationale
The paper is a descriptive account of an AI system architecture (LangGraph orchestration, safety checks, voice I/O) plus a qualitative pilot with n=4 users. No equations, parameters, predictions, or uniqueness theorems appear. The pilot findings are reported as formative observations rather than extrapolated claims that reduce to prior inputs. No self-citations are load-bearing on any derivation. The content is self-contained against external benchmarks.