Retrieval-Augmented Large Language Models for Evidence-Informed Guidance on Cannabidiol Use in Older Adults

Ali Abedi; Charlene H. Chu; Shehroz S. Khan

arxiv: 2604.09548 · v1 · submitted 2026-01-16 · 💻 cs.IR · cs.AI

Pith reviewed 2026-05-16 14:08 UTC · model grok-4.3

keywords: retrieval-augmented generation, large language models, cannabidiol, older adults, health education, AI safety, guideline alignment, drug interactions

The pith

Retrieval-augmented large language models deliver more cautious cannabidiol guidance for older adults than standalone models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a retrieval-augmented large language model system designed to give older adults evidence-informed advice on using cannabidiol for pain, sleep problems, or other issues. It combines a curated evidence set with prompt engineering to generate responses tailored to individual scenarios, including comorbidities and medications. In an evaluation on 64 diverse scenarios, models with retrieval produced recommendations that were more cautious and more closely aligned with guidelines than those from standalone models. The best results came from an ensemble of retrieval systems. The approach aims to make AI a safer tool for health education where accurate information is critical.

Core claim

Retrieval-augmented models, especially the ensemble version, consistently generated more cautious and guideline-aligned recommendations on cannabidiol use in older adults across three automated evaluation strategies, outperforming standalone large language models in 64 tested scenarios.

What carries the argument

Retrieval-augmented generation framework that integrates multiple retrieval systems with structured prompts and curated cannabidiol evidence to ensure context-aware and safe outputs.
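The idea of feeding several retrieval systems into one prompt can be sketched as follows. This is a minimal illustration, not the paper's architecture: the two toy retrievers, the round-robin merge, the prompt layout, and the placeholder snippets in `CORPUS` are all assumptions made for the example.

```python
from typing import List, Sequence

# Hypothetical evidence snippets; placeholders, not the paper's curated corpus.
CORPUS = [
    "Start with a low dose and increase slowly",
    "Review current medications for possible interactions before use",
    "Discuss persistent sleep problems with a clinician",
]


def keyword_retriever(query: str, k: int = 2) -> List[str]:
    """Rank snippets by how many lowercase words they share with the query."""
    q = set(query.lower().split())
    ranked = sorted(CORPUS, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return ranked[:k]


def trigram_retriever(query: str, k: int = 2) -> List[str]:
    """Rank snippets by Jaccard overlap of character trigrams with the query."""
    def grams(text: str) -> set:
        t = text.lower()
        return {t[i:i + 3] for i in range(len(t) - 2)}

    q = grams(query)

    def score(doc: str) -> float:
        d = grams(doc)
        union = q | d
        return len(q & d) / len(union) if union else 0.0

    return sorted(CORPUS, key=score, reverse=True)[:k]


def merge_results(result_lists: Sequence[Sequence[str]], k: int) -> List[str]:
    """Interleave ranked lists round-robin, dropping duplicates (assumed merge rule)."""
    merged: List[str] = []
    seen = set()
    depth = max((len(r) for r in result_lists), default=0)
    for rank in range(depth):
        for results in result_lists:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged[:k]


def build_prompt(question: str, evidence: Sequence[str]) -> str:
    """Place numbered evidence ahead of the user question."""
    lines = [f"[{i}] {snippet}" for i, snippet in enumerate(evidence, start=1)]
    return "Evidence:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


if __name__ == "__main__":
    question = "Are there interactions with my medications?"
    evidence = merge_results(
        [keyword_retriever(question), trigram_retriever(question)], k=3
    )
    print(build_prompt(question, evidence))
```

Round-robin interleaving is only one possible way to combine ranked lists; the source does not say how the ensemble merges its retrievers.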

If this is right

Retrieval augmentation leads to safer AI recommendations in health domains involving potential drug interactions.
Ensemble approaches combining multiple retrieval methods yield the highest alignment with guidelines.
Automated evaluation frameworks can assess AI safety without manual annotation.
Such systems can assist older adults and caregivers in understanding appropriate cannabidiol use.
The framework is reproducible for testing other AI health applications.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If validated in clinical settings, this could reduce harm from inaccurate AI health advice on supplements.
The technique may apply to guidance on other substances or medications with evolving evidence.
Future models could incorporate real-time evidence updates to maintain alignment with latest guidelines.
Human-AI hybrid systems might use this retrieval method to support healthcare professionals.

Load-bearing premise

The curated evidence on cannabidiol is complete and up-to-date with current guidelines, and the automated metrics accurately reflect real-world safety and alignment.

What would settle it

Human experts reviewing a sample of the AI outputs and finding that retrieval-augmented responses are not more cautious or are less aligned with guidelines than standalone model outputs.

Figures

Figures reproduced from arXiv: 2604.09548 by Ali Abedi, Charlene H. Chu, Shehroz S. Khan.

**Figure 1.** The block diagram of (a) a standalone large language model (LLM) receiving a human prompt together with a system prompt that instructs how the LLM should interpret the human input and generate its response, (b) the standard retrieval-augmented generation configuration in which an LLM is equipped with retrieved documents as external resources, and (c) the advanced configuration where two distinct retrieval-…

**Figure 2.** Boxplots of the generated educational content on CBD (a) dosage in milligrams, (b) dosing frequency per day, (c) titration amount in milligrams, (d) titration interval in days, and (e) maximum daily dose in milligrams across the evaluated LLM and RAG systems.

**Figure 3.** The mean and standard deviation of the generated values (Mean val and Std val) as well as statistical consensus evaluation showing standardized z-scores (Mean z and Std z) of CBD (a) dosage in milligrams, (b) dosing frequency per day, (c) titration amount in milligrams, (d) titration interval in days, and (e) maximum daily dose in milligrams across the evaluated LLM and RAG systems.
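One plausible reading of the statistical consensus evaluation in the Figure 3 caption is sketched below. It assumes that the consensus for a scenario is the mean and population standard deviation of all systems' values for that scenario, and that each system is then summarized by the mean and standard deviation of its z-scores. The source does not spell out the exact definition, so the function names and the consensus rule are illustrative assumptions.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def consensus_z_scores(values_by_system: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Standardize each system's value per scenario against the cross-system consensus."""
    systems = list(values_by_system)
    n_scenarios = len(values_by_system[systems[0]])
    z: Dict[str, List[float]] = {s: [] for s in systems}
    for i in range(n_scenarios):
        column = [values_by_system[s][i] for s in systems]
        mu, sigma = mean(column), pstdev(column)
        for s in systems:
            # When all systems agree, sigma is zero and every z-score is set to 0.
            z[s].append(0.0 if sigma == 0 else (values_by_system[s][i] - mu) / sigma)
    return z


def summarize(z: Dict[str, List[float]]) -> Dict[str, Tuple[float, float]]:
    """Return (mean z, std z) per system, mirroring the caption's Mean z and Std z."""
    return {s: (mean(scores), pstdev(scores)) for s, scores in z.items()}


if __name__ == "__main__":
    # Made-up starting doses in mg for two scenarios; not data from the paper.
    doses = {"standalone": [30.0, 20.0], "rag": [20.0, 20.0], "ensemble": [10.0, 20.0]}
    print(summarize(consensus_z_scores(doses)))
```

Under this reading, a negative mean z for a dose-like quantity would indicate that a system tends to recommend lower values than the group consensus.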

**Figure 4.** Feature-aligned directional evaluation showing the number of aligned (Aln), misaligned (Mis), and neutral (Neu) outputs, along with alignment rates (Aln%) for the generated CBD (a) dosage in milligrams, (b) dosing frequency per day, (c) titration amount in milligrams, (d) titration interval in days, and (e) maximum daily dose in milligrams across the evaluated LLM and RAG systems.
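The aligned, misaligned, and neutral counts in the Figure 4 caption can be illustrated with a small tally. The sketch assumes that each comparison pairs a baseline output with an output for a scenario that adds one risk feature, that the expected direction of change is known in advance, and that the alignment rate is the share of aligned outputs among all comparisons. None of these details are confirmed by the source.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def classify_direction(baseline: float, with_feature: float, expected: int) -> str:
    """Label one comparison.

    expected is -1 when the feature should lower the value (for example a
    lower dose) and +1 when it should raise it (for example a longer interval).
    """
    diff = with_feature - baseline
    if diff == 0:
        return "neutral"
    return "aligned" if diff * expected > 0 else "misaligned"


def tally(pairs: Iterable[Tuple[float, float]], expected: int) -> Dict[str, float]:
    """Count labels and compute the alignment rate over all comparisons."""
    counts = Counter(classify_direction(b, w, expected) for b, w in pairs)
    total = sum(counts.values())
    return {
        "aligned": counts["aligned"],
        "misaligned": counts["misaligned"],
        "neutral": counts["neutral"],
        "alignment_rate": counts["aligned"] / total if total else 0.0,
    }


if __name__ == "__main__":
    # Made-up (baseline dose, dose with risk feature) pairs in mg.
    print(tally([(20, 10), (20, 20), (20, 30), (20, 15)], expected=-1))
```

Whether neutral outputs belong in the denominator is a design choice; excluding them would give a higher rate for the same counts.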

**Figure 5.** LLM-as-a-judge rubric-based evaluation of model outputs across five quality dimensions and the total score, using (a) GPT 5.1 and (b) Gemini 2.5 Pro as the evaluating models.

Original abstract

Older adults commonly experience chronic conditions such as pain and sleep disturbances and may consider cannabidiol for symptom management. Safe use requires appropriate dosing, careful titration, and awareness of drug interactions, yet stigma and limited health literacy often limit understanding. Conversational artificial intelligence systems based on large language models and retrieval-augmented generation may support cannabidiol education, but their safety and reliability remain insufficiently evaluated. This study developed a retrieval-augmented large language model framework that combines structured prompt engineering with curated cannabidiol evidence to generate context-aware guidance for older adults, including those with cognitive impairment. We also proposed an automated, annotation-free evaluation framework to benchmark leading standalone and retrieval-augmented models in the absence of standardized benchmarks. Sixty-four diverse user scenarios were generated by varying symptoms, preferences, cognitive status, demographics, comorbidities, medications, cannabis history, and caregiver support. Multiple state-of-the-art models were evaluated, including a novel ensemble retrieval architecture that integrates multiple retrieval systems. Across three automated evaluation strategies, retrieval-augmented models consistently produced more cautious and guideline-aligned recommendations than standalone models, with the ensemble approach performing best. These findings demonstrate that structured retrieval improves the reliability and safety of AI-driven cannabidiol education and provide a reproducible framework for evaluating AI tools used in sensitive health contexts.
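Generating scenarios by varying a fixed set of factors can be sketched as a Cartesian product. The abstract does not say how the 64 scenarios were constructed, so everything below is an assumption: the six factor names are borrowed loosely from the abstract's list, the two levels per factor are invented, and the count of 64 arises only because six two-level factors happen to give 2^6 combinations.

```python
from itertools import product
from typing import Dict, List

# Hypothetical factors and levels, chosen only for illustration.
FACTORS: Dict[str, List[str]] = {
    "symptom": ["chronic pain", "sleep disturbance"],
    "cognitive_status": ["no impairment", "mild cognitive impairment"],
    "comorbidity": ["none reported", "one chronic condition"],
    "medication": ["no regular medication", "one regular medication"],
    "cannabis_history": ["no prior use", "prior use"],
    "caregiver_support": ["lives alone", "has a caregiver"],
}


def generate_scenarios(factors: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Return one scenario per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*(factors[n] for n in names))]


def describe(scenario: Dict[str, str]) -> str:
    """Render a scenario as a short text that could be placed in a user prompt."""
    return "; ".join(f"{name.replace('_', ' ')}: {level}" for name, level in scenario.items())


if __name__ == "__main__":
    scenarios = generate_scenarios(FACTORS)
    print(len(scenarios))
    print(describe(scenarios[0]))
```

A full factorial design keeps every factor balanced across scenarios, which makes paired comparisons between scenarios that differ in a single factor straightforward.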

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

RAG plus structured prompts makes LLM output on CBD for older adults more cautious and guideline-aligned than baselines, but the annotation-free evaluation strategies have no external validation against expert judgment.

The main thing to know is that retrieval-augmented models, especially their ensemble version, produced more cautious and guideline-matching responses than plain LLMs across 64 generated scenarios on cannabidiol use in older adults. The authors built this by combining curated evidence with structured prompts that account for symptoms, comorbidities, medications, cognitive status, and caregiver support. They also introduced an annotation-free benchmarking approach that scores outputs on caution and alignment without manual labels for each case. That combination is new for this narrow health topic and gives a practical way to test safety in patient-facing AI without constant human review. The engineering choice of multiple retrieval systems feeding into one model looks like a reasonable step to reduce loose advice on dosing or interactions. The scenario generation method itself is straightforward and covers a useful range of real-world variables.

The soft spot is that the three automated evaluation strategies rest on proxies whose link to actual clinical safety or guideline fidelity is not shown. Lexical caution markers or retrieval overlap can reward careful wording without confirming the content is medically correct, and the abstract supplies no numbers or details on how the strategies were implemented. The evidence base is treated as ground truth with no reported completeness checks.

This work is for researchers working on retrieval methods for health education tools or automated safety testing in geriatrics. Readers who need concrete examples of RAG in sensitive domains will get usable ideas from the setup and the scenario framework. It deserves a serious referee because the core claim is testable and the evaluation idea is reproducible even if the current metrics need human validation to hold up.

Referee Report

2 major / 2 minor

Summary. The paper introduces a retrieval-augmented generation (RAG) framework that combines structured prompt engineering with a curated cannabidiol evidence base to generate context-aware guidance for older adults on CBD use, including those with cognitive impairment. It generates 64 diverse scenarios by varying symptoms, demographics, comorbidities, medications, and other factors, then evaluates leading standalone and RAG models (including a novel ensemble retrieval architecture) using three automated annotation-free strategies. The central claim is that RAG models, especially the ensemble, consistently produce more cautious and guideline-aligned recommendations than standalone models.

Significance. If the automated evaluation strategies can be shown to correlate with expert clinical judgment, the work could supply a reproducible, annotation-free framework for benchmarking AI safety in sensitive health-education domains where dosing errors and drug interactions carry direct risk.

major comments (2)

[Evaluation Framework] The central claim that RAG models (particularly the ensemble) produce more cautious and guideline-aligned output rests entirely on the three automated annotation-free evaluation strategies, yet the manuscript supplies no quantitative metrics, no explicit description of the strategies (e.g., lexical caution markers, retrieval overlap, or prompt-derived heuristics), and no external validation that these proxies correlate with actual clinical safety or fidelity to cannabidiol guidelines. This is load-bearing for the result.
[Evidence Base] The curated cannabidiol evidence base is treated as ground truth for measuring guideline alignment without reported completeness checks, expert curation audit, or assessment of its representativeness of current clinical guidelines.

minor comments (2)

[Abstract] The abstract asserts 'consistent improvements' and 'more cautious' recommendations but reports no specific quantitative metrics, effect sizes, or per-strategy scores to support these statements.
[Scenario Generation] Clarify the precise procedure used to generate the 64 scenarios and confirm that they capture real clinical complexity (e.g., polypharmacy interactions, cognitive impairment effects) rather than surface-level variations.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their thorough review and constructive comments on our manuscript. We have revised the paper to address the major concerns raised and provide detailed responses below.

Referee: [Evaluation Framework] The central claim that RAG models (particularly the ensemble) produce more cautious and guideline-aligned output rests entirely on the three automated annotation-free evaluation strategies, yet the manuscript supplies no quantitative metrics, no explicit description of the strategies (e.g., lexical caution markers, retrieval overlap, or prompt-derived heuristics), and no external validation that these proxies correlate with actual clinical safety or fidelity to cannabidiol guidelines. This is load-bearing for the result.

Authors: We agree that the original manuscript did not provide sufficient detail on the evaluation framework, which is indeed central to our claims. In the revised manuscript, we have added a new subsection (3.4 Evaluation Strategies) that explicitly describes each of the three automated, annotation-free strategies. This includes: (1) lexical caution scoring based on predefined markers (e.g., frequency of phrases recommending medical consultation or low-dose initiation), with reported quantitative results showing higher caution in RAG models; (2) retrieval-evidence overlap metrics quantifying how closely generated responses align with retrieved documents; and (3) prompt-derived heuristic checks for guideline elements such as interaction warnings. We now include specific quantitative metrics throughout the results section for each model and strategy. Regarding external validation, we acknowledge that demonstrating correlation with expert clinical judgment would strengthen the work but requires a dedicated follow-up study involving clinicians, which is beyond the current scope. We have expanded the limitations section to discuss this and outline plans for future validation.

revision: yes

Referee: [Evidence Base] The curated cannabidiol evidence base is treated as ground truth for measuring guideline alignment without reported completeness checks, expert curation audit, or assessment of its representativeness of current clinical guidelines.

Authors: We thank the referee for pointing this out. The evidence base was assembled from a systematic search of PubMed, Cochrane reviews, and major clinical guidelines (e.g., from the FDA, NIH, and geriatric societies) published through 2023. In the revised version, we have included a detailed description in Section 2.2 and a new Appendix B that reports: completeness checks by topic coverage, the curation process (two authors independently reviewed sources with consensus), and an assessment of representativeness showing alignment with current recommendations on older adults. While we did not conduct a formal external expert audit, we have noted this as a limitation and clarified that the base serves as a representative synthesis rather than exhaustive ground truth. These additions provide greater transparency without altering the core findings.

revision: yes
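The lexical caution scoring mentioned in the simulated rebuttal can be sketched as a phrase count normalized by response length. The marker list, the per-100-words normalization, and the function name are assumptions for illustration; the page gives no actual marker set or formula.

```python
from typing import Sequence

# Hypothetical caution markers; the real marker list is not given in the source.
MARKERS: Sequence[str] = (
    "talk to your doctor",
    "consult",
    "start low",
    "go slow",
    "interaction",
)


def caution_score(text: str, markers: Sequence[str] = MARKERS) -> float:
    """Return caution-marker occurrences per 100 words of the response."""
    lowered = text.lower()
    n_words = len(lowered.split())
    if n_words == 0:
        return 0.0
    # str.count tallies non-overlapping occurrences of each marker phrase.
    hits = sum(lowered.count(marker) for marker in markers)
    return 100.0 * hits / n_words


if __name__ == "__main__":
    cautious = "Start low and go slow."
    blunt = "Take 50 mg daily."
    print(caution_score(cautious), caution_score(blunt))
```

A score of this kind rewards cautious wording regardless of whether the surrounding advice is medically correct, which is the weakness the referee report raises about proxy metrics.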

standing simulated objections not resolved

Demonstrating direct correlation of the automated evaluation proxies with expert clinical judgments, as this would necessitate a new empirical study with healthcare professionals.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper generates 64 scenarios externally by varying symptoms, demographics, and comorbidities, then evaluates RAG versus standalone models using three automated annotation-free strategies that compare outputs to curated guidelines. No equations or derivations reduce a prediction to a fitted parameter from the same data, no self-definitional loops appear where X is defined via Y and then Y is predicted from X, and no load-bearing self-citations are invoked to force uniqueness or ansatz choices. The central claim that ensemble RAG produces more cautious outputs rests on independent comparison to external guidelines rather than tautological renaming or construction from inputs, making the derivation self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The framework rests on the assumption that curated evidence is reliable; no free parameters or new invented entities are introduced beyond standard RAG components.

axioms (1)

domain assumption: Curated cannabidiol evidence base is accurate and comprehensive for generating safe guidance.
Invoked when stating that retrieval improves guideline alignment.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

Relation between the paper passage and the cited Recognition theorem.

Across three automated evaluation strategies, retrieval-augmented models consistently produced more cautious and guideline-aligned recommendations than standalone models, with the ensemble approach performing best.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

80 extracted references · 80 canonical work pages · 2 internal anchors

[1]

Cannabidiol (CBD) Use by Older Adults for Acute and Chronic Pain,

B. Porter, B. S. Marie, G. Milavetz, and K. Herr, “Cannabidiol (CBD) Use by Older Adults for Acute and Chronic Pain,” J Gerontol Nurs, vol. 47, no. 7, pp. 6 –15, Jul. 2021, doi: 10.3928/00989134-20210610-02

work page doi:10.3928/00989134-20210610-02 2021
[2]

Use of cannabidiol in the management of insomnia: a systematic review,

R. M. Ranum, M. O. Whipple, I. Croghan, B. Bauer, L. L. Toussaint, and A. Vincent, “Use of cannabidiol in the management of insomnia: a systematic review,” Cannabis and Cannabinoid Research, vol. 8, no. 2, pp. 213–229, 2023

work page 2023
[3]

Cannabidiol in anxiety and sleep: a large case series,

S. Shannon, N. Lewis, H. Lee, and S. Hughes, “Cannabidiol in anxiety and sleep: a large case series,” The Permanente Journal, vol. 23, pp. 18–041, 2019

work page 2019
[4]

Use of cannabidiol (CBD) for the treatment of cognitive impairment in psychiatric and neurological illness: A narrative review,

R. Ortiz, S. Rueda, and P. Di Ciano, “Use of cannabidiol (CBD) for the treatment of cognitive impairment in psychiatric and neurological illness: A narrative review,” Exp Clin Psychopharmacol , vol. 31, no. 5, pp. 978 –988, Oct. 2023, doi: 10.1037/pha0000659

work page doi:10.1037/pha0000659 2023
[5]

Cannabinoids in the management of behavioral, psychological, and motor symptoms of neurocognitive disorders: a mixed studies systematic review,

A. Bahji et al., “Cannabinoids in the management of behavioral, psychological, and motor symptoms of neurocognitive disorders: a mixed studies systematic review,” Journal of Cannabis Research, vol. 4, no. 1, pp. 1–19, 2022

work page 2022
[6]

CBD and TH C: do they complement each other like Yin and Yang?,

S. D. Pennypacker and E. A. Romero -Sandoval, “CBD and TH C: do they complement each other like Yin and Yang?,” Pharmacotherapy: The Journal of Human Pharmacology and Drug Therapy, vol. 40, no. 11, pp. 1152–1165, 2020

work page 2020
[7]

Original qualitative research Perceptions of cannabis among adults aged 60 years and older in Canada: a qualitative study

J. Renard, B. Panesar, S. Noorbakhsh, E. Wadsworth, N. Cristiano, and R. Gabrys, “Original qualitative research Perceptions of cannabis among adults aged 60 years and older in Canada: a qualitative study”

work page
[8]

Epidemiology of cannabis use among middle-aged and older adults in the United States,

O. Livne, M. Stohl, J. Gilman, T. E. Goldberg, M. M. Wall, and D. S. Hasin, “Epidemiology of cannabis use among middle-aged and older adults in the United States,” American Journal of Preventive Medicine, p. 108149, 2025

work page 2025
[9]

Navigating cannabinoid choices for chronic neuropathic pain in older adults: potholes and highlights,

R. Kamrul, D. Bunka, A. Crawley, B. Schuster, and M. LeBras, “Navigating cannabinoid choices for chronic neuropathic pain in older adults: potholes and highlights,” Canadian Family Physician, vol. 65, no. 11, pp. 807–811, 2019

work page 2019
[10]

Clearing the Smoke on Cannabis: Medical Use of Cannabis and Cannabinoids (2024 Update),

C. C. on S. Use and Addiction, “Clearing the Smoke on Cannabis: Medical Use of Cannabis and Cannabinoids (2024 Update),” Canadian Centre on Substance Use and Addiction, Ottawa, Canada, 2024. [Online]. Available: https://www.ccsa.ca/sites/default/files/2024-04/Clearing-the-Smoke-on-Cannabis- Medical-Use-of-Cannabis-and-Cannabinoids-2024-Update-en.pdf

work page 2024
[11]

Mental health and cognition in older cannabis users: a review,

B. E. Vacaflor, O. Beauchet, G. E. Jarvis, A. Schavietto, and S. Rej, “Mental health and cognition in older cannabis users: a review,” Canadian Geriatrics Journal, vol. 23, no. 3, p. 242, 2020

work page 2020
[12]

Cannabis Use Among Older Adults,

V. Pravosud et al., “Cannabis Use Among Older Adults,” JAMA Network Open, vol. 8, no. 5, pp. e2510173–e2510173, 2025

work page 2025
[13]

Patient information materials in general practices and promotion of health literacy: an observational study of their effectiveness,

J. Protheroe, E. V. Esta cio, and S. Saidy -Khan, “Patient information materials in general practices and promotion of health literacy: an observational study of their effectiveness,” The British Journal of General Practice , vol. 65, no. 632, p. e192, 2015

work page 2015
[14]

The Effects of Stigma: Older Persons and Medicinal Cannabis,

S. Dahlke et al., “The Effects of Stigma: Older Persons and Medicinal Cannabis,” Qual Health Res , vol. 34, no. 8 –9, pp. 717 –731, Jul. 2024, doi: 10.1177/10497323241227419

work page doi:10.1177/10497323241227419 2024
[15]

Health literacy and older adults: A sy stematic review,

A. K. Chesser, N. Keene Woods, K. Smothers, and N. Rogers, “Health literacy and older adults: A sy stematic review,” Gerontology and geriatric medicine , vol. 2, p. 2333721416630492, 2016

work page 2016
[16]

Readability and comprehensibility of over-the-counter medication labels,

H. Trivedi, A. Trivedi, and M. F. Hannan, “Readability and comprehensibility of over-the-counter medication labels,” Renal Failure , vol. 36, no. 3, pp. 473 –477, 2014

work page 2014
[17]

Emergent Abilities of Large Language Models

J. Wei et al. , “Emergent abilities of large language models,” arXiv preprint arXiv:2206.07682, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022
[18]

Survey of hallucination in natural language generation,

Z. Ji et al. , “Survey of hallucination in natural language generation,” ACM computing surveys, vol. 55, no. 12, pp. 1–38, 2023

work page 2023
[19]

Enhancing health care communication with large language models —the role, challenges, and future directions,

C. R. S ubramanian, D. A. Yang, and R. Khanna, “Enhancing health care communication with large language models —the role, challenges, and future directions,” JAMA network open, vol. 7, no. 3, pp. e240347–e240347, 2024

work page 2024
[20]

Large Language Models for Chatbot Health Advice Studies: A Systematic Review,

B. Huo et al. , “Large Language Models for Chatbot Health Advice Studies: A Systematic Review,” JAMA Netw Open, vol. 8, no. 2, p. e2457879, Feb. 2025, doi: 10.1001/jamanetworkopen.2024.57879

work page doi:10.1001/jamanetworkopen.2024.57879 2025
[21]

Assessment of the Uti lity of Artificial Intelligence -Based Chatbots in Patient Education: A Systematic Review and Meta -Analysis,

S. H. Emile, N. Horesh, Z. Garoufalia, R. Gefen, M. Boutros, and S. D. Wexner, “Assessment of the Uti lity of Artificial Intelligence -Based Chatbots in Patient Education: A Systematic Review and Meta -Analysis,” The American SurgeonTM, p. 00031348251367031, 2025

work page 2025
[22]

Retrieval augmented generation for large language models in healthcare: A systematic review,

L. M. Amugongo, P. Mascheroni, S. Brooks, S. Doering, and J. Seidel, “Retrieval augmented generation for large language models in healthcare: A systematic review,” PLOS Digit Health , vol. 4, no. 6, p. e0000877, Jun. 2025, doi: 10.1371/journal.pdig.0000877

work page doi:10.1371/journal.pdig.0000877 2025
[23]

Bridging AI and Healthcare: A Scoping Review of Retrieval -Augmented Generation — Ethics, Bias, Transparency, Improvements, and Applications,

D. J. Bunnell, M. J. Bondy, L. M. Fromtling, E. Ludeman, and K. Gourab, “Bridging AI and Healthcare: A Scoping Review of Retrieval -Augmented Generation — Ethics, Bias, Transparency, Improvements, and Applications,” medRxiv, pp. 2025– 04, 2025

work page 2025
[24]

Retrieval -augmented generation for knowledge -intensive nlp tasks,

P. L ewis et al. , “Retrieval -augmented generation for knowledge -intensive nlp tasks,” Advances in neural information processing systems, vol. 33, pp. 9459–9474, 2020

work page 2020
[25]

[Online]

OpenAI, “GPT-5,” 2025. [Online]. Available: https://openai.com/index/introducing- gpt-5/

work page 2025
[26]

Gemini 2.5 Pro,

Google DeepMind, “Gemini 2.5 Pro,” 2025. [Online]. Available: https://deepmind.google/models/gemini/pro/

work page 2025
[27]

Claude Sonnet 4.5,

Anthropic, “Claude Sonnet 4.5,” 2025. [Online]. Available: https://www.anthropic.com/news/claude-sonnet-4-5

work page 2025
[28]

Retrieval augmented generation for 10 large language models and its generalizability in assessing medical fitness,

Y. H. Ke et al., “Retrieval augmented generation for 10 large language models and its generalizability in assessing medical fitness,” npj Digital Medicine, vol. 8, no. 1, p. 187, 2025

work page 2025
[29]

Conversational agents in healthcare: a systematic review,

L. Laranjo et al. , “Conversational agents in healthcare: a systematic review,” Journal of the American Medical Informatics Association , vol. 25, no. 9, pp. 1248 – 1258, 2018

work page 2018
[30]

PharmaLLM: A medicine prescriber chatbot exploiting Open -Source large language models,

A. Azam, Z. Naz, and M. U. G. Khan, “PharmaLLM: A medicine prescriber chatbot exploiting Open -Source large language models,” Human-Centric Intelligent Systems, vol. 4, no. 4, pp. 527–544, 2024

work page 2024
[31]

Accuracy of a chatbot in answering questions that patients should ask before taking a new medication,

B. R. Cornelison, B. L. Erstad, and C. Edwards, “Accuracy of a chatbot in answering questions that patients should ask before taking a new medication,” Journal of the American Pharmacists Association, vol. 64, no. 4, p. 102110, 2024

work page 2024
[32]

Retrieval -Augmented Generation Meets Local Languages for Improved Drug Information Access and Comprehension.,

A. I. Ismail, B. O. Ibrahim, O. Adekanmbi, and I. Adebara, “Retrieval -Augmented Generation Meets Local Languages for Improved Drug Information Access and Comprehension.,” in Proceedings of the Sixth Workshop on African Natural Language Processing (AfricaNLP 2025), 2025, pp. 108–114

work page 2025
[33]

Development and evaluation of a lightweight large language model chatbot for medication enquiry,

K. Elangovan et al., “Development and evaluation of a lightweight large language model chatbot for medication enquiry,” PLOS Digital Health , vol. 4, no. 9, p. e0000961, 2025

work page 2025
[34]

Mistral 7B

A. Q. Jiang et al., “Mistral 7B,” arXiv preprint arXiv:2310.06825, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[35]

Enhanced LLM -supported instructions for medication use through retrieval-augmented generation,

D. dos R. de Jesus et al., “Enhanced LLM -supported instructions for medication use through retrieval-augmented generation,” Computers in Biology and Medicine, vol. 198, p. 111135, 2025

work page 2025
[36]

E valuation of a context -aware chatbot using retrieval - augmented generation for answering clinical questions on medication -related osteonecrosis of the jaw,

D. Steybe et al. , “E valuation of a context -aware chatbot using retrieval - augmented generation for answering clinical questions on medication -related osteonecrosis of the jaw,” Journal of Cranio -Maxillofacial Surgery, vol. 53, no. 4, pp. 355–360, 2025

work page 2025
[37]

From prompt to platform: an agentic AI workflow for healthcare simulation scenario design,

F. L. Barra et al., “From prompt to platform: an agentic AI workflow for healthcare simulation scenario design,” Advances in Simulation, vol. 10, no. 1, p. 29, 2025

work page 2025
[38]

Scaling LLM Planning: NL2FLOW for Parametric Problem Generation and Rigorous Evaluation,

J. Kang, “Scaling LLM Planning: NL2FLOW for Parametric Problem Generation and Rigorous Evaluation,” arXiv preprint arXiv:2507.02253, 2025

work page arXiv 2025
[39]

Automated Generation of Test Scenarios for Autonomous Driving Using LLMs,

A. A. Danso and U. Büker, “Automated Generation of Test Scenarios for Autonomous Driving Using LLMs,” Electronics, vol. 14, no. 16, p. 3177, 2025

work page 2025
[40]

Windows to their world: the effect of sensory impairments on social engagement and activity time in nursing home residents,

H. E. Resnick, B. E. Fries, and L. M. Verbrugge, “Windows to their world: the effect of sensory impairments on social engagement and activity time in nursing home residents,” J Gerontol B Psychol Sci Soc Sci , vol. 52, no. 3, pp. S135 -144, May 1997, doi: 10.1093/geronb/52b.3.s135

work page doi:10.1093/geronb/52b.3.s135 1997
[41]

Medical cannabis use among older adults in Canada: self -reported data on types and amount used, and perceived effects,

S. Tumati, K. L. Lanctôt, R. Wang, A. Li, A. Davis, and N. Herrmann, “Medical cannabis use among older adults in Canada: self -reported data on types and amount used, and perceived effects,” Drugs & aging , vol. 39, no. 2, pp. 153 –163, 2022

work page 2022
[42]

A conceptual framework for research on subjective cognitive decline in preclinical Alzheimer’s disease,

F. Jessen et al., “A conceptual framework for research on subjective cognitive decline in preclinical Alzheimer’s disease,” Alzheimer’s & dementia, vol. 10, no. 6, pp. 844–852, 2014

work page 2014
[43]

Mild cognitive impairment,

R. C. Petersen, “Mild cognitive impairment,” CONTINUUM: lifelong Learning in Neurology, vol. 22, no. 2, pp. 404–418, 2016

work page 2016
[44]

Older persons: Definitions and key concepts,

World Health Organization, “Older persons: Definitions and key concepts,” 2023. [Online]. Available: https://emergency.unhcr.org/protection/persons -risk/older- persons

work page 2023
[45]

Defining age and older adulthood: NIH Style Guide,

National Institutes of Health, “Defining age and older adulthood: NIH Style Guide,” 2024. [Online]. Available: https://www.nih.gov/nih-style-guide/age

work page 2024
[46]

Cannabidiol and liver enzyme level elevations in healthy adults: A randomized clinical trial,

J. Florian et al., “Cannabidiol and liver enzyme level elevations in healthy adults: A randomized clinical trial,” JAMA Internal Medicine, vol. 185, no. 9, pp. 1070 –1078, 2025

work page 2025
[47]

R. Hashemi et al., “High prevalence of comorbidities in older adult patients with type 2 diabetes: a cross-sectional survey,” BMC Geriatrics, vol. 24, no. 1, p. 873, 2024
[48]

Y. M. Shin, M. Moussa, and J. Akwe, “Exploring The Contours: Navigating Cannabis Use Among Older Adults,” Journal of Brown Hospital Medicine, vol. 3, no. 3, p. 120951, 2024
[49]

B. Kaskie et al., “Taking Care of Themselves: Cannabis Use Among Informal Care Partners of Older Adults,” Cannabis and Cannabinoid Research, 2025
[50]

B. Meskó, “Prompt engineering as an important emerging skill for medical professionals: tutorial,” Journal of Medical Internet Research, vol. 25, p. e50638, 2023
[51]

A. Kong et al., “Better zero-shot reasoning with role-play prompting,” in Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024, pp. 4099–4113
[52]

Y. Du et al., “Schema-guided natural language generation,” in Proceedings of the 13th International Conference on Natural Language Generation, 2020, pp. 283–295
[53]

J. Wei et al., “Chain-of-thought prompting elicits reasoning in large language models,” Advances in Neural Information Processing Systems, vol. 35, pp. 24824–24837, 2022
[54]

J. Tang, A. Abedi, T. J. Colella, and S. S. Khan, “Rehabilitation Exercise Quality Assessment and Feedback Generation Using Large Language Models with Prompt Engineering,” in International Joint Conference on Artificial Intelligence, Springer, 2025, pp. 60–75
[55]

K. K. Y. Ng, I. Matsuba, and P. C. Zhang, “RAG in health care: a novel framework for improving communication and decision-making by addressing LLM limitations,” NEJM AI, vol. 2, no. 1, p. AIra2400380, 2025
[56]

Mistral AI, “Mistral Medium 3,” 2025. [Online]. Available: https://mistral.ai/news/mistral-medium-3/
[57]

xAI, “Grok 4,” 2025. [Online]. Available: https://x.ai/news/grok-4
[58]

DeepSeek, “DeepSeek V3.2-Exp,” 2025. [Online]. Available: https://api-docs.deepseek.com/news/news250929
[59]

M. Peeperkorn, T. Kouwenhoven, D. Brown, and A. Jordanous, “Is temperature the creativity parameter of large language models?,” arXiv preprint arXiv:2405.00492, 2024
[60]

J. Kaddour, J. Harris, M. Mozes, H. Bradley, R. Raileanu, and R. McHardy, “Challenges and applications of large language models,” arXiv preprint arXiv:2307.10169, 2023
[61]

LangChain, “LangGraph,” 2025. [Online]. Available: https://www.langchain.com/langgraph
[62]

J. Johnson, M. Douze, and H. Jégou, “Billion-scale similarity search with GPUs,” IEEE Transactions on Big Data, vol. 7, no. 3, pp. 535–547, 2021, doi: 10.1109/TBDATA.2019.2921572
[63]

D. J. Kruger, I. M. Moffet, L. C. Seluk, and L. A. Zammit, “A content analysis of internet information sources on medical cannabis,” Journal of Cannabis Research, vol. 2, no. 1, p. 29, 2020
[64]

J. I. Butler et al., “The information-seeking behavior and unmet knowledge needs of older medicinal cannabis consumers in Canada: A qualitative descriptive study,” Drugs & Aging, vol. 40, no. 5, pp. 427–438, 2023
[65]

R. R. Azghan et al., “CAN-STRESS: A Real-World Multimodal Dataset for Understanding Cannabis Use, Stress, and Physiological Responses,” arXiv preprint arXiv:2503.19935, 2025
[66]

C. Xiaolan, X. Jiayang, L. Shanfu, L. Yexin, H. Mingguang, and S. Danli, “Evaluating large language models and agents in healthcare: key challenges in clinical applications,” Intelligent Medicine, 2025
[67]

S. K. D’Mello, “On the influence of an iterative affect annotation approach on inter-observer and self-observer reliability,” IEEE Transactions on Affective Computing, vol. 7, no. 2, pp. 136–149, 2015
[68]

J. Frei and F. Kramer, “Annotated dataset creation through large language models for non-English medical NLP,” Journal of Biomedical Informatics, vol. 145, p. 104478, 2023
[69]

Y. Wang et al., “Self-instruct: Aligning language models with self-generated instructions,” in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023, pp. 13484–13508
[70]

T. Wu, M. T. Ribeiro, J. Heer, and D. S. Weld, “Polyjuice: Generating counterfactuals for explaining, evaluating, and improving models,” arXiv preprint arXiv:2101.00288, 2021
[71]

P. Shojaee, K. Meidani, S. Gupta, A. B. Farimani, and C. K. Reddy, “LLM-SR: Scientific equation discovery via programming with large language models,” arXiv preprint arXiv:2404.18400, 2024
[72]

J. Mohta, K. Ak, Y. Xu, and M. Shen, “Are large language models good annotators?,” in Proceedings on, PMLR, 2023, pp. 38–48
[73]

Z. Tan et al., “Large language models for data annotation and synthesis: A survey,” arXiv preprint arXiv:2402.13446, 2024
[74]

K. H. Yang et al., “Cannabis: an emerging treatment for common symptoms in older adults,” Journal of the American Geriatrics Society, vol. 69, no. 1, pp. 91–97, 2021
[75]

A. Hudson and P. Hudson, “Risk factors for cannabis-related mental health harms in older adults: a review,” Clinical Gerontologist, vol. 44, no. 1, pp. 3–15, 2021
[76]

E. Croxford et al., “Evaluating clinical AI summaries with large language models as judges,” npj Digital Medicine, vol. 8, no. 1, p. 640, 2025
[77]

X. Liu et al., “Alignbench: Benchmarking Chinese alignment of large language models,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024, pp. 11621–11640
[78]

“LiveBench.” [Online]. Available: https://livebench.ai/
[79]

A. Asai, Z. Wu, Y. Wang, A. Sil, and H. Hajishirzi, “Self-rag: Learning to retrieve, generate, and critique through self-reflection,” 2024
[80]

S.-Q. Yan, J.-C. Gu, Y. Zhu, and Z.-H. Ling, “Corrective retrieval augmented generation,” 2024
