Retrieval-Augmented Large Language Models for Evidence-Informed Guidance on Cannabidiol Use in Older Adults
Pith reviewed 2026-05-16 14:08 UTC · model grok-4.3
The pith
Retrieval-augmented large language models deliver more cautious cannabidiol guidance for older adults than standalone models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Retrieval-augmented models, especially the ensemble version, consistently generated more cautious and guideline-aligned recommendations on cannabidiol use in older adults across three automated evaluation strategies, outperforming standalone large language models in 64 tested scenarios.
What carries the argument
A retrieval-augmented generation framework that integrates multiple retrieval systems with structured prompts and curated cannabidiol evidence, with the aim of producing context-aware and safe outputs.
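The abstract does not say how the ensemble combines its retrieval systems. As a minimal sketch of one common option, the snippet below fuses ranked lists with reciprocal rank fusion (RRF). The choice of RRF, the constant `k=60`, and the document ids are assumptions for illustration, not details taken from the paper.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in.
    RRF is an assumed fusion rule here, not the paper's stated method.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of two retrievers over a curated evidence base.
dense_hits = ["dosing", "interactions", "sleep"]
keyword_hits = ["interactions", "liver", "dosing"]

fused = reciprocal_rank_fusion([dense_hits, keyword_hits])
print(fused)
```

Documents that several retrievers rank highly rise to the top, which is one plausible reason an ensemble could surface interaction warnings more reliably than a single retriever.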
If this is right
- Retrieval augmentation leads to safer AI recommendations in health domains involving potential drug interactions.
- Ensemble approaches combining multiple retrieval methods yield the highest alignment with guidelines.
- Automated evaluation frameworks can assess AI safety without manual annotation.
- Such systems can assist older adults and caregivers in understanding appropriate cannabidiol use.
- The framework is reproducible for testing other AI health applications.
Where Pith is reading between the lines
- If validated in clinical settings, this could reduce harm from inaccurate AI health advice on supplements.
- The technique may apply to guidance on other substances or medications with evolving evidence.
- Future models could incorporate real-time evidence updates to maintain alignment with latest guidelines.
- Human-AI hybrid systems might use this retrieval method to support healthcare professionals.
Load-bearing premise
The curated evidence on cannabidiol is complete and up-to-date with current guidelines, and the automated metrics accurately reflect real-world safety and alignment.
What would settle it
Human experts reviewing a sample of the AI outputs and finding that retrieval-augmented responses are not more cautious or are less aligned with guidelines than standalone model outputs.
Original abstract
Older adults commonly experience chronic conditions such as pain and sleep disturbances and may consider cannabidiol for symptom management. Safe use requires appropriate dosing, careful titration, and awareness of drug interactions, yet stigma and limited health literacy often limit understanding. Conversational artificial intelligence systems based on large language models and retrieval-augmented generation may support cannabidiol education, but their safety and reliability remain insufficiently evaluated. This study developed a retrieval-augmented large language model framework that combines structured prompt engineering with curated cannabidiol evidence to generate context-aware guidance for older adults, including those with cognitive impairment. We also proposed an automated, annotation-free evaluation framework to benchmark leading standalone and retrieval-augmented models in the absence of standardized benchmarks. Sixty-four diverse user scenarios were generated by varying symptoms, preferences, cognitive status, demographics, comorbidities, medications, cannabis history, and caregiver support. Multiple state-of-the-art models were evaluated, including a novel ensemble retrieval architecture that integrates multiple retrieval systems. Across three automated evaluation strategies, retrieval-augmented models consistently produced more cautious and guideline-aligned recommendations than standalone models, with the ensemble approach performing best. These findings demonstrate that structured retrieval improves the reliability and safety of AI-driven cannabidiol education and provide a reproducible framework for evaluating AI tools used in sensitive health contexts.
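The abstract lists the factors that were varied but not the levels or the sampling design. The sketch below shows one way 64 scenarios could be drawn from a grid of factor combinations. The factor names follow the abstract; every level, the two-level design, the random sampling, and the seed are hypothetical.

```python
import itertools
import random

# Hypothetical two-level factors. Names follow the abstract; levels are invented.
FACTORS = {
    "symptom": ["pain", "sleep disturbance"],
    "preference": ["oil", "capsule"],
    "cognitive_status": ["no impairment", "cognitive impairment"],
    "age_group": ["65-74", "75+"],
    "comorbidity": ["none", "type 2 diabetes"],
    "medication": ["none", "anticoagulant"],
    "cannabis_history": ["naive", "prior use"],
    "caregiver_support": ["no", "yes"],
}


def build_scenarios(factors, n, seed=0):
    """Sample n distinct scenarios from the full grid of factor combinations."""
    names = list(factors)
    grid = [dict(zip(names, combo)) for combo in itertools.product(*factors.values())]
    rng = random.Random(seed)  # fixed seed so the scenario set is reproducible
    return rng.sample(grid, n)  # sampling without replacement


scenarios = build_scenarios(FACTORS, 64)
print(len(scenarios), scenarios[0])
```

Sampling from an explicit grid makes it easy to report which combinations were covered, which bears on the referee's question about whether the scenarios capture real clinical complexity.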
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a retrieval-augmented generation (RAG) framework that combines structured prompt engineering with a curated cannabidiol evidence base to generate context-aware guidance for older adults on CBD use, including those with cognitive impairment. It generates 64 diverse scenarios by varying symptoms, demographics, comorbidities, medications, and other factors, then evaluates leading standalone and RAG models (including a novel ensemble retrieval architecture) using three automated annotation-free strategies. The central claim is that RAG models, especially the ensemble, consistently produce more cautious and guideline-aligned recommendations than standalone models.
Significance. If the automated evaluation strategies can be shown to correlate with expert clinical judgment, the work could supply a reproducible, annotation-free framework for benchmarking AI safety in sensitive health-education domains where dosing errors and drug interactions carry direct risk.
major comments (2)
- [Evaluation Framework] The central claim that RAG models (particularly the ensemble) produce more cautious and guideline-aligned output rests entirely on the three automated annotation-free evaluation strategies, yet the manuscript supplies no quantitative metrics, no explicit description of the strategies (e.g., lexical caution markers, retrieval overlap, or prompt-derived heuristics), and no external validation that these proxies correlate with actual clinical safety or fidelity to cannabidiol guidelines. This is load-bearing for the result.
- [Evidence Base] The curated cannabidiol evidence base is treated as ground truth for measuring guideline alignment without reported completeness checks, expert curation audit, or assessment of its representativeness of current clinical guidelines.
minor comments (2)
- [Abstract] The abstract asserts that retrieval-augmented models 'consistently produced more cautious and guideline-aligned recommendations' but reports no specific quantitative metrics, effect sizes, or per-strategy scores to support this statement.
- [Scenario Generation] Clarify the precise procedure used to generate the 64 scenarios and confirm that they capture real clinical complexity (e.g., polypharmacy interactions, cognitive impairment effects) rather than surface-level variations.
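The Significance paragraph above makes the work's value conditional on the automated scores correlating with expert clinical judgment. A minimal sketch of such a check is a Spearman rank correlation between automated scores and expert ratings for the same responses. The function names are hypothetical, and no such analysis is reported in the paper.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(automated_scores, expert_ratings):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(automated_scores), average_ranks(expert_ratings))
```

A value near 1 would indicate that the automated proxy orders responses the way clinicians do; a value near 0 would undercut the central claim.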
Simulated Author's Rebuttal
We thank the referee for their thorough review and constructive comments on our manuscript. We have revised the paper to address the major concerns raised and provide detailed responses below.
- Referee: [Evaluation Framework] The central claim that RAG models (particularly the ensemble) produce more cautious and guideline-aligned output rests entirely on the three automated annotation-free evaluation strategies, yet the manuscript supplies no quantitative metrics, no explicit description of the strategies (e.g., lexical caution markers, retrieval overlap, or prompt-derived heuristics), and no external validation that these proxies correlate with actual clinical safety or fidelity to cannabidiol guidelines. This is load-bearing for the result.
  Authors: We agree that the original manuscript did not provide sufficient detail on the evaluation framework, which is indeed central to our claims. In the revised manuscript, we have added a new subsection (3.4 Evaluation Strategies) that explicitly describes each of the three automated, annotation-free strategies. This includes: (1) lexical caution scoring based on predefined markers (e.g., frequency of phrases recommending medical consultation or low-dose initiation), with reported quantitative results showing higher caution in RAG models; (2) retrieval-evidence overlap metrics quantifying how closely generated responses align with retrieved documents; and (3) prompt-derived heuristic checks for guideline elements such as interaction warnings. We now include specific quantitative metrics throughout the results section for each model and strategy. Regarding external validation, we acknowledge that demonstrating correlation with expert clinical judgment would strengthen the work but requires a dedicated follow-up study involving clinicians, which is beyond the current scope. We have expanded the limitations section to discuss this and outline plans for future validation.
  revision: yes
- Referee: [Evidence Base] The curated cannabidiol evidence base is treated as ground truth for measuring guideline alignment without reported completeness checks, expert curation audit, or assessment of its representativeness of current clinical guidelines.
  Authors: We thank the referee for pointing this out. The evidence base was assembled from a systematic search of PubMed, Cochrane reviews, and major clinical guidelines (e.g., from the FDA, NIH, and geriatric societies) published through 2023. In the revised version, we have included a detailed description in Section 2.2 and a new Appendix B that reports: completeness checks by topic coverage, the curation process (two authors independently reviewed sources with consensus), and an assessment of representativeness showing alignment with current recommendations on older adults. While we did not conduct a formal external expert audit, we have noted this as a limitation and clarified that the base serves as a representative synthesis rather than exhaustive ground truth. These additions provide greater transparency without altering the core findings.
  revision: yes
- Demonstrating direct correlation of the automated evaluation proxies with expert clinical judgments, as this would necessitate a new empirical study with healthcare professionals.
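The three strategies named in the first simulated author response above (lexical caution scoring, retrieval-evidence overlap, and heuristic checks for guideline elements) can be sketched as follows. This is an illustration under stated assumptions: the marker phrases, the required elements, the tokenizer, and all function names are hypothetical and do not reproduce the paper's implementation.

```python
import re

# Hypothetical caution phrases; the paper's actual marker list is not given.
CAUTION_MARKERS = (
    "consult your doctor",
    "talk to your pharmacist",
    "start low",
    "drug interaction",
)

# Hypothetical guideline elements, each detected by any of its keywords.
REQUIRED_ELEMENTS = {
    "interaction warning": ("interaction",),
    "clinician referral": ("doctor", "pharmacist"),
}


def caution_score(response):
    """Count occurrences of caution phrases in a response (case-insensitive)."""
    text = response.lower()
    return sum(text.count(marker) for marker in CAUTION_MARKERS)


def tokens(text):
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def evidence_overlap(response, retrieved_docs):
    """Fraction of response tokens that also occur in the retrieved documents."""
    response_tokens = tokens(response)
    if not response_tokens:
        return 0.0
    evidence_tokens = set().union(*(tokens(doc) for doc in retrieved_docs))
    return len(response_tokens & evidence_tokens) / len(response_tokens)


def missing_elements(response):
    """List required guideline elements with no matching keyword in the response."""
    text = response.lower()
    return [
        name
        for name, keywords in REQUIRED_ELEMENTS.items()
        if not any(keyword in text for keyword in keywords)
    ]


reply = "Start low and go slow. Consult your doctor about any drug interaction."
print(caution_score(reply), missing_elements(reply))
```

Proxies of this kind are cheap and reproducible, but a response can contain every marker and still give unsafe advice, which is why the unresolved point about expert validation matters.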
Circularity Check
No significant circularity detected
The paper generates 64 scenarios externally by varying symptoms, demographics, and comorbidities, then evaluates RAG versus standalone models using three automated annotation-free strategies that compare outputs to curated guidelines. No equations or derivations reduce a prediction to a fitted parameter from the same data, no self-definitional loops appear where X is defined via Y and then Y is predicted from X, and no load-bearing self-citations are invoked to force uniqueness or ansatz choices. The central claim that ensemble RAG produces more cautious outputs rests on independent comparison to external guidelines rather than tautological renaming or construction from inputs, making the derivation self-contained.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Curated cannabidiol evidence base is accurate and comprehensive for generating safe guidance.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
  unclear: Relation between the paper passage and the cited Recognition theorem.
  "Across three automated evaluation strategies, retrieval-augmented models consistently produced more cautious and guideline-aligned recommendations than standalone models, with the ensemble approach performing best."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.