Evaluation of 6233 MedGPTs finds 25-30% with low factual accuracy, 33.6-54.3% violating operational thresholds, and 57% of action-enabled models lacking privacy disclosures.
Chatdoctor: A medical chat model fine- tuned on a large language model meta-ai (llama) using medical domain knowledge
9 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 2polarities
background 2representative citing papers
Exclusive Unlearning makes LLMs safe by forgetting all but retained domain knowledge, protecting against jailbreaks while preserving useful responses in areas like medicine and math.
LLaVA-Med is created via curriculum fine-tuning on PubMed figure-caption pairs and GPT-4 self-instructed data, achieving competitive or better results than prior supervised models on three biomedical VQA benchmarks.
DualSFT derives parameter masks and data subsets as row- and column-wise aggregations of one gradient interaction matrix under first- and second-order validation-improvement approximations.
RECOVER is an LLM-powered RPM system for postoperative GI cancer care, built from 7 participatory design sessions and 5 patient interviews, then piloted with 4 staff and 5 patients to derive design strategies and responsible AI insights.
Introduces Tree Generation (TG-SFT) to generate synthetic instruction-tuning data from LLMs, reducing catastrophic forgetting when fine-tuning MLLMs on domain-specific or multimodal data.
The paper surveys hallucination in LLMs with an innovative taxonomy, factors, detection methods, benchmarks, mitigation strategies, and open research directions.
A 14B model trained on synthetic data from Brazilian clinical guidelines outperforms larger LLMs on new benchmarks for Brazilian healthcare protocols.
A systematic review of memory designs, evaluation methods, applications, limitations, and future directions for LLM-based agents.
citing papers explorer
-
Do No Harm? Hallucination and Actor-Level Abuse in Web-Deployed Medical Large Language Models
Evaluation of 6233 MedGPTs finds 25-30% with low factual accuracy, 33.6-54.3% violating operational thresholds, and 57% of action-enabled models lacking privacy disclosures.
-
Exclusive Unlearning
Exclusive Unlearning makes LLMs safe by forgetting all but retained domain knowledge, protecting against jailbreaks while preserving useful responses in areas like medicine and math.
-
LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day
LLaVA-Med is created via curriculum fine-tuning on PubMed figure-caption pairs and GPT-4 self-instructed data, achieving competitive or better results than prior supervised models on three biomedical VQA benchmarks.
-
One Algorithm, Two Goals: Dual Scoring for Parameter and Data Selection in LLM Fine-Tuning
DualSFT derives parameter masks and data subsets as row- and column-wise aggregations of one gradient interaction matrix under first- and second-order validation-improvement approximations.
-
RECOVER: Designing a Large Language Model-based Remote Patient Monitoring System for Postoperative Gastrointestinal Cancer Care
RECOVER is an LLM-powered RPM system for postoperative GI cancer care, built from 7 participatory design sessions and 5 patient interviews, then piloted with 4 staff and 5 patients to derive design strategies and responsible AI insights.
-
Preserving Knowledge in Large Language Model with Model-Agnostic Self-Decompression
Introduces Tree Generation (TG-SFT) to generate synthetic instruction-tuning data from LLMs, reducing catastrophic forgetting when fine-tuning MLLMs on domain-specific or multimodal data.
-
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
The paper surveys hallucination in LLMs with an innovative taxonomy, factors, detection methods, benchmarks, mitigation strategies, and open research directions.
-
Teaching LLMs Brazilian Healthcare: Injecting Knowledge from Official Clinical Guidelines
A 14B model trained on synthetic data from Brazilian clinical guidelines outperforms larger LLMs on new benchmarks for Brazilian healthcare protocols.
-
A Survey on the Memory Mechanism of Large Language Model based Agents
A systematic review of memory designs, evaluation methods, applications, limitations, and future directions for LLM-based agents.