Domain-adaptive pre-training on a new French health corpus yields limited gains and risks general capability loss unless followed by model merging, which can even boost specialized performance.
hub
Medalpaca–an open-source collection of medical conversational ai models and training data
15 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
Evaluation of 6233 MedGPTs finds 25-30% with low factual accuracy, 33.6-54.3% violating operational thresholds, and 57% of action-enabled models lacking privacy disclosures.
Collaborative Parameter Learning freezes 50-75% of parameters whose updates cause forgetting and updates only the 25-50% that mitigate it, allowing LLMs to learn 20-48% more new questions with negligible forgetting and lower compute cost.
CURE-MED pairs a new 13-language medical reasoning benchmark with curriculum RL to raise logical correctness to 70% and language consistency to 95% at 32B scale while outperforming baselines.
DoctorAgent-RL trains a Qwen2.5-7B doctor agent via multi-agent RL on the new MTMedDialog dataset to conduct dynamic, question-driven consultations, reaching 70% exact diagnostic match in real-patient trials.
HuatuoGPT-o1 achieves superior medical complex reasoning by using a verifier to curate reasoning trajectories for fine-tuning and then applying RL with verifier-based rewards.
LLaVA-Med is created via curriculum fine-tuning on PubMed figure-caption pairs and GPT-4 self-instructed data, achieving competitive or better results than prior supervised models on three biomedical VQA benchmarks.
FedSDR augments federated self-distillation with dual LoRA streams (local smoothing and global rectification) to produce globally aligned, factually faithful models under statistical heterogeneity.
LoRA fine-tuning of 3-4B SLMs on 162K multi-task radiology data yields strong performance deployable on consumer CPUs at 4-8 tokens/second.
SFT followed by GRPO improves LLM accuracy and reasoning recall in disease classification from radiology reports on three radiologist-annotated datasets.
FedShield-LLM integrates pruning and FHE on LoRA parameters to support secure, scalable federated fine-tuning of LLMs such as Llama-2.
MedRoute applies RL-based dynamic routing to select specialist LMM agents in a multi-agent medical diagnosis system, outperforming static baselines on text and image datasets.
Systematic comparison of nine text-only and three multimodal LLMs using in-context learning, reasoning prompts, fine-tuning, and multimodal fusion on DementiaBank speech data finds class-centroid demonstrations and token-level fine-tuning most effective, with adapted open models matching or beating
The paper surveys data-centric strategies for foundation models in computational healthcare and supplies a curated list of related models and datasets.