LLaVA-Med: Training a large language-and-vision assistant for biomedicine in one day · 2023

4 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

baseline 1

citation-polarity summary

baseline 1

representative citing papers

A Clinical Point Cloud Paradigm for In-Hospital Mortality Prediction from Multi-Level Incomplete Multimodal EHRs

cs.LG · 2026-04-06 · unverdicted · novelty 7.0

HealthPoint represents clinical events as points in a 4D space (content, time, modality, case) and applies low-rank relational attention to achieve state-of-the-art mortality prediction from multi-level incomplete multimodal EHRs.

Instruction-Free Tuning of Large Vision Language Models for Medical Instruction Following

cs.CV · 2026-03-19 · unverdicted · novelty 6.0

Instruction-free tuning of LVLMs on medical image-description pairs via momentum proxy instructions and response shuffling achieves SOTA accuracy on VQA tasks across SKINCON, WBCAtt, CBIS, and MIMIC-CXR.

AD-Copilot: A Vision-Language Assistant for Industrial Anomaly Detection via Visual In-context Comparison

cs.CV · 2026-03-14 · conditional · novelty 6.0

AD-Copilot trains an MLLM on a new curated industrial dataset Chat-AD with a Comparison Encoder that uses cross-attention on image pairs, reaching 82.3% accuracy on MMAD and 3.35x gains on MMAD-BBox while generalizing and exceeding human experts on some tasks.

MedLVR: Latent Visual Reasoning for Reliable Medical Visual Question Answering

cs.CV · 2026-04-10 · unverdicted · novelty 5.0

MedLVR interleaves latent visual reasoning segments in autoregressive decoding and uses two-stage training to raise average medical VQA accuracy from 48.3% to 53.4% over a Qwen2.5-VL-7B backbone on OmniMedVQA and five other benchmarks.

citing papers explorer

A Clinical Point Cloud Paradigm for In-Hospital Mortality Prediction from Multi-Level Incomplete Multimodal EHRs · cs.LG · 2026-04-06 · unverdicted · none · ref 47
Instruction-Free Tuning of Large Vision Language Models for Medical Instruction Following · cs.CV · 2026-03-19 · unverdicted · none · ref 11
AD-Copilot: A Vision-Language Assistant for Industrial Anomaly Detection via Visual In-context Comparison · cs.CV · 2026-03-14 · conditional · none · ref 67
MedLVR: Latent Visual Reasoning for Reliable Medical Visual Question Answering · cs.CV · 2026-04-10 · unverdicted · none · ref 32
