Skin-R1: Clinical Knowledge-Guided Dermatological Diagnosis Using Vision-Language Models
Vision-language models (VLMs) have recently shown promise for assisting clinical reasoning in dermatological diagnosis. However, their trustworthiness and clinical utility remain limited by three key challenges: heterogeneous datasets with inconsistent diagnostic labels and concept annotations, the lack of grounded diagnostic rationales for reliable reasoning supervision, and limited scalability when transferring knowledge from small, densely annotated datasets to large collections with sparse labels. To address these challenges, we propose Skin-R1, a dermatology-oriented VLM that integrates textbook-grounded clinical reasoning supervision with reinforcement learning (RL) to improve the accuracy and robustness of diagnostic prediction. First, we construct a textbook-based reasoning generator that synthesizes hierarchy-aware and differential-diagnosis (DDx) diagnostic trajectories derived from authoritative dermatology knowledge. Second, these trajectories are used for supervised fine-tuning (SFT), establishing a clinically grounded reasoning foundation for the model. Finally, we introduce an RL training framework that incorporates the hierarchical structure of dermatological diseases into the reward design, enabling the model to generalize grounded diagnostic reasoning to large-scale datasets with sparse annotations. Extensive experiments across multiple dermatology benchmarks demonstrate that Skin-R1 consistently improves diagnostic accuracy and robustness compared to state-of-the-art Med-VLM baselines. Ablation studies further highlight the critical role of grounded reasoning supervision introduced during the SFT stage.
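The abstract states that the RL reward incorporates the hierarchical structure of dermatological diseases but does not give the formula. As a rough illustration of what a hierarchy-aware reward could look like, the sketch below gives partial credit when a predicted diagnosis shares ancestors with the true one. The taxonomy `PARENT`, the function names, and the scoring rule are all hypothetical assumptions made for illustration; they are not taken from the paper.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy (child -> parent). Not the paper's hierarchy.
PARENT: Dict[str, Optional[str]] = {
    "skin disease": None,
    "inflammatory": "skin disease",
    "neoplastic": "skin disease",
    "psoriasis": "inflammatory",
    "atopic dermatitis": "inflammatory",
    "melanoma": "neoplastic",
    "basal cell carcinoma": "neoplastic",
}


def path_to_root(label: str) -> List[str]:
    """Return the labels from the root down to `label`."""
    path: List[str] = []
    node: Optional[str] = label
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def hierarchical_reward(predicted: str, target: str) -> float:
    """Assumed scoring rule: partial credit in [0, 1] for shared ancestry."""
    if target not in PARENT:
        raise ValueError(f"unknown target label: {target!r}")
    if predicted not in PARENT:
        return 0.0  # label outside the taxonomy earns nothing
    if predicted == target:
        return 1.0
    pred_path = path_to_root(predicted)
    true_path = path_to_root(target)
    # Count the leading levels on which both paths agree.
    shared = 0
    for a, b in zip(pred_path, true_path):
        if a != b:
            break
        shared += 1
    # The root is always shared, so it earns no credit on its own.
    return (shared - 1) / (max(len(pred_path), len(true_path)) - 1)


if __name__ == "__main__":
    for guess in ["melanoma", "basal cell carcinoma", "psoriasis", "eczema?"]:
        print(guess, hierarchical_reward(guess, "melanoma"))
```

Under this assumed rule, confusing melanoma with another neoplastic condition is penalized less than confusing it with an inflammatory one, which is one way a reward can stay informative on datasets that only carry sparse or coarse labels.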
Forward citations
Cited by 1 Pith paper
- Regulating Anatomy-Aware Rewards via Trajectory-Integral Feedback for Volumetric Computed Tomography Analysis
TIF-GRPO uses integral feedback on pseudo-temporal trajectories to regulate anatomy-aware rewards in RL for clinical faithfulness in volumetric CT analysis.