DermAgent orchestrates seven vision-language tools in a Plan-Execute-Reflect loop with dual-modality retrieval from 413k cases and a critic module to outperform GPT-4o by 17.6% in zero-shot dermatological diagnosis accuracy.
and Ko, Justin and Swetter, Susan M
19 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (19)
roles: background (4)
representative citing papers
Rough-set analysis finds 16.4% of 305 concept profiles in Derm7pt inconsistent (306 images), capping hard CBM accuracy at 92.1%; symmetric filtering produces a 705-image consistent benchmark where EfficientNet-B5 reaches 0.90 label accuracy.
citing papers explorer
- DermAgent: A Self-Reflective Agentic System for Dermatological Image Analysis with Multi-Tool Reasoning and Traceable Decision-Making
  DermAgent orchestrates seven vision-language tools in a Plan-Execute-Reflect loop with dual-modality retrieval from 413k cases and a critic module to outperform GPT-4o by 17.6% in zero-shot dermatological diagnosis accuracy.
- Quantifying Explanation Consistency: The C-Score Metric for CAM-Based Explainability in Medical Image Classification
  The C-Score quantifies intra-class explanation consistency for CAM methods via confidence-weighted pairwise soft IoU and detects AUC-consistency dissociation as an early warning for model instability on chest X-ray classification.
- The $\alpha$-Index: A Penalized Authorship-Integrity Framework for Position-Weighted Scientific Contribution
  The α-index is a conserved position-weighted authorship framework with a senior-author penalty that decreases credit as the number of middle authors increases.
- Jaguar: Fast Private CNN Inference with Power-of-Two Homomorphic Arithmetic
  Jaguar replaces prime-modulus HE with power-of-two arithmetic to enable coefficient-domain convolution and local-shift truncation, reporting 2-3.7x lower latency than Cheetah and Rhombus on ResNet-18/50 and MobileNetV2.
- Measuring What Matters: Synthetic Benchmarks for Concept Bottleneck Models
  Introduces synthetic benchmarks for concept bottleneck models that control data modality, concept choice, annotation quality, and completeness to evaluate performance in decision support and automation.
- Multi-agent AI systems outperform human teams in creativity
  Multi-agent LLM teams outperform human teams in creativity (d=1.50) across tasks by producing more novel ideas, with distinct semantic exploration patterns predicting success for each group.
- Prognostic Value of Lung Ultrasound Biomarkers for Readmission Risk in Congestive Heart Failure: A Pilot Data-Driven Analysis
  Pilot study uses pretrained video encoder features from lung ultrasound to predict 30-day CHF readmission, finding lower-lung views and temporal differences most informative with top MLP F1 of 0.80.
- ShardTensor: Domain Parallelism for Scientific Machine Learning
  ShardTensor is a domain-parallelism system for SciML that enables flexible scaling of extreme-resolution spatial datasets by removing the constraint of batch size one per device.
- To Build or Not to Build? Factors that Lead to Non-Development or Abandonment of AI Systems
  A scoping review and empirical analysis produce a six-category taxonomy of factors driving AI non-development and abandonment, showing that practical issues like resource limits and organizational dynamics often outweigh ethical concerns in real decisions.
- Zero-Shot Generative De-identification: Inversion-Free Flow for Privacy-Preserving Skin Image Analysis
  Zero-shot inversion-free flow method de-identifies skin images in under 20 seconds while preserving pathological features with IoU stability exceeding 0.67 using segment-by-synthesis and CIELAB decoupling.
- MARVEL: Margin-Aware Robust von Mises-Fischer Expert Learning for Long-Tailed Out-of-Distribution Detection
  MARVEL introduces a multi-expert NvMF-based system with an outlier expert that reduces FPR95 in OOD detection on medical datasets by 8-37%.
- MLFFM-SegDiff: A Multi-Level Feature Fusion Diffusion Model for Skin Lesion Segmentation
  MLFFM-SegDiff adds a multi-level feature fusion module and dual-path encoder to a diffusion U-Net, reporting improved Jaccard (0.8546) and Dice (0.9207) scores over baselines on three skin lesion datasets.
- IViT: A Novel Interpretable Visual Transformer for Skin Disease Detection
  IViT applies quadratic programming to a pre-trained Vision Transformer with a multi-objective loss, achieving 93.80% accuracy on six skin disease datasets (0.21% below baseline) while reducing feature redundancy by 29.5% and producing clinically consistent activations.
- Cascade Classification of Dermoscopic Images of Skin Neoplasms with Controllable Sensitivity and External Clinical Validation
  Cascade classification improves macro F1 over single-stage for some models by allowing sensitivity control but reveals a large generalization gap on external clinical data.
- Patient-Level Diagnosis of Acute Myeloid Leukemia via Deep Learning Analysis of Bone Marrow Smear
  YOLO segmentation plus EfficientNet classification aggregates cell predictions to patient-level CBLC ratios, reporting weighted F1 scores of 0.87-0.91 on three external center cohorts from 89 patients.
- Methodology for Creating a Clinically Verified Dermoscopic Image Dataset
  Describes a methodology and the resulting dataset of 1,026 dermoscopic images with structured metadata and verified diagnostic labels for medical informatics research.
- Clinical Validation of the Melanoscope AI Mobile Dermoscopy Clinical Decision Support System
  Prospective single-center validation of a cascade deep learning dermoscopy CDSS found no false negatives for five malignant lesions and 88.3% specificity, with quantitative IoU assessment of attention maps.
- CNNs, Transformers, Hybrid, and Vision Language Models for Skin Cancer Detection
  Benchmark of twelve models finds hybrid CNN-transformer architectures and a SigLIP vision-language model deliver the strongest overall performance on skin cancer detection using the PAD-UFES-20 dataset.