Assessing the Generalizability of Deep Neural Networks-Based Models for Black Skin Lesions

Levy Chaves; Luana Barros; Sandra Avila

arxiv: 2310.00517 · v2 · submitted 2023-09-30 · 💻 cs.CV

Luana Barros, Levy Chaves, Sandra Avila

Pith reviewed 2026-05-24 06:27 UTC · model grok-4.3

classification 💻 cs.CV

keywords: skin lesion classification · deep neural networks · generalizability · skin tone bias · acral lesions · Fitzpatrick scale · melanoma detection · medical imaging

The pith

Deep neural network models for skin lesion diagnosis perform poorly on black skin lesions from acral regions compared to white skin.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper evaluates supervised and self-supervised deep neural network models on skin lesion images from acral regions such as palms, soles, and nails. These regions are common sites for melanoma in black individuals. The authors curate a dedicated dataset and assess model outcomes using the Fitzpatrick skin tone scale. Results show the models generalize poorly overall and perform better on white skin lesions. A sympathetic reader would care because such tools could aid diagnosis in areas with limited dermatology access, yet only if they work across skin tones.

Core claim

The central claim is that deep neural network models for skin lesion classification, which are trained mostly on datasets of white skin tones, exhibit poor generalizability to acral skin lesions typical in black patients. When tested on a carefully curated acral dataset stratified by the Fitzpatrick scale, the models deliver favorable performance only for lesions on white skin.

What carries the argument

Performance evaluation of supervised and self-supervised models on a curated dataset of acral skin lesions assessed via the Fitzpatrick skin tone scale.
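A stratified evaluation of this kind can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the grouping of Fitzpatrick types into "I-II" and "V-VI", and the toy scores are all assumptions made for the example. It computes the ROC AUC separately for each skin tone group, so that a gap between groups becomes visible instead of being averaged away.

```python
from collections import defaultdict

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when one class is missing
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_by_group(records):
    """records: iterable of (group, label, score). Returns {group: AUC or None}."""
    groups = defaultdict(lambda: ([], []))
    for group, label, score in records:
        labels, scores = groups[group]
        labels.append(label)
        scores.append(score)
    return {g: roc_auc(l, s) for g, (l, s) in sorted(groups.items())}

# Toy, made-up predictions (label 1 = melanoma); not data from the paper.
records = [
    ("I-II", 1, 0.9), ("I-II", 0, 0.2), ("I-II", 1, 0.8), ("I-II", 0, 0.3),
    ("V-VI", 1, 0.4), ("V-VI", 0, 0.6), ("V-VI", 1, 0.7), ("V-VI", 0, 0.5),
]
print(auc_by_group(records))  # {'I-II': 1.0, 'V-VI': 0.5}
```

Reporting the per-group sample sizes next to each AUC matters as much as the metric itself, since small groups give unstable estimates.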

If this is right

Diverse datasets covering multiple skin tones are required for equitable diagnostic performance.
Specialized models may need to be developed for accurate detection of acral lesions on black skin.
Without such inclusion, AI tools cannot deliver benefits to populations with limited access to dermatology.
Neglecting black skin lesions in dataset creation prevents responsible use of these technologies.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Data biases in training sets could translate into unequal real-world melanoma detection rates across skin tones.
Techniques such as targeted data collection or adaptation methods might mitigate the observed gaps.
Repeating the evaluation on additional independent datasets would help isolate skin tone as the causal factor.

Load-bearing premise

That observed performance gaps stem from skin tone differences rather than confounding factors such as image quality, lesion subtype distribution, or image acquisition site.

What would settle it

Finding equivalent accuracy on a new, large collection of black acral lesions after matching for image quality, subtype mix, and acquisition conditions would falsify the claim.
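One simple way to approximate the matching described here is exact matching on strata. The sketch below is an assumption-laden illustration rather than a method from the paper: the field names `subtype` and `quality`, the helper names, and the toy records are hypothetical. It subsamples two image sets so that every (subtype, quality) stratum contributes the same number of images to each set, which removes those two factors as explanations for a remaining performance gap.

```python
import random
from collections import defaultdict

def stratum_key(record):
    """Hypothetical confounders to match on: lesion subtype and image quality bin."""
    return (record["subtype"], record["quality"])

def match_on_strata(group_a, group_b, key=stratum_key, seed=0):
    """Subsample both groups so each shared stratum has equal counts in both.

    Strata present in only one group are dropped entirely.
    """
    rng = random.Random(seed)

    def bucket(items):
        out = defaultdict(list)
        for item in items:
            out[key(item)].append(item)
        return out

    a, b = bucket(group_a), bucket(group_b)
    matched_a, matched_b = [], []
    for stratum in sorted(set(a) & set(b)):
        n = min(len(a[stratum]), len(b[stratum]))
        matched_a += rng.sample(a[stratum], n)
        matched_b += rng.sample(b[stratum], n)
    return matched_a, matched_b

# Toy, made-up metadata; not records from any real dataset.
dark = [
    {"id": "d1", "subtype": "melanoma", "quality": "high"},
    {"id": "d2", "subtype": "nevus", "quality": "low"},
    {"id": "d3", "subtype": "nevus", "quality": "low"},
]
light = [
    {"id": "l1", "subtype": "melanoma", "quality": "high"},
    {"id": "l2", "subtype": "melanoma", "quality": "high"},
    {"id": "l3", "subtype": "nevus", "quality": "low"},
    {"id": "l4", "subtype": "nevus", "quality": "high"},
]
matched_dark, matched_light = match_on_strata(dark, light)
print(len(matched_dark), len(matched_light))  # 2 2
```

Exact matching discards data, so with small acral collections a reweighting scheme may be preferable; the point of the sketch is only that the comparison sets should share the same confounder distribution before accuracy is compared.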

Figures

Figures reproduced from arXiv: 2310.00517 by Levy Chaves, Luana Barros, Sandra Avila.

**Figure 2.** Each image corresponds to a melanoma sample and is associated with a ...

**Figure 3.** Evaluation pipeline for all models. Given a test image, we adopt the final ...

Original abstract

Melanoma is the most severe type of skin cancer due to its ability to cause metastasis. It is more common in black people, often affecting acral regions: palms, soles, and nails. Deep neural networks have shown tremendous potential for improving clinical care and skin cancer diagnosis. Nevertheless, prevailing studies predominantly rely on datasets of white skin tones, neglecting to report diagnostic outcomes for diverse patient skin tones. In this work, we evaluate supervised and self-supervised models in skin lesion images extracted from acral regions commonly observed in black individuals. Also, we carefully curate a dataset containing skin lesions in acral regions and assess the datasets concerning the Fitzpatrick scale to verify performance on black skin. Our results expose the poor generalizability of these models, revealing their favorable performance for lesions on white skin. Neglecting to create diverse datasets, which necessitates the development of specialized models, is unacceptable. Deep neural networks have great potential to improve diagnosis, particularly for populations with limited access to dermatology. However, including black skin lesions is necessary to ensure these populations can access the benefits of inclusive technology.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript evaluates supervised and self-supervised deep neural network models on skin lesion images extracted from acral regions, curates a new dataset of such lesions assessed on the Fitzpatrick scale to represent black skin, and concludes that the models exhibit poor generalizability to black skin lesions while performing favorably on white skin.

Significance. If the performance gap can be rigorously attributed to skin tone after controlling for confounders, the result would underscore an important fairness limitation in dermatological AI and support calls for more inclusive datasets.

major comments (2)

[Dataset curation and evaluation] Dataset curation section: the manuscript provides no indication that white-skin comparison sets (e.g., ISIC-derived) were matched or stratified on lesion subtype distribution, image resolution/quality, or acquisition site/device; without such controls the attribution of lower performance to skin tone rather than these confounders cannot be established.
[Results] Results and abstract: the central claim of 'poor generalizability' is stated without any reported quantitative metrics (accuracy, AUC, etc.), confidence intervals, dataset sizes, or statistical tests comparing the acral black-skin set to the white-skin baseline, so the data-to-claim link cannot be evaluated.

minor comments (1)

[Abstract] Abstract: key numerical results and dataset sizes should be included to allow readers to assess the magnitude of the reported performance differences.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. The comments highlight important aspects of dataset controls and quantitative reporting that we will address in the revision. Below we respond point-by-point to the major comments.

Referee: [Dataset curation and evaluation] Dataset curation section: the manuscript provides no indication that white-skin comparison sets (e.g., ISIC-derived) were matched or stratified on lesion subtype distribution, image resolution/quality, or acquisition site/device; without such controls the attribution of lower performance to skin tone rather than these confounders cannot be established.

Authors: We agree that explicit matching or stratification on lesion subtype, image quality, and acquisition factors would strengthen causal attribution to skin tone. The current manuscript focuses on curating and evaluating a new acral black-skin dataset against standard white-skin benchmarks (ISIC-derived) without describing such controls. In the revised version we will expand the Dataset Curation section to report all available metadata on the comparison sets, note any limitations in matching, and add a discussion of potential confounders. Where data permit, we will also include supplementary analyses that stratify or match on subtype distribution.

revision: yes

Referee: [Results] Results and abstract: the central claim of 'poor generalizability' is stated without any reported quantitative metrics (accuracy, AUC, etc.), confidence intervals, dataset sizes, or statistical tests comparing the acral black-skin set to the white-skin baseline, so the data-to-claim link cannot be evaluated.

Authors: The full manuscript contains experimental results, yet we acknowledge that the abstract and Results section do not present the quantitative metrics, confidence intervals, dataset sizes, or statistical comparisons in sufficient detail. In the revision we will update the abstract with key performance numbers and ensure the Results section includes all accuracy/AUC values, confidence intervals, sample sizes, and appropriate statistical tests for the black-skin versus white-skin comparisons.

revision: yes
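The confidence intervals requested in the report could be obtained with a percentile bootstrap. The sketch below is illustrative only: the function names, the choice of accuracy as the metric, the resampling settings, and the toy counts are assumptions, not values or methods from the manuscript. It estimates the accuracy difference between two independent test groups and a 95% interval around it.

```python
import random

def accuracy(pairs):
    """pairs: list of (true_label, predicted_label)."""
    return sum(y == p for y, p in pairs) / len(pairs)

def bootstrap_diff_ci(group_a, group_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for accuracy(group_a) - accuracy(group_b).

    Each group is resampled with replacement independently, which assumes
    the two test sets contain different patients and images.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resampled_a = rng.choices(group_a, k=len(group_a))
        resampled_b = rng.choices(group_b, k=len(group_b))
        diffs.append(accuracy(resampled_a) - accuracy(resampled_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    point = accuracy(group_a) - accuracy(group_b)
    return point, lower, upper

# Toy, made-up outcomes: 36/40 correct in one group, 24/40 in the other.
light_skin = [(1, 1)] * 36 + [(1, 0)] * 4
dark_skin = [(1, 1)] * 24 + [(1, 0)] * 16
point, lower, upper = bootstrap_diff_ci(light_skin, dark_skin)
print(f"difference = {point:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

An interval that excludes zero would support a real gap between groups; an interval that straddles zero would suggest the test sets are too small to tell. The same resampling loop works for AUC by swapping the metric function.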

Circularity Check

0 steps flagged

Empirical evaluation with no derivation chain or fitted predictions

This is a dataset curation and model evaluation study. The central claim (poor generalizability to black/acral lesions) rests on direct performance measurements across datasets, not on any equation, parameter fit, or self-citation that reduces the result to its own inputs. No equations, uniqueness theorems, or ansatzes appear in the provided text. The skeptic's concern about confounders is a validity issue, not a circularity issue. A score of 0 is the appropriate finding for an honest empirical paper.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The claim rests on the untested premise that the new acral dataset faithfully represents black-skin lesions and that skin-tone category is the dominant driver of performance difference; no free parameters or invented entities are introduced.

axioms (2)

Domain assumption: The Fitzpatrick scale provides a sufficient proxy for skin-tone-related appearance variation in lesion images.
Invoked when the authors assess datasets concerning the Fitzpatrick scale to verify performance on black skin.

Domain assumption: The selected supervised and self-supervised models are representative of current practice in skin-lesion classification.
The evaluation treats these models as standard baselines without further justification in the abstract.

Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages

[1] American Cancer Society. Key statistics for melanoma skin cancer. https://www.cancer.org/cancer/melanoma-skin-cancer/about/key-statistics.html, 2022

[2] AIM at Melanoma Foundation. What is acral lentiginous melanoma? https://www.aimatmelanoma.org/melanoma-101/types-of-melanoma/cutaneous-melanoma/acral-lentiginous-melanoma/

[3] Memorial Sloan Kettering Cancer Center. Types of melanoma. https://www.mskcc.org/cancer-care/types/melanoma/types-melanoma, 2022

[4] Yara Alves Caetano, Ana Maria Quinteiro Ribeiro, Bruno Ricardo da Silva Albernaz, Isabella de Paula Eleutério, and Luiz Fernando Fleury Fróes. Melanoma acral-estudo clínico e epidemiológico. Surgical & Cosmetic Dermatology, 2020

[5] Roni Caryn Rabin. Dermatology has a problem with skin color. https://www-nytimes-com.cdn.ampproject.org/c/s/www.nytimes.com/2020/08/30/health/skin-diseases-black-hispanic.amp.html, 2020

[6] Afonso Menegola, Michel Fornaciali, Ramon Pires, Flávia Vasques Bittencourt, Sandra Avila, and Eduardo Valle. Knowledge transfer for melanoma screening with deep learning. International Symposium on Biomedical Imaging, 2017

[7] Levy Chaves, Alceu Bissoto, Eduardo Valle, and Sandra Avila. An evaluation of self-supervised pre-training for skin-lesion analysis. In European Conference on Computer Vision Workshops, 2022

[8] Neil Singh. Decolonising dermatology: why black and brown skin need better treatment. The Guardian, 13, 2020

[9] DermNet. Fitzpatrick skin phototype. https://dermnetnz.org/topics/skin-phototype, 2012

[10] Dermatology Learning Network. Skin cancer in african-americans. https://www.hmpgloballearningnetwork.com/site/thederm/article/2547, 2004

[11] Matthew Groh, Caleb Harris, Luis Soenksen, Felix Lau, Rachel Han, Aerin Kim, Arash Koochek, and Omar Badri. Evaluating deep neural networks trained on clinical images in dermatology with the fitzpatrick 17k dataset. In Conference on Computer Vision and Pattern Recognition, 2021

[12] Chanki Yu, Sejung Yang, Wonoh Kim, Jinwoong Jung, Kee-Yang Chung, Sang Wook Lee, and Byungho Oh. Acral melanoma detection using a convolutional neural network for dermoscopy images. PloS one, 2018

[13] S Lee, YS Chu, SK Yoo, S Choi, SJ Choe, SB Koh, KY Chung, L Xing, B Oh, and S Yang. Augmented decision-making for acral lentiginous melanoma detection using deep convolutional neural networks. Journal of the European Academy of Dermatology and Venereology, 2020

[14] Qaiser Abbas, Farheen Ramzan, and Muhammad Usman Ghani. Acral melanoma detection using dermoscopic images and convolutional neural networks. Visual Computing for Industry, Biomedicine, and Art, 2021

[15] Neda Alipour, Ted Burke, and Jane Courtney. Skin type diversity: a case study in skin lesion datasets. 2023

[16] Andre Pacheco, Gustavo Lima, Amanda Salomão, Breno Krohling, Igor Biral, Gabriel Angelo, Fábio Jr, José Esgario, Alana Simora, Pedro Castro, Felipe Rodrigues, Patricia Frasson, et al. PAD-UFES-20: A skin lesion dataset composed of patient data and clinical images collected from smartphones. Data in Brief, 2020

[17] Roxana Daneshjou, Kailas Vodrahalli, Roberto A. Novoa, Melissa Jenkins, Weixin Liang, Veronica Rotemberg, Justin Ko, et al. Disparities in dermatology ai performance on a diverse, curated clinical image set. Science Advances, 2022

[18] Peter J Bevan and Amir Atapour-Abarghouei. Detecting melanoma fairly: Skin tone detection and debiasing for skin lesion classification. In MICCAI Workshop on Domain Adaptation and Representation Transfer, pages 1–11, 2022

[19] Arezou Pakzad, Kumar Abhishek, and Ghassan Hamarneh. Circle: Color invariant representation learning for unbiased classification of skin lesions. In European Conference on Computer Vision, 2022

[20] Eman Rezk, Mohamed Eltorki, Wael El-Dakhakhni, et al. Improving skin color diversity in cancer detection: deep learning approach. JMIR Dermatology, 5(3):e39143

[21] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In Conference on Computer Vision and Pattern Recognition, 2009

[22] ISIC Archive. https://www.isic-archive.com, 2023

[23] Eduardo Valle, Michel Fornaciali, Afonso Menegola, Julia Tavares, Flávia Vasques Bittencourt, Lin Tzy Li, and Sandra Avila. Data, depth, and design: Learning reliable models for skin lesion analysis. Neurocomputing, 2020

[24] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016

[25] Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, et al. Bootstrap your own latent - a new approach to self-supervised learning. In Advances in Neural Information Processing Systems, 2020

[26] Yonglong Tian, Chen Sun, Ben Poole, Dilip Krishnan, Cordelia Schmid, and Phillip Isola. What makes for good views for contrastive learning? In Advances in Neural Information Processing Systems, 2020

[27] Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. Momentum contrast for unsupervised visual representation learning. In Conference on Computer Vision and Pattern Recognition, 2020

[28] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning, 2020

[29] Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, and Armand Joulin. Unsupervised learning of visual features by contrasting cluster assignments. Advances in Neural Information Processing Systems, 2020

[30] Jeremy Kawahara, Sara Daneshvar, Giuseppe Argenziano, and Ghassan Hamarneh. Seven-point checklist and skin lesion classification using multitask multimodal neural nets. IEEE Journal of Biomedical and Health Informatics, 2019

[31] Richard P. Usatine and Brian D. Madden. Interactive dermatology atlas. https://www.dermatlas.net, 2023

[32] Dermis.net: Dermatology information service available on the internet. https://www.dermis.net/dermisroot/pt/home/index.htm, 2023

[33] Dermnet resource. https://dermnetnz.org, 2023

[34] Giuseppe Argenziano, Gabriella Fabbrocini, Paolo Carli, Vincenzo De Giorgi, Elena Sammarco, and Mario Delfino. Epiluminescence microscopy for the diagnosis of doubtful melanocytic skin lesions: comparison of the abcd rule of dermatoscopy and a new 7-point checklist based on pattern analysis. Archives of dermatology, 134(12):1563–1570, 1998

[35] Jehad Amin AlKattash. Dermaamin. https://www.dermaamin.com

[36] Samuel Freire da Silva. Atlas dermatologico. http://atlasdermatologico.com.br

Appendix A. In this section, we stratified the results of Table 4 by Fitzpatrick scale. Tables A1, A2, and A3 show the results for the DDI, Fitzpatrick 17k, and PAD-UFES-20* datasets, respectively. Table A1: Evaluation metrics for DDI dataset. #Mel and #Ben ind...
