Evaluating the diagnostic accuracy of AI models in dermatology: Addressing skin tone bias to improve outcomes for ethnic skin types

Presented at: Society for Investigative Dermatology 2025

Date: 2025-05-07

Summary: Artificial intelligence (AI) is increasingly integrated into dermatology for diagnostic support, yet its effectiveness across diverse skin tones remains limited. This study evaluates the diagnostic performance of AI dermatology models across Fitzpatrick skin types (FST) and assesses the representation of skin of color (SOC) in AI training datasets. A scoping review was conducted using PubMed and Google Scholar, focusing on studies that examined AI performance across FSTs and the demographic composition of AI training datasets. Findings indicate that AI models exhibit significantly lower diagnostic accuracy for darker skin tones, which may exacerbate existing disparities in dermatology education and undermine AI’s potential to improve equity. For instance, evaluations using the Diverse Dermatology Images (DDI) dataset revealed lower area under the receiver operating characteristic curve (ROC-AUC) values for FST V–VI than for FST I–II (0.55 vs. 0.64 for DeepDerm; 0.50 vs. 0.61 for ModelDerm; 0.57 vs. 0.72 for HAM10000). Additionally, many commonly used AI training datasets underrepresent SOC; in the Lesion Image Database, for example, only 4.3% of images show Black or African American skin. Incorporating the Stanford DDI dataset (65% SOC images) into model training significantly improved diagnostic accuracy, narrowing the performance gap between light and dark skin tones to nearly equal ROC-AUC values (p = 9.33 × 10^-5 for DeepDerm, p = 5.91 × 10^-10 for HAM10000; one-sided t test). These results underscore the need for diverse and representative training datasets to improve AI’s diagnostic accuracy for SOC. Strategies such as synthetic data augmentation, transfer learning, and transparency in dataset curation can enhance model reliability and reduce disparities in dermatologic care. For AI-driven dermatology to bridge healthcare disparities and improve diagnostic outcomes, it must incorporate underrepresented populations at inception.
Priyanka Kadam¹, Matthew Chen¹, Nitin Chetla², Samantha Sugerik¹, James Briley¹

1. Stony Brook Medicine, Stony Brook, NY, United States.
2. UVA Health, Charlottesville, VA, United States.

Minoritized Populations and Health Disparities Research
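
The subgroup comparison described in the summary (ROC-AUC for FST I–II versus FST V–VI) can be illustrated with a minimal Python sketch. This is not the authors' code or the evaluation pipeline of any cited study: the helper names `roc_auc` and `auc_by_group` are hypothetical, the AUC is computed with the standard pairwise-ranking definition rather than any particular library, and the toy records are made-up values chosen only to make the arithmetic easy to follow.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_group(records):
    """records: iterable of (group, label, score) tuples.
    Returns a dict mapping each group to its ROC-AUC."""
    grouped = defaultdict(lambda: ([], []))
    for group, label, score in records:
        grouped[group][0].append(label)
        grouped[group][1].append(score)
    return {g: roc_auc(ys, ss) for g, (ys, ss) in grouped.items()}


# Made-up toy data (assumption, NOT from the study):
# (skin-type group, 1 = malignant / 0 = benign, model score)
toy_records = [
    ("FST I-II", 1, 0.9), ("FST I-II", 1, 0.7),
    ("FST I-II", 0, 0.4), ("FST I-II", 0, 0.2),
    ("FST V-VI", 1, 0.6), ("FST V-VI", 1, 0.3),
    ("FST V-VI", 0, 0.5), ("FST V-VI", 0, 0.1),
]

aucs = auc_by_group(toy_records)
gap = aucs["FST I-II"] - aucs["FST V-VI"]  # positive gap = worse on darker skin
print(aucs, gap)
```

Reporting the per-group values alongside the gap, rather than a single pooled AUC, is what makes a disparity of the kind described in the summary visible; a pooled score over a dataset dominated by lighter skin tones could mask it.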