Artificial intelligence diagnostics are less accurate on skin of color than light skin, even when trained on equal numbers of images.
Although artificial intelligence (AI) can be a useful tool for dermatologists, significant challenges exist in creating databases and programming that provide equally precise diagnostic results in skin of color (SOC) compared with lighter-skinned patients.1
In a poster presented at the 17th Annual Skin of Color Society Scientific Symposium,2 Pushkar Aggarwal, MBA, a third-year medical student at the University of Cincinnati College of Medicine in Ohio, showed that a large gap exists in the accuracy with which AI distinguishes between melanoma and basal cell carcinoma (BCC) in patients of color versus light-skinned patients.
Two image recognition models were each trained on 150 images, validated on 38 images, and tested on 30. At each stage, the images were evenly divided between those that showed melanoma and those that showed BCC. One model was trained on light skin and the other on SOC.2
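The balanced split described above can be sketched as a small helper. This is a hypothetical illustration, not the poster's actual pipeline; the function name, seed, and list inputs are assumptions.

```python
import random

def balanced_split(melanoma, bcc, sizes=(150, 38, 30), seed=0):
    """Split two equally sized image lists into train/validation/test sets,
    keeping melanoma and BCC evenly represented at every stage.
    (Hypothetical helper; the poster does not describe its code.)"""
    rng = random.Random(seed)
    rng.shuffle(melanoma)
    rng.shuffle(bcc)
    splits, start = [], 0
    for size in sizes:
        half = size // 2  # half melanoma, half BCC per stage
        splits.append(melanoma[start:start + half] + bcc[start:start + half])
        start += half
    return splits  # [train, validation, test]
```

With at least 109 images per class, this yields stages of 150, 38, and 30 images, each split 50/50 between the two diagnoses.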
The performance of the 2 models was assessed using the area under the receiver operating characteristic curve along with standard diagnostic metrics. Sensitivity was 0.60 for light skin and 0.53 for SOC. Specificity was 0.53 and 0.47, respectively. Positive predictive value was 0.56 and 0.50, and negative predictive value was 0.57 and 0.50, respectively. The F1 scores were 0.58 for light skin and 0.52 for SOC.2
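These metrics all derive from a model's confusion matrix on the test set. The formulas below are the standard definitions; the example counts in the usage note are one confusion matrix consistent with the light-skin figures on a 30-image test set, not counts published in the poster.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard diagnostic metrics from confusion-matrix counts:
    tp/fn = melanomas correctly/incorrectly classified,
    tn/fp = BCCs correctly/incorrectly classified."""
    sensitivity = tp / (tp + fn)  # share of melanomas correctly flagged
    specificity = tn / (tn + fp)  # share of BCCs correctly ruled out
    ppv = tp / (tp + fp)          # positive predictive value
    npv = tn / (tn + fn)          # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean
    return {k: round(v, 2) for k, v in [
        ("sensitivity", sensitivity), ("specificity", specificity),
        ("ppv", ppv), ("npv", npv), ("f1", f1)]}
```

For example, `diagnostic_metrics(tp=9, fp=7, fn=6, tn=8)` on a 30-image test set with 15 melanomas reproduces the light-skin results above: sensitivity 0.60, specificity 0.53, PPV 0.56, NPV 0.57, F1 0.58.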
Aggarwal said that even when the same number of images was used in training, validation, and testing, the SOC model still produced inferior results compared with the model trained on lighter skin, noting that current AI models will require significantly more images of SOC. “Many skin lesions, especially those that are hyperpigmented, are more easily distinguished from the surrounding skin in lighter skin color than in skin of color,” he said.
Therefore, several different techniques need to be employed to increase the number of images of darker skin available for the training of AI models. “This can be performed by gathering clinical images of cutaneous manifestations in skin of color, which has been employed by organizations such as VisualDx. However, this still leaves a significant gap in dermatological images in skin of color,” he told Dermatology Times®.
According to Aggarwal, AI can compensate for the lack of diversity in available images in 2 ways. “In one technique, images of a dermatological disease in light skin color and in skin of color are provided to an [AI] image-to-image translation model, which identifies the skin lesion and the surrounding non–skin lesion [such as normal skin, hair, and freckles]. Next, the model converts the images of the dermatological disease in lighter skin color to the same dermatological disease in skin of color. This will result in the generation of novel dermatological images that have ‘darker’ skin color,” he said.
These images would also be adjusted to reflect the features of skin lesions as they appear on darker skin—and doing so would make all currently available dermatological images available to AI models being trained on SOC. Another method would be to use AI to create novel images of cutaneous manifestations in skin of color from scratch.
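As a toy stand-in for the translation idea Aggarwal describes, the sketch below darkens only the pixels outside a segmented lesion, leaving the lesion itself untouched. Real image-to-image translation models (typically GAN-based) learn far richer transformations, including how lesion features themselves appear on darker skin; the function name and `tone_factor` parameter here are purely illustrative assumptions.

```python
import numpy as np

def darken_surrounding_skin(image, lesion_mask, tone_factor=0.6):
    """Toy illustration of skin-tone conversion: scale the brightness of
    non-lesion pixels in an RGB image while preserving the lesion region.
    `image` is a (H, W, 3) uint8 array; `lesion_mask` is a (H, W) boolean
    array that is True inside the segmented lesion."""
    out = image.astype(np.float32)
    out[~lesion_mask] *= tone_factor  # darken skin outside the lesion only
    return np.clip(out, 0, 255).astype(np.uint8)
```

This deliberately omits the adjustment of lesion appearance itself, which is the harder part a learned translation model would handle.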
Although more work is needed to optimize accuracy when using AI as a diagnostic tool on SOC, it is still a useful aid for both primary care physicians and dermatologists, Aggarwal said. “For example, in primary care offices, the AI model can serve as a second pair of eyes that can help the clinician decide whether referral to [a] dermatologist or observation is needed,” he said. “For dermatologists, the AI model can provide a broad differential diagnosis, and when this is combined with the patient’s clinical history attained by the dermatologist, the likelihood of obtaining the correct diagnosis will be higher.”