Machine learning, and specifically deep learning, has dramatically improved the state of the art in many areas of research, including computer vision, speech recognition, and natural language processing.
These advances are now being applied in the medical field, where deep neural networks are used for a wide range of purposes, including tumor segmentation, diabetic retinopathy detection, and cancer classification from histological tissue images.
The classification of skin lesions into different cancer sub-types by deep neural networks has also progressed significantly in recent years. A breakthrough came when a convolutional neural network (CNN) was trained on a dataset of 129,450 images of skin lesions covering different diseases. The network achieved the same accuracy as expert dermatologists on two binary classification tasks: keratinocyte carcinomas versus benign seborrheic keratoses, and malignant melanomas versus benign nevi. Since then, many other deep learning models have been proposed for the same purpose.
Despite the rapid acceleration of deep learning research in healthcare, with potential applications being demonstrated across various domains, there are currently limited examples of these techniques being successfully deployed into clinical practice.
The challenges and limitations in deploying such systems in real-world environments stem from several factors: ethical and regulatory aspects, data availability and variability, and technological issues intrinsic to machine learning solutions. Among the latter, essential factors include dataset shift, fitting of confounders, interpretability or explainability of decisions, generalization to different populations, and the development of reliable measures of model confidence.
Most deep learning-based solutions produce deterministic outputs and do not quantify or control the uncertainty in the prediction, which may lead to a lack of confidence in the automated diagnosis and errors in the interpretation of results.
Performance is usually reported in terms of global metrics related to the model's discriminative power, such as sensitivity, specificity, or the area under the ROC curve (AUC). However, it is also crucial to know how confident the model is about each individual prediction, particularly in clinical practice, where diagnostic errors carry serious consequences and there are always difficult cases that may require closer examination or a second expert opinion.
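The gap between global metrics and per-case confidence can be illustrated with a small sketch. It assumes a binary classifier that outputs class probabilities and uses the entropy of those probabilities as a simple stand-in for uncertainty. The function names, the toy probabilities, and the 0.5-nat referral threshold are all hypothetical choices made for illustration; they are not taken from the iToBoS project, and softmax entropy is only one of many possible uncertainty measures.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sensitivity_specificity(y_true, y_pred):
    """Global metrics for a binary task where 1 = malignant and 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def triage(probs_per_case, max_entropy=0.5):
    """Split case indices into automatically accepted and referred cases.

    max_entropy is an assumed, illustrative threshold, not a clinical value.
    """
    accepted, referred = [], []
    for i, probs in enumerate(probs_per_case):
        if predictive_entropy(probs) > max_entropy:
            referred.append(i)
        else:
            accepted.append(i)
    return accepted, referred


# Toy, made-up predictions as [P(benign), P(malignant)] for four lesions.
probs = [[0.98, 0.02], [0.55, 0.45], [0.10, 0.90], [0.48, 0.52]]
labels = [0, 0, 1, 1]
preds = [max(range(len(p)), key=p.__getitem__) for p in probs]

print(sensitivity_specificity(labels, preds))
print(triage(probs))
```

On this toy data every prediction is correct, so sensitivity and specificity both look perfect, yet two of the four cases are close to a coin flip and would be referred for a second opinion under the assumed threshold.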
Obtaining reliable uncertainty estimates of neural network predictions is a long-standing challenge. In the EU project iToBoS, funded with 12 million euros, an international consortium of 19 academic institutions and technical partners will study uncertainty estimation techniques and metrics for deep neural networks and apply them to the problem of skin lesion classification.
The iToBoS project will produce a machine learning tool integrated into a "cognitive assistant" that clinicians will use in clinical studies with patients at three university clinical centers in Europe (Hospital Clinic of Barcelona, Spain; University of Trieste, Italy) and Australia (University of Queensland).
Marc Combalia (1,4,5); Josep Malvehy (1,2,3,4,5)
1. FCRB and IDIBAPS, Barcelona, Spain. 2. Hospital Clinic of Barcelona. 3. University of Barcelona. 4. Athena Tech. 5. iToBoS.