One of the objectives of the iToBoS project is to acquire a large dataset with different types of lesions from patients with diverse skin types, different phenotypes, and different degrees of UV damage.
The quality and quantity of the ground-truth data are critical for all the AI solutions being developed in this project. The International Skin Imaging Collaboration (ISIC) dataset is a blueprint for the advancement of the iToBoS project.
ISIC is an academia-industry partnership designed to facilitate the application of digital skin imaging to help reduce melanoma mortality. ISIC is creating resources for the dermatology and computer science communities, including a large and expanding open-source, public-access archive of skin images. The ISIC archive serves as a public resource of images for teaching, research, and the development and testing of diagnostic artificial intelligence algorithms. The ISIC datasets have become a leading repository for researchers in machine learning for medical image analysis, especially in the field of skin cancer detection and malignancy assessment. They contain tens of thousands of dermoscopic photographs together with gold-standard lesion diagnosis metadata. The associated yearly challenges have resulted in major contributions to the field, with papers reporting performance measures well in excess of those achieved by human experts.
Random images from the ISIC dataset with their ground-truth labels
ISIC Dataset Class Distribution
The ISIC community organizes a yearly skin lesion classification challenge and publishes a dataset to attract wider participation from researchers, to improve computer-aided diagnosis (CAD) algorithms, and to spread awareness of the growing problem that skin cancer represents. The number of images has increased substantially every year since the challenge was introduced, and the challenge has used a different classification system each year since 2016. In total, the ISIC 2016–2020 training sets contain 16 classes and ~71,000 images. The following table shows the detailed number of images per lesion class within the ISIC datasets (2016–2020).
Source [2]
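A per-class, per-year summary like the one described above can be tallied with a few lines of Python. The following is a minimal sketch, not the procedure used to build the published table: the `(year, label)` record layout, the function names, and the toy records are all assumptions made for illustration, and the counts are invented rather than taken from ISIC.

```python
from collections import Counter, defaultdict


def class_distribution(records):
    """Count images per diagnosis label for each challenge year.

    `records` is an iterable of (year, label) pairs. This layout is an
    assumption for illustration, not the ISIC metadata schema.
    """
    per_year = defaultdict(Counter)
    for year, label in records:
        per_year[year][label] += 1
    return per_year


def overall_counts(per_year):
    """Merge the per-year counters into one count per label."""
    overall = Counter()
    for counts in per_year.values():
        overall.update(counts)
    return overall


# Toy records, invented for illustration only (not real ISIC counts).
toy_records = [
    (2016, "benign"), (2016, "malignant"), (2016, "benign"),
    (2017, "melanoma"), (2017, "nevus"), (2017, "nevus"),
]

per_year = class_distribution(toy_records)
overall = overall_counts(per_year)

for year in sorted(per_year):
    print(year, dict(per_year[year]))
print("classes:", len(overall), "images:", sum(overall.values()))
```

Because the classification system changes between years, the number of distinct labels across all years can exceed the number used in any single year, which is why the sketch keeps a separate counter per year before merging.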
References:
- Cassidy, Bill, et al. "Analysis of the ISIC image datasets: usage, benchmarks and recommendations." Medical Image Analysis 75 (2022): 102305.
- Rotemberg, Veronica, et al. "A patient-centric dataset of images and metadata for identifying melanomas using clinical context." Scientific Data 8.1 (2021): 1-8.
- https://towardsdatascience.com/deep-learning-for-diagnosis-of-skin-images-with-fastai-792160ab5495
- www.isic-archive.com