Improving AI models with XAI

In recent years, Deep Neural Networks (DNNs) have achieved exceptional results across multiple domains, including general image classification, playing Atari games, and even detecting skin cancer through lesion classification. These networks are commonly trained on large datasets such as ImageNet (millions of images across many classes) or ISIC 2019 (images of skin lesions).

Unfortunately, these datasets often contain unwanted artifacts (e.g., copyright tags) that remained undetected during dataset creation. If a given artifact occurs in only one class, it is called a Clever Hans (CH) artifact and can cause the model to learn a correlation between the artifact and the class label. This leads to seemingly good model performance in the test lab, but the model makes the right predictions for the wrong reasons. For example, about one-fifth of the images of the class “horse” in the Pascal VOC 2007 image categorization dataset (Everingham et al., 2007) contain a copyright tag at the bottom of the image, because they come from the same photographer (Lapuschkin et al., 2016; Lapuschkin et al., 2019). DNNs learn a relationship between the presence of the copyright tag and the class “horse”, which is, in fact, only a CH artifact. Another example of a CH artifact was detected in the ISIC 2019 dataset: the largest class, melanocytic nevus, contains several images with colorful band-aids next to the lesion. Because this artifact, which is unrelated to the classification task, occurs in only one class, DNNs may learn a shortcut based on that spurious correlation, making the band-aid a CH artifact.
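The shortcut-learning mechanism behind a CH artifact can be demonstrated on synthetic data. The sketch below is a toy illustration (not taken from any of the cited works): it builds fake "images" in which one pixel plays the role of a copyright tag that appears only in one class, trains a plain logistic regression on them, and compares accuracy on a test set that still contains the artifact against one where it has been removed. All names, sizes, and magnitudes here are arbitrary choices for the demonstration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n, with_tag):
    """Toy 8x8 'images' flattened to 64 features.

    The genuine class signal is a weak mean shift in the first four
    pixels; the 'copyright tag' is a strong, fixed value in the last
    pixel that appears only in class 1 (a Clever Hans artifact).
    """
    X = rng.normal(size=(n, 64))
    y = rng.integers(0, 2, size=n)
    X[y == 1, :4] += 0.3          # weak genuine class feature
    if with_tag:
        X[y == 1, 63] = 5.0       # CH artifact: tag pixel only in class 1
    return X, y

# Train on data that contains the artifact, as in the biased dataset.
X_train, y_train = make_data(2000, with_tag=True)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Evaluate on a test set with the artifact (lab conditions) and on a
# cleaned test set where the tag has been removed (deployment).
X_biased, y_biased = make_data(1000, with_tag=True)
X_clean, y_clean = make_data(1000, with_tag=False)

print("with artifact:   ", clf.score(X_biased, y_biased))  # high: shortcut works
print("artifact removed:", clf.score(X_clean, y_clean))    # drops sharply
```

The model looks excellent as long as the tag is present, because the tag alone almost perfectly predicts the class; once the tag is removed, accuracy collapses toward what the weak genuine signal supports. This is exactly the "right predictions for the wrong reasons" behavior described above.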