Convolutional Neural Networks (CNNs) for image processing

Convolutional neural networks (CNNs) are a type of neural network widely used for machine learning on images.

Within the architecture, convolutional filters (kernels) act as feature extractors: each kernel slides across the image and computes a response at every position, producing a feature map. The underlying idea is that if a feature is informative in one region, it is worth extracting from all regions, so the same kernel is applied everywhere.
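The sliding-kernel idea can be sketched in a few lines of plain Python. This is only a minimal illustration, assuming a single-channel image, stride 1 and no padding; the function name `conv2d_valid`, the toy image and the hand-picked edge kernel are all hypothetical choices made for this example, whereas a real CNN learns its kernel weights during training.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map.

    As in most deep-learning libraries, this is technically a cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the image patch currently under the kernel.
            response = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            row.append(response)
        feature_map.append(row)
    return feature_map


# Toy 4x5 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]

# Hand-picked vertical-edge detector (illustrative, not learned).
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d_valid(image, vertical_edge)
print(feature_map)  # [[3, 3, 0], [3, 3, 0]]
```

The response is strongest where the kernel straddles the dark-to-bright transition and drops to zero over the uniform bright area, which is the sense in which the feature map records where a feature occurs.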

Convolutions are useful for extracting informative features from images, from simple ones such as edges to more complex patterns. Information is also extracted from images with other filtering operations, such as blurring, sharpening and pooling.
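Of the operations mentioned, pooling is perhaps the easiest to show in isolation. The sketch below assumes non-overlapping max pooling with a square window, which is one common variant; the function name `max_pool2d` and the sample values are hypothetical and chosen only for illustration.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling: keep the largest value in each size x size block.

    Rows or columns that do not fill a complete block are dropped.
    """
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[i * size + di][j * size + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

Each 2x2 block is reduced to its strongest response, so the map shrinks to a quarter of its size while keeping a coarse record of where features were detected.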

Fig. 1  An example of a CNN architecture: EfficientNet [1].

The architecture of a CNN is inspired by the way the human brain processes images, which is one reason CNNs are so widely used in image processing tasks.

Human vision proceeds hierarchically from simple to highly complex: it first detects edges and colours and then builds up to object detection and recognition, or other tasks. Individual neurons in the cerebral cortex respond to stimuli only within a limited region of the visual field, called the receptive field. Each neuron contributes its own response, and the overlapping receptive fields together cover the whole visual field.

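Unlike fully connected layers, in which each neuron is connected to all neurons in the next layer, the neurons of a convolutional layer are connected only to a small local region of the previous layer, and the same kernel weights are shared across all positions. Many CNNs nevertheless end with one or more fully connected layers that produce the final prediction. Each convolutional layer combines the features extracted by the previous layer. Deeper in the network, the receptive field of each neuron becomes larger and the extracted features become more complex.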
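The growth of the receptive field with depth can be worked out with a short recurrence. The sketch below assumes a plain stack of convolution or pooling layers, each described only by a kernel size and a stride (no dilation); the function name `receptive_fields` and the example layer stack are hypothetical and not taken from any particular architecture.

```python
def receptive_fields(layers):
    """Return the receptive field size, in input pixels, after each layer.

    `layers` is a list of (kernel_size, stride) pairs, applied in order.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance in input pixels between neighbouring outputs
    sizes = []
    for kernel_size, stride in layers:
        # Each extra kernel tap widens the field by the current jump.
        rf += (kernel_size - 1) * jump
        jump *= stride
        sizes.append(rf)
    return sizes


# Illustrative stack: two 3x3 convolutions, a 2x2 pooling with stride 2,
# then another 3x3 convolution.
layers = [(3, 1), (3, 1), (2, 2), (3, 1)]
sizes = receptive_fields(layers)
print(sizes)  # [3, 5, 6, 10]
```

After the strided pooling layer, every further 3x3 convolution widens the receptive field by four input pixels instead of two, which is why deeper neurons can respond to larger and more complex structures.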

Deep CNNs have a very large number of trainable parameters, so they carry a high risk of overfitting. This is why, in addition to regularisation techniques such as dropout and weight decay, and architectural choices such as skip connections, solving tasks with CNNs typically requires a very large amount of training data.
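Two of the regularisation techniques named above, dropout and weight decay, are simple enough to sketch directly. The code assumes the common "inverted" form of dropout and plain SGD with an L2 penalty; the function names, the learning rate and the decay value are hypothetical and exaggerated for readability, not recommended settings.

```python
import random


def dropout(activations, p=0.5, training=True, rng=random):
    """Inverted dropout: zero each activation with probability p during training.

    Surviving activations are scaled by 1 / (1 - p) so the expected value is
    unchanged; at inference time the input is returned as it is.
    Assumes 0 <= p < 1.
    """
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def sgd_step_with_weight_decay(weights, grads, lr=0.1, weight_decay=0.5):
    """One SGD update with L2 weight decay: w <- w - lr * (grad + weight_decay * w)."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


rng = random.Random(0)  # fixed seed so the run is repeatable
dropped = dropout([1.0, 2.0, 3.0, 4.0], p=0.5, rng=rng)

# With zero gradients, the update only shrinks each weight towards zero.
decayed = sgd_step_with_weight_decay([1.0, -2.0], [0.0, 0.0])
```

Dropout prevents the network from relying on any single activation, while weight decay keeps the weights small; both discourage the model from memorising the training images.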

Author: Gabriela Ghimpeteanu, Coronis Computing.

Bibliography

[1] Wang, J.; Liu, Q.; Xie, H.; Yang, Z.; Zhou, H. Boosted EfficientNet: Detection of Lymph Node Metastases in Breast Cancer Using Convolutional Neural Networks. Cancers 2021, 13, 661.