Convolutional Neural Networks (CNNs) are a class of deep learning models primarily used for image and video processing (computer vision) tasks. They excel at learning spatial hierarchies
and capturing local patterns. CNNs consist of convolutional layers that apply filters to input images, enabling feature extraction. These filters detect patterns such as edges, textures,
and shapes. Following convolutional layers, pooling layers are often used to down sample the data, reducing dimensionality while preserving important features. Finally, fully connected
layers are employed for classification tasks, where the extracted features are combined to make predictions. CNNs have achieved breakthrough performance in various computer vision tasks,
including image classification, object detection, and image segmentation.