Convolutional Neural Networks (ConvNets) have recently become all the rage in the research and development community as a way to tackle a wide variety of applications.
From image processing to pattern recognition and classification, ConvNets have been applied in a variety of domains and have produced some excellent results. A ConvNet is essentially a modified multi-layer perceptron, designed to require far less processing than comparable networks. As a deep learning architecture, it contains a large number of hidden layers between the input and output layers. These hidden layers are built from combinations of a few popular layer types: Convolutional, Rectified Linear Unit (ReLU), Pooling, and Fully Connected. Without going too deep into the mathematics, these layers are explained below in layman's terms, for simplicity and better understanding.

Convolutional layers are the core layers, and they perform most of the computational work. They slide a variety of small masks (filters) over the image, where each mask responds to a specific feature, imitating small patterns in the image and mapping to them. Each filter produces its own convolved image, so a single input image becomes 'n' filtered images stacked together. Stacking these filtered images gives us the output of the convolutional layer.

Next comes the ReLU layer.
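The convolution step described above — sliding a small filter over the image and stacking the resulting maps — can be sketched in a few lines of NumPy. This is only an illustration: the filter values and image are made up, and real ConvNets learn their filter values during training.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image and record its response at every
    position (stride 1, no padding -- a 'valid' convolution)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A tiny 5x5 "image" and two made-up 3x3 filters:
image = np.arange(25, dtype=float).reshape(5, 5)
vertical_edge = np.array([[1., 0., -1.]] * 3)  # responds to vertical edges
horizontal_edge = vertical_edge.T              # responds to horizontal edges

# Each filter yields its own convolved image ("feature map");
# stacking the n maps gives the output of the convolutional layer.
feature_maps = np.stack([convolve2d(image, k)
                         for k in (vertical_edge, horizontal_edge)])
print(feature_maps.shape)  # (2, 3, 3): two 3x3 feature maps
```

With more filters, the stack simply grows deeper — one feature map per filter.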
It introduces non-linearity into the ConvNet, since most of the real-life data our network will train on is non-linear. The ReLU operation is performed at the pixel level, transforming each negative value to zero. Other non-linear functions, such as the hyperbolic tangent (tanh) or the sigmoid, could be used instead, but ReLU has been found to produce better results in most cases.

Next comes the pooling layer. It shrinks the image down to a more essential form. Max-pooling is one of the most popular variants: it takes the maximum, i.e. the most prominent feature, from each block of neurons in the previous layer. Along the same lines, average pooling takes the average value of each cluster of neurons in the previous layer.

These three layers are cascaded one after another as many times as required, which is also known as deep stacking. A fully connected layer then connects each and every neuron of one layer to every neuron in the next layer, much like a multi-layer perceptron. The deep-stacked layers contain high-level features of the input image, and the fully connected layer uses these features to classify the image into the different classes of our training dataset.

The output probabilities of this final layer sum to 1. This is achieved by the softmax activation function, which takes a vector of real-valued scores and compresses it into a vector of values between zero (the lowest) and one (the highest) such that these values add up to one [3].
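The pixel-level ReLU operation described earlier is simply max(0, x) applied to every value of a feature map; a minimal sketch with NumPy (the feature-map values are made up):

```python
import numpy as np

# ReLU at the "pixel" level: every negative value becomes zero,
# positive values pass through unchanged.
feature_map = np.array([[ 3., -1.],
                        [-2.,  5.]])
relu = np.maximum(feature_map, 0)  # relu is [[3., 0.], [0., 5.]]
```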
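The max-pooling and average-pooling steps described earlier can be sketched the same way. This is an illustrative toy implementation, not a library API; it shrinks a feature map by summarizing each 2x2 block:

```python
import numpy as np

def pool2d(feature_map, size=2, op=np.max):
    """Summarize each size x size block with `op`:
    np.max gives max-pooling, np.mean gives average pooling."""
    h, w = feature_map.shape
    out = np.zeros((h // size, w // size))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            block = feature_map[i*size:(i+1)*size, j*size:(j+1)*size]
            out[i, j] = op(block)
    return out

fm = np.array([[1., 3., 2., 0.],
               [4., 6., 1., 1.],
               [0., 2., 8., 5.],
               [1., 1., 7., 9.]])
max_pooled = pool2d(fm, op=np.max)    # each block keeps its largest value
avg_pooled = pool2d(fm, op=np.mean)   # each block keeps its average value
```

Either way, a 4x4 map shrinks to 2x2 — the "more essential form" the text refers to.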
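Finally, the softmax function that produces the output probabilities can be sketched directly from its definition (the class scores here are made-up inputs standing in for the fully connected layer's output):

```python
import numpy as np

def softmax(scores):
    """Compress real-valued scores into values in (0, 1) that sum to 1.
    Subtracting the max first keeps the exponentials numerically stable."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()

scores = np.array([2.0, 1.0, 0.1])  # raw class scores
probs = softmax(scores)
print(probs.sum())  # 1.0
```

The largest score always maps to the largest probability, so the predicted class is unchanged — softmax only rescales the scores into a proper probability distribution.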