Samuel Kong, a Sondrel Engineering Consultant and alumnus of Imperial College, attended this year's British Machine Vision Conference, held at his old university. For anyone involved in designing silicon for applications in this field, Samuel's blog series on the emerging trends and hot topics discussed and presented at this event will be of interest.
The majority of discussions there related to projects involving Image Convolution, Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN). In this first blog, Samuel gives a summary explanation of each of these areas. Subsequent blogs will look at some specific projects presented over the course of the conference, showing current innovation and development in these fields.
IMAGE CONVOLUTION
Image Convolution is a huge part of Signal Processing and Image Processing. In image processing, it is the process of transforming an image by applying a kernel over each pixel and its local neighbours across the entire image.
The kernel is a matrix of values whose size and values determine the transformation effect of the convolution process. The Convolution Process places the Kernel Matrix over each pixel of the image (ensuring that the full Kernel is within the image), multiplies each value of the Kernel with the corresponding pixel it is over, sums the resulting multiplied values and returns the resulting value as the new value of the centre pixel. This process is repeated across the entire image.
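The steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a production implementation: the function name `convolve2d`, the sample image and the all-ones kernel are assumptions made for the example. Because the kernel must always sit fully inside the image, the output is smaller than the input.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image`, visiting only positions where it fits fully."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    image_h, image_w = len(image), len(image[0])
    output = []
    for top in range(image_h - kernel_h + 1):
        row = []
        for left in range(image_w - kernel_w + 1):
            total = 0
            # Multiply each kernel value by the pixel underneath it and sum.
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += kernel[i][j] * image[top + i][left + j]
            row.append(total)  # new value for the pixel under the kernel centre
        output.append(row)
    return output


# Hypothetical 4x4 greyscale image and a 3x3 kernel of ones (sums each neighbourhood).
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

result = convolve2d(image, kernel)
print(result)  # a 2x2 result, one value per position where the kernel fits
```

Strictly speaking, mathematical convolution flips the kernel before sliding it. The process described here does not, which gives the same result for symmetric kernels such as the ones discussed in this blog.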
Figure 3 gives a visual representation of Convolution.
Examples of common Kernels are:
The Identity Kernel, when applied to an image through convolution, leaves the image unchanged: every pixel retains its original value.
The Sharpen Kernel, when applied to an image through convolution, sharpens the resulting image. The precise values can be customised for varying levels of sharpness.
The Gaussian Blur Kernel, when applied to an image through convolution, applies a Gaussian blur to the resulting image.
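To make these three kernels concrete, the sketch below applies each one to a small test image. The 3x3 values are widely quoted textbook choices and are an assumption here, so the values in the original figures may differ. The helper `convolve2d` and the test image are likewise hypothetical.

```python
def convolve2d(image, kernel):
    """Apply `kernel` at every position where it fits fully inside `image`."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[i][j] * image[top + i][left + j]
                for i in range(kh) for j in range(kw))
            for left in range(len(image[0]) - kw + 1)
        ]
        for top in range(len(image) - kh + 1)
    ]


# Commonly used 3x3 kernel values (assumed, not taken from the original figures).
IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]
GAUSSIAN_BLUR = [[1 / 16, 2 / 16, 1 / 16],
                 [2 / 16, 4 / 16, 2 / 16],
                 [1 / 16, 2 / 16, 1 / 16]]

# Flat grey image with a single bright pixel in the centre.
image = [
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 50, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
]

identity_out = convolve2d(image, IDENTITY)    # interior pixels unchanged
sharpen_out = convolve2d(image, SHARPEN)      # bright pixel is exaggerated
blur_out = convolve2d(image, GAUSSIAN_BLUR)   # bright pixel is spread out

print(identity_out[1][1], sharpen_out[1][1], blur_out[1][1])
```

The bright centre pixel keeps its value under the identity kernel, is pushed much higher by the sharpen kernel, and is pulled towards its neighbours by the blur kernel.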
Just as the values of the Kernel can be varied for different levels of effect, the size of the Kernel can also be altered to shape the effect of the convolution. Increasing the size of the Kernel Matrix widens the neighbourhood that influences each pixel's resulting value, as pixels from further away are pulled into the calculation. Many more Kernels are used in image processing, for effects such as edge detection, embossing and rotation. Figure 4 below shows an Edge Detection Kernel being used to highlight all the edges within a photo.
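As a rough illustration of edge detection, the sketch below runs a Laplacian-style 3x3 kernel over a tiny image containing one vertical edge. The kernel values, the helper `convolve2d` and the image are assumptions chosen for the example, and are not necessarily those used to produce Figure 4.

```python
def convolve2d(image, kernel):
    """Apply `kernel` at every position where it fits fully inside `image`."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[i][j] * image[top + i][left + j]
                for i in range(kh) for j in range(kw))
            for left in range(len(image[0]) - kw + 1)
        ]
        for top in range(len(image) - kh + 1)
    ]


# Laplacian-style kernel: the weights sum to zero, so flat regions give 0.
EDGE_DETECT = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]

# Dark left half, bright right half: a single vertical edge.
image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]

edges = convolve2d(image, EDGE_DETECT)
for row in edges:
    print(row)
```

The output is zero wherever the neighbourhood is uniform and non-zero only in the columns either side of the edge, which is why the edges stand out once the result is displayed (often after taking the absolute value).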
ARTIFICIAL NEURAL NETWORKS
An Artificial Neural Network (ANN) is a type of Machine Learning model that takes inspiration from how a human brain works. Just as a brain is made up of many interconnected neurons transferring pulses of information between themselves, an ANN is made up of layers of neurons that transfer data between one another, from the input to the output. Figure 5 below shows a visual representation of a Neuron within an ANN and how it compares with a human neuron.
Mathematically, the Neuron within the ANN applies a summation of all the inputs multiplied by their corresponding weights. So, from Figure 5, the equation for the output Y would be:
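With the inputs written as x_1, ..., x_n and their weights as w_1, ..., w_n (these symbol names are assumed here, and Figure 5 may label them differently), the weighted sum described above takes the form:

```latex
Y = \sum_{i=1}^{n} w_i \, x_i
```

Practical networks usually also add a bias term and pass the sum through a non-linear activation function. Both are left out here to match the description above.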
The output of that neuron would then drive the input of other neurons in the next layer. By having these multiple layers of Neurons that are interconnected, a Deep Neural Network (DNN) is formed. These Deep Neural Networks can model very complex functions for many different problems. Figure 6 below shows what a Deep Neural Network looks like:
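A minimal sketch of how data flows through such layers is shown below. The function names, the network shape and the weight values are all hypothetical. The ReLU activation between layers is also an assumption that is not described above, added because stacked weighted sums without a non-linearity would collapse into a single weighted sum.

```python
def neuron_output(inputs, weights):
    """Weighted sum of the inputs, as in the single-neuron equation."""
    return sum(w * x for w, x in zip(weights, inputs))


def relu(value):
    """Assumed activation: pass positive values through, clamp negatives to 0."""
    return max(0.0, value)


def layer_output(inputs, layer_weights):
    """Each neuron in the layer sees every output of the previous layer."""
    return [relu(neuron_output(inputs, weights)) for weights in layer_weights]


def forward(inputs, network):
    """Feed the inputs through each layer in turn, from input to output."""
    activations = inputs
    for layer_weights in network:
        activations = layer_output(activations, layer_weights)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
network = [
    [[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]],  # hidden layer weights
    [[1.0, 1.0, 1.0]],                       # output layer weights
]

output = forward([2.0, 1.0], network)
print(output)
```

Training a network means adjusting the numbers in `network` until the outputs match the desired results. That step is not shown here.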
DNNs are very powerful: when trained well enough, they can model complex mathematical functions that solve many different types of problems, such as object classification and linear regression. Figure 7 below, taken from a YouTube video, is an example of how accessible Machine Learning has become and how it can be used to train a computer to play video games that even some humans struggle to complete. The video snippet shows the Neural Network being built up and the input it receives, in the form of the current state of play.
CONVOLUTIONAL NEURAL NETWORKS
Convolutional Neural Networks (CNN) are a type of Deep Neural Network. Their architecture is designed specifically for the field of Image Processing, although they can be adapted for Audio Processing as well. Put simply, a CNN applies convolution to its inputs using a Kernel Matrix that it calibrates through training. For this reason, CNNs are very good at feature matching in images and at object classification. Figure 8 shows the architecture of a CNN. The pink circles represent the Convolution Layers of the CNN, where a Convolution Kernel is applied to the inputs and their local pixel neighbours. Because the convolution process is applied only to local neighbouring pixels, the Neural Network is a lot smaller and each pixel is less susceptible to noise from pixels further away. The other benefit of the CNN is that, as the same convolution kernel is applied to every input, the weights can be shared between neurons, further reducing the complexity of the neural network. Just as an ANN is trained to calibrate the individual weights on the inputs of its neurons, training a CNN calibrates its weights, which is equivalent to learning a Convolution Kernel that fits the task.
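The two savings described above, local connections and shared weights, can be put into numbers with a back-of-the-envelope count. The 28x28 image size, the 3x3 kernel and the function names below are assumptions chosen purely for illustration, and bias terms are ignored.

```python
def fully_connected_weights(num_inputs, num_outputs):
    """Every output neuron has its own weight for every input pixel."""
    return num_inputs * num_outputs


def local_unshared_weights(num_outputs, kernel_size):
    """Each output neuron sees only a kernel-sized patch, with its own weights."""
    return num_outputs * kernel_size * kernel_size


def convolution_weights(kernel_size):
    """One kernel, reused at every position in the image."""
    return kernel_size * kernel_size


image_side, kernel_size = 28, 3           # hypothetical sizes
output_side = image_side - kernel_size + 1  # kernel must fit inside the image
num_inputs = image_side * image_side
num_outputs = output_side * output_side

dense = fully_connected_weights(num_inputs, num_outputs)
local = local_unshared_weights(num_outputs, kernel_size)
shared = convolution_weights(kernel_size)

print(dense, local, shared)
```

Restricting each neuron to its local neighbourhood removes most of the weights, and sharing one kernel across all positions reduces the count to the size of the kernel itself.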
The result of a CNN could be a set of votes towards what the network believes the image should be classified as. An example of a CNN being used for object classification is shown in Figure 9 below. The example is made up of multiple Convolution Kernels and multiple Convolutional Layers which, by the end, generate a set of voting weights that determine what the object should be classified as. The example is a simplified version of the kind of projects presented at the conference.
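One common way to turn a network's raw output scores into such votes is the softmax function, sketched below. Whether the network in Figure 9 uses softmax is an assumption, and the class labels and scores are made up for the example.

```python
import math


def softmax(scores):
    """Convert raw scores into positive voting weights that sum to 1."""
    peak = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical class labels and raw scores from the final layer.
labels = ["cat", "dog", "car"]
scores = [2.0, 1.0, 0.1]

votes = softmax(scores)
predicted = labels[votes.index(max(votes))]

for label, vote in zip(labels, votes):
    print(f"{label}: {vote:.3f}")
print("classified as:", predicted)
```

The class with the largest vote is taken as the classification, while the remaining votes indicate how confident the network is in the alternatives.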
For more information on Convolutional Neural Networks, refer to this YouTube video: https://www.youtube.com/watch?v=FmpDIaiMIeA.
The areas of Image Convolution, Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN) explained above are being explored in many cutting-edge research projects and product innovations. Samuel will give an overview of projects presented at the event in his next blogs. They include:
- Probabilistic and Deep Models for 3D Reconstruction
- Deep Face Hallucination for Unviewed Sketches
- Character Identification in TV series without a Script
- You Said That?
- Lip Reading in Profile
- Exploring the Structure of a Real-Time, Arbitrary Neural Artistic Stylization Network
- PixColour: Pixel Recursive Colorization
- HoloLens: Computer Vision Meets Mixed Reality
 Apple Inc., “Performing Convolution Operations,” 13 September 2016. [Online]. Available: https://developer.apple.com/library/content/documentation/Performance/Conceptual/vImage/ConvolutionOperations/ConvolutionOperations.html. [Accessed 11 September 2017].
 GIMP, “Convolution Matrix,” GIMP, [Online]. Available: https://docs.gimp.org/en/plug-in-convmatrix.html. [Accessed 11 September 2017].
 S. Petridis, “Lecture 7-8: Artificial Neural Networks,” London, 2014.
 OpenNN, [Online]. Available: http://www.opennn.net/. [Accessed 12 September 2017].
 SethBling, “MarI/O - Machine Learning for Video Games,” 13 June 2015. [Online]. Available:
[Accessed 12 September 2017].
 F. V. Veen, “Neural Network Zoo,” The Asimov Institute, 14 September 2016. [Online]. Available: http://www.asimovinstitute.org/neural-network-zoo/. [Accessed 11 September 2017].
 A. Karpathy, “CS231n Convolutional Neural Networks for Visual Recognition,” Stanford University, [Online]. Available: http://cs231n.github.io/convolutional-networks/. [Accessed 11 September 2017].