What is an Autoencoder?

An autoencoder is a type of artificial neural network used to learn efficient data codings in an unsupervised manner. The aim of an autoencoder is to learn a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore signal “noise”. Along with the reduction side, a reconstructing side is learnt, where the autoencoder tries to generate from the reduced encoding a representation as close as possible to its original input, hence its name. Recently, the autoencoder concept has become more widely used for learning generative models of data. - From Wikipedia.


How does it work?

At first glance, an autoencoder might look like any other neural network, but unlike others it has a bottleneck at the centre. This bottleneck forces the network to learn a compact set of features of the image. An autoencoder performs two tasks: it encodes an image into this compact representation and then decodes it back into an image.

Why use an Autoencoder?

Today, data denoising and dimensionality reduction for data visualization are considered the two main practical applications of autoencoders. With appropriate dimensionality and sparsity constraints, autoencoders can learn data projections that are more interesting than those of PCA or other basic techniques.

Autoencoders are learned automatically from data examples. This means that it is easy to train specialized instances of the algorithm that perform well on a specific type of input, and that doing so requires no new engineering, only the appropriate training data.

However, autoencoders do a poor job of image compression. Because an autoencoder is trained on a given set of data, it achieves reasonable compression results on data similar to its training set but is a poor general-purpose image compressor. Compression techniques like JPEG do vastly better.

What can it be used for?

Autoencoders can be used to remove noise, perform image colourisation, and serve various other purposes. A noisy image is given as input to the autoencoder, and a de-noised image is produced as output. The autoencoder tries to de-noise the image by learning its latent features and using them to reconstruct an image without noise. The reconstruction error can be calculated as a measure of distance between the pixel values of the output image and those of the ground-truth image.
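As an illustration of the reconstruction error, here is a minimal sketch that uses the mean squared error as the distance measure. The function name reconstruction_error and the tiny 2x2 "images" are made up for this example; other distance measures could be used instead.

```python
import numpy as np

def reconstruction_error(output, target):
    """Mean squared error over all pixel values (one possible distance measure)."""
    output = np.asarray(output, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean((output - target) ** 2))

# Hypothetical 2x2 grayscale images with pixel values in [0, 1]
ground_truth = np.array([[0.0, 0.5], [1.0, 0.25]])
reconstructed = np.array([[0.1, 0.5], [0.8, 0.25]])

error = reconstruction_error(reconstructed, ground_truth)
print(error)
```

A perfect reconstruction gives an error of zero, and the error grows as the output drifts away from the ground truth.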


What are the common types of Autoencoders?

1. Vanilla autoencoder
In its simplest form, the autoencoder is a three-layer net, i.e. a neural net with one hidden layer. The input and the target output are the same, and the network learns to reconstruct the input, for example using the Adam optimizer and the mean squared error loss function.
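The sketch below shows one way such a three-layer autoencoder could be trained, written in plain NumPy so that every step is visible. It is only an illustration under several assumptions: the data is a synthetic toy set, the hidden layer uses tanh, and plain gradient descent stands in for the Adam optimizer to keep the code short. The mean squared error loss is the one named above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (an assumption for this sketch): 200 samples with 8 features
# that lie on a 3-dimensional subspace, so a 3-unit bottleneck can fit them.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 8))
X = latent @ mixing

n_in, n_hidden = X.shape[1], 3
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))  # encoder weights
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_in))  # decoder weights
b2 = np.zeros(n_in)

def forward(x):
    code = np.tanh(x @ W1 + b1)  # hidden layer: the encoding
    recon = code @ W2 + b2       # linear output layer: the reconstruction
    return code, recon

def mse(a, b):
    return float(np.mean((a - b) ** 2))

initial_loss = mse(forward(X)[1], X)

learning_rate = 0.02
for _ in range(1000):
    code, recon = forward(X)
    # Backpropagate the mean squared error; the target is the input itself.
    grad_recon = 2.0 * (recon - X) / X.size
    grad_W2 = code.T @ grad_recon
    grad_b2 = grad_recon.sum(axis=0)
    grad_code = grad_recon @ W2.T
    grad_pre = grad_code * (1.0 - code ** 2)  # derivative of tanh
    grad_W1 = X.T @ grad_pre
    grad_b1 = grad_pre.sum(axis=0)
    # Plain gradient descent step (a simpler stand-in for Adam)
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2

final_loss = mse(forward(X)[1], X)
print(initial_loss, final_loss)
```

In a deep learning framework the same model would typically be a couple of dense layers compiled with an optimizer and a loss; the manual gradients here only make the "input equals target" idea explicit.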

2. Multilayer autoencoder
If one hidden layer is not enough, we can extend the autoencoder to more hidden layers. Any of the hidden layers can be picked as the feature representation, but we will make the network symmetrical and use the middle-most layer.
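To make the symmetrical layout concrete, here is a small sketch that mirrors a list of encoder sizes and reads the code from the middle-most layer. The layer sizes (784, 128, 32), the ReLU activations, and the helper names are assumptions chosen for illustration, and the weights are random and untrained, so only the shapes are meaningful.

```python
import numpy as np

def symmetric_layer_sizes(n_input, encoder_sizes):
    """Mirror the encoder sizes, e.g. 784 -> 128 -> 32 -> 128 -> 784."""
    encoder_sizes = list(encoder_sizes)
    return [n_input] + encoder_sizes + encoder_sizes[-2::-1] + [n_input]

def forward(x, weights):
    """Return the activations of every layer, input included."""
    activations = [x]
    for i, W in enumerate(weights):
        z = activations[-1] @ W
        is_last = i == len(weights) - 1
        # ReLU on hidden layers, linear output layer
        activations.append(z if is_last else np.maximum(z, 0.0))
    return activations

rng = np.random.default_rng(0)
sizes = symmetric_layer_sizes(784, [128, 32])
weights = [rng.normal(scale=0.05, size=(a, b))
           for a, b in zip(sizes[:-1], sizes[1:])]

x = rng.random((4, 784))          # a batch of 4 flattened 28x28 images
activations = forward(x, weights)
code = activations[len(sizes) // 2]  # middle-most layer is the representation

print(sizes, code.shape, activations[-1].shape)
```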

3. Convolutional autoencoder
An autoencoder can also be built with convolutional layers instead of fully connected layers. The principle is the same, but the input is an image (a 3D tensor) rather than a flattened 1D vector. The input image is downsampled to give a latent representation of smaller dimensions, which forces the autoencoder to learn a compressed version of the images.
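The downsampling step can be sketched with a hand-written strided convolution. This covers only the encoder half and uses random, untrained kernels; the 28x28 input, the kernel sizes, the stride of 2, and the helper name conv2d are all assumptions for illustration. A real model would use a framework's convolution layers and add a decoder that upsamples back to the input size.

```python
import numpy as np

def conv2d(image, kernels, stride=2):
    """Strided convolution without padding, followed by ReLU.

    image:   (H, W, C_in)
    kernels: (k, k, C_in, C_out)
    returns: (H_out, W_out, C_out)
    """
    h, w, _ = image.shape
    k = kernels.shape[0]
    h_out = (h - k) // stride + 1
    w_out = (w - k) // stride + 1
    out = np.zeros((h_out, w_out, kernels.shape[3]))
    for i in range(h_out):
        for j in range(w_out):
            patch = image[i * stride:i * stride + k,
                          j * stride:j * stride + k, :]
            # Sum over height, width and input channels for each output channel
            out[i, j] = np.tensordot(patch, kernels, axes=([0, 1, 2], [0, 1, 2]))
    return np.maximum(out, 0.0)

rng = np.random.default_rng(0)
image = rng.random((28, 28, 1))                  # one grayscale image
kernels1 = rng.normal(scale=0.1, size=(2, 2, 1, 8))
kernels2 = rng.normal(scale=0.1, size=(2, 2, 8, 4))

features = conv2d(image, kernels1)               # 28x28x1 -> 14x14x8
latent = conv2d(features, kernels2)              # 14x14x8 -> 7x7x4

print(image.shape, features.shape, latent.shape)
```

The latent tensor holds 196 values against 784 in the input, which is the smaller representation the decoder would have to reconstruct from.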

4. Regularized autoencoder
There are other ways to constrain the reconstruction of an autoencoder than imposing a hidden layer of smaller dimension than the input. Rather than limiting the model capacity by keeping the encoder and decoder shallow and the code size small, regularized autoencoders use a loss function that encourages the model to have other properties besides the ability to copy its input to its output.

In practice, we usually find two types of regularized autoencoder: the sparse autoencoder and the denoising autoencoder.

Sparse autoencoder: Sparse autoencoders are typically used to learn features for another task such as classification. An autoencoder that has been regularized to be sparse must respond to unique statistical features of the dataset it has been trained on, rather than simply acting as an identity function.
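One common way to encourage sparsity is to add a penalty on the hidden activations to the reconstruction loss. The sketch below assumes an L1 penalty (the mean absolute activation) with a hypothetical weight sparsity_weight; the text above does not prescribe a particular penalty, and other choices such as a KL-divergence term are also used.

```python
import numpy as np

def sparse_autoencoder_loss(x, recon, code, sparsity_weight=0.1):
    """Reconstruction error plus an L1 penalty on the hidden code."""
    reconstruction = np.mean((recon - x) ** 2)
    sparsity_penalty = np.mean(np.abs(code))
    return float(reconstruction + sparsity_weight * sparsity_penalty)

# Made-up values: the same reconstruction quality with two different codes
x = np.array([[1.0, 0.0, 0.5, 0.5]])
recon = np.array([[0.9, 0.1, 0.5, 0.5]])
dense_code = np.array([[0.8, 0.7, 0.9]])   # every unit is active
sparse_code = np.array([[0.0, 0.9, 0.0]])  # only one unit is active

loss_dense = sparse_autoencoder_loss(x, recon, dense_code)
loss_sparse = sparse_autoencoder_loss(x, recon, sparse_code)
print(loss_dense, loss_sparse)
```

With equal reconstruction error, the code with fewer active units receives the lower loss, which is the pressure that pushes training towards sparse representations.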

Denoising autoencoder: Rather than adding a penalty to the loss function, we can obtain an autoencoder that learns something useful by changing the reconstruction error term of the loss function. This can be done by adding some noise to the input image and making the autoencoder learn to remove it. In this way, the encoder will extract the most important features and learn a more robust representation of the data.
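The change to the reconstruction term can be sketched as follows: the model receives a corrupted image, but the error is measured against the clean one. The Gaussian noise, its standard deviation of 0.2, the random stand-in images, and the function names are assumptions for illustration; the model argument is any callable that maps a batch of images to a batch of reconstructions.

```python
import numpy as np

def add_gaussian_noise(images, rng, noise_std=0.2):
    """Corrupt images with Gaussian noise and keep pixel values in [0, 1]."""
    noisy = images + rng.normal(scale=noise_std, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)

def denoising_loss(model, noisy_batch, clean_batch):
    """The input is the noisy batch, but the target is the clean batch."""
    return float(np.mean((model(noisy_batch) - clean_batch) ** 2))

rng = np.random.default_rng(0)
clean = rng.random((16, 28, 28))   # stand-in for a batch of clean images
noisy = add_gaussian_noise(clean, rng)

# A model that just copies its input is penalised, because the copy
# still contains the noise that the clean target does not.
identity_loss = denoising_loss(lambda batch: batch, noisy, clean)
print(identity_loss)
```

Because copying the input no longer minimises the loss, the network cannot settle on the identity function and has to learn structure that separates the image from the noise.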

In brief, each has different properties depending on the imposed constraints: either the reduced dimension of the hidden layers or another kind of penalty.