Dimensionality Reduction to the Max!

Robert MacWha
6 min read · Jan 20, 2021
Photo by Nathan Dumlao on Unsplash

Maybe you’re a data scientist trying to visualize some newfangled, complex dataset. Maybe you’re an artist who wants to try out some new kind of image-warping technique. Maybe you’re an aspiring AI enthusiast who just wants to get into image generation. Well, meet autoencoders: feed-forward neural networks whose primary focus is distilling larger concepts into their primary features.

What is an Autoencoder?

An autoencoder is an hourglass-shaped feed-forward neural network where the input and output are the same. What’s the point of that? Well, between the input and output layers it compresses the data into a lower-dimensional ‘code’ and then reconstructs the original data from that code.

Photo by Arden Dertat

Why on earth would you want to do that? Well, if you’re forcing the autoencoder to re-create the input from the smaller code, then that code has to hold on to the information needed to rebuild the input. This means that while training the autoencoder you’re really just teaching it to identify the key features of your dataset. Neat, huh?
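
Here’s a rough sketch of that idea in plain NumPy. It isn’t how you’d build a real autoencoder (there are no biases or activations, and the gradients are written by hand), and the toy data, layer sizes, and learning rate are all made up for illustration. It just shows that squeezing data through a small code and training on reconstruction error pushes that error down.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (an assumption for this sketch): 200 samples in 8 dimensions
# that really only vary along 2 underlying directions.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 8))
x = latent @ mixing

# Linear autoencoder: 8 -> 2 -> 8.
w_enc = rng.normal(scale=0.1, size=(8, 2))
w_dec = rng.normal(scale=0.1, size=(2, 8))


def reconstruction_loss(x, w_enc, w_dec):
    code = x @ w_enc       # compress each sample to a 2-value code
    x_hat = code @ w_dec   # reconstruct the sample from its code
    return np.mean((x_hat - x) ** 2)


initial_loss = reconstruction_loss(x, w_enc, w_dec)

learning_rate = 0.01
for _ in range(2000):
    code = x @ w_enc
    x_hat = code @ w_dec
    grad_out = 2 * (x_hat - x) / x.size      # d(loss) / d(x_hat)
    grad_dec = code.T @ grad_out             # d(loss) / d(w_dec)
    grad_enc = x.T @ (grad_out @ w_dec.T)    # d(loss) / d(w_enc)
    w_dec -= learning_rate * grad_dec
    w_enc -= learning_rate * grad_enc

final_loss = reconstruction_loss(x, w_enc, w_dec)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Note that the input `x` is also the training target. That’s the whole trick.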

Applications of Autoencoders

This ability of autoencoders to learn the key features of a dataset means that they’re able to do a wide variety of tasks, such as:

Image Compression

Because the purpose of an autoencoder is to compress an input into a smaller code and reconstruct it from that code, it makes sense to use them for image compression. However, in practice, autoencoders rarely perform better than traditional compression algorithms for two reasons:

  1. Autoencoders generally work within very specific domains, meaning that any new types of images won’t be properly reconstructed. Instead, you’ll end up with a jumbled mess of features as the autoencoder tries to figure out what kind of cat your kettle is.
  2. Autoencoders generally aim to capture the larger details of an image, meaning that finer details will be lost. This results in a very blurry output, less than ideal for most applications.
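
To put a rough number on how much squeezing is going on, here’s a quick back-of-the-envelope calculation. It assumes a flattened 28x28 grayscale image and a 56-value code (the sizes used in the single-layer example later in this article) and simply counts values; it ignores how many bits each value takes.

```python
input_size = 28 * 28   # values in one flattened 28x28 grayscale image
code_size = 56         # values in the autoencoder's code

compression_ratio = input_size / code_size
print(f"{input_size} values -> {code_size} values ({compression_ratio:.0f}x fewer)")
```

With so few values left to work with, it’s not surprising that the finer details are the first thing to go.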

Denoising Images

Since autoencoders often miss finer details when reconstructing images, why not use them to get rid of unwanted detail? Autoencoders are often more than capable of denoising images so that other algorithms, such as optical character recognition, can more easily process them.
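
One common way to set this up is to train on pairs of noisy inputs and clean targets. Here’s a minimal sketch of building such pairs with NumPy. The random "images", the Gaussian noise, and the `noise_level` value are all placeholder assumptions, not something taken from a real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a batch of flattened 28x28 images with pixel values in [0, 1].
clean = rng.random((16, 28 * 28))

# Corrupt each image with Gaussian noise, then clip back into the valid pixel range.
noise_level = 0.2  # hypothetical value; tune it for your own data
noisy = np.clip(clean + noise_level * rng.normal(size=clean.shape), 0.0, 1.0)

# A denoising autoencoder takes the noisy images as input and the clean
# images as the target, e.g. autoencoder.fit(noisy, clean) in Keras.
print(noisy.shape, float(noisy.min()), float(noisy.max()))
```

The only change from a regular autoencoder is that the input and the target are no longer identical.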

Photo by Arden Dertat

Dimensionality Reduction

Data scientists often need a way to visualize extremely complex datasets, and autoencoders can help with that problem. Because similar inputs will have similar codes, autoencoders can help simplify otherwise complex datasets into as few as two dimensions!

However, the problem with this application is that there is a tradeoff between accuracy and dimensionality. Getting a 2-dimensional representation of your dataset is fantastic, but completely useless unless it’s an extremely simple dataset, since you’ll most likely be greeted by a blob of uncertainty.

Image Generation

Autoencoders, being able to take seemingly random codes and produce coherent images, are logical choices for image generation algorithms. However, since autoencoders generally miss finer details, the resulting images will often be blurry or incomplete. As you can see from these photos I made, the results aren’t really good enough for most applications.

Types of Autoencoders

I’ll be going over how three of the most common kinds of autoencoders are constructed. However, depending on your use case you might want to implement a more complex architecture such as an LSTM or variational autoencoder. The three architectures I’ll be talking about are:

  1. Single-layer autoencoders
  2. Multi-layer autoencoders
  3. Convolutional autoencoders

To help demonstrate these three kinds of autoencoders, I’ve also created a Google Colab notebook containing implementations of all three, built with the Keras framework. Feel free to create a copy of the Colab if you want to play around with any of the autoencoders.

Single-layer Autoencoders

In its simplest form, an autoencoder is a set of three layers: an input layer, a hidden layer, and an output layer. Since the input and output are the same, the autoencoder learns to create a denser representation of the data in order to fit it through the smaller hidden layer.

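# Imports assume the Keras API bundled with TensorFlow
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

input_size  = 28*28
code_size = 56
output_size = 28*28

# Encoder: squeeze the flattened image down to the code
i = Input(shape=(input_size,))
e = Dense(code_size, activation='sigmoid')(i)
# Decoder: rebuild the image from the code
d = Dense(output_size, activation='sigmoid')(e)
# Autoencoder: encoder and decoder trained end to end
autoencoder = Model(i, d)
autoencoder.compile(loss='mean_squared_error', optimizer='adam')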

While generally unable to learn more complex patterns, single-layered autoencoders are usually suitable for problems with smaller input sizes and larger hidden sizes.
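
To give a feel for how a model like this gets trained and used, here’s a minimal sketch. It rebuilds the same single-layer architecture so it can run on its own, and it uses random arrays as a stand-in for real images, so the outputs won’t look like anything meaningful. The imports assume the Keras API bundled with TensorFlow, and the sample count, epoch count, and batch size are arbitrary picks.

```python
import numpy as np
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

input_size = 28 * 28
code_size = 56

# Same single-layer architecture as above.
i = Input(shape=(input_size,))
e = Dense(code_size, activation='sigmoid')(i)
d = Dense(input_size, activation='sigmoid')(e)
autoencoder = Model(i, d)
autoencoder.compile(loss='mean_squared_error', optimizer='adam')

# Placeholder data: random values in [0, 1] standing in for flattened images.
x_train = np.random.rand(256, input_size).astype('float32')

# The input is also the target, so x_train is passed twice.
autoencoder.fit(x_train, x_train, epochs=2, batch_size=32, verbose=0)

# Reusing the input and code tensors gives an encoder that shares the trained weights.
encoder = Model(i, e)
codes = encoder.predict(x_train, verbose=0)
reconstructions = autoencoder.predict(x_train, verbose=0)
print(codes.shape, reconstructions.shape)
```

The `codes` array is the compressed representation, one row of 56 values per input, and `reconstructions` is the autoencoder’s attempt at rebuilding each input from its code.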

Multi-layer Autoencoders

Because multi-layered autoencoders stack extra hidden layers on either side of the code, they have far more parameters than their single-layered cousin and can learn a much more detailed representation of the given input data.

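# Imports assume the Keras API bundled with TensorFlow
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

input_size  = 28*28
hidden_size = 14*14
code_size = 48
output_size = 28*28

# Encoder: two dense layers narrowing down to the code
i = Input(shape=(input_size,))
e = Dense(hidden_size, activation='relu')(i)
encoded = Dense(code_size, activation='sigmoid')(e)
# Decoder: mirror image of the encoder
d = Dense(hidden_size, activation='relu')(encoded)
d = Dense(output_size, activation='sigmoid')(d)
# Autoencoder
autoencoder = Model(i, d)
autoencoder.compile(loss='mean_squared_error', optimizer='adam')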

While we could, in theory, pick any of the hidden layers to be the feature representation, it’s generally best to select the middle one so that the model is symmetrical. This way both the encoding and decoding steps have a balanced number of parameters.
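
You can check that balance with a bit of arithmetic. The sketch below counts weights and biases for the dense layers in the multi-layer example above; `dense_params` is just a helper name made up for this illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


input_size, hidden_size, code_size = 28 * 28, 14 * 14, 48

encoder_params = dense_params(input_size, hidden_size) + dense_params(hidden_size, code_size)
decoder_params = dense_params(code_size, hidden_size) + dense_params(hidden_size, input_size)

print(encoder_params, decoder_params)  # 163316 164052
```

The two halves differ only by their bias counts, so neither side of the hourglass dominates.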

Convolutional Autoencoders

As of now, we’ve only been using dense layers to make our autoencoders, but there’s nothing to stop us from using different sorts of layers. In principle, convolutional autoencoders are the same as any other autoencoder, but they operate on 3D arrays (images) instead of 1D arrays.

# Imports assume the Keras API bundled with TensorFlow
from tensorflow.keras.layers import (Conv2D, Dense, Flatten, Input,
                                     MaxPooling2D, Reshape, UpSampling2D)
from tensorflow.keras.models import Model

input_shape = (28, 28, 1)  # height, width, channels (Conv2D needs a channel axis)
code_size = 48

# Encoder
i = Input(shape=input_shape)
e = Conv2D(16, (4, 4), activation='relu', padding='same')(i)
e = MaxPooling2D((2, 2))(e)
e = Conv2D(8, (4, 4), activation='relu', padding='same')(e)
e = Flatten()(e)
encoded = Dense(code_size, activation='sigmoid')(e)
# Decoder
d = Dense(196, activation='relu')(encoded)
d = Reshape((7, 7, 4))(d)  # 196 = 7 * 7 * 4, channels last
d = Conv2D(8, (4, 4), activation='relu', padding='same')(d)
d = UpSampling2D()(d)
d = Conv2D(4, (4, 4), activation='relu', padding='same')(d)
d = UpSampling2D()(d)
d = Conv2D(1, (4, 4), activation='relu', padding='same')(d)
# Autoencoder
autoencoder = Model(i, d)
autoencoder.compile(loss='mean_squared_error', optimizer='adam')

Convolutional autoencoders have the perk of being much more powerful than the previous two kinds of autoencoders when working with images. However, the added complexity does mean that they take considerably longer to train, so which to use depends entirely on your project’s needs.
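
Keeping track of shapes is the fiddly part of a convolutional autoencoder, so here’s a small pure-Python sketch that traces them by hand. It assumes a channels-last layout, meaning a (28, 28, 1) input and the decoder’s 196 dense outputs reshaped to (7, 7, 4). The helper functions are made up for this illustration and only mimic what the Keras layers do to the shape.

```python
def conv_same(shape, filters):
    """Conv2D with padding='same' and stride 1: height and width are unchanged."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool(shape, size=2):
    """MaxPooling2D shrinks height and width by the pool size."""
    height, width, channels = shape
    return (height // size, width // size, channels)


def upsample(shape, size=2):
    """UpSampling2D grows height and width by the given factor."""
    height, width, channels = shape
    return (height * size, width * size, channels)


# Encoder: 28x28x1 image down to a flat vector that feeds the code layer.
s = (28, 28, 1)
s = conv_same(s, 16)
s = max_pool(s)
encoder_out = conv_same(s, 8)
flattened = encoder_out[0] * encoder_out[1] * encoder_out[2]

# Decoder: 196 dense outputs reshaped to 7x7x4, then upsampled twice.
s = (7, 7, 4)
s = upsample(conv_same(s, 8))
s = upsample(conv_same(s, 4))
decoder_out = conv_same(s, 1)

print(encoder_out, flattened, decoder_out)  # (14, 14, 8) 1568 (28, 28, 1)
```

The important check is that the decoder’s output shape matches the input shape, since the loss compares the two pixel by pixel.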

Conclusion

So — now that you know the basics of autoencoders, what’s next? There are so many different structures and applications that I haven’t touched on that the possibilities ahead of you are practically limitless. Have fun creating!

TL;DR

  • Autoencoders are hourglass-shaped feed-forward neural networks where the input and output are the same.
  • There are many applications for autoencoders such as image compression, denoising, dimensionality reduction and image generation.
  • Autoencoders can be extremely simple, sometimes only requiring a single hidden layer.

Resources

  • Here’s a showcase of the three different autoencoders spoken about in this article.
  • This is a website I made that showcases an autoencoder trained to generate images of cats! It’s fully interactive and hosted on Github pages.
  • Here’s a blog post from Keras going over the basics of autoencoders.

Thanks for reading my article! Feel free to message me on LinkedIn if you have anything to say, or follow me on Medium to get notified when I post another article!

Robert MacWha

ML Developer, Robotics enthusiast, and activator at TKS.