Jazmin Barrionuevo
3-minute read
Python
TensorFlow
Keras
NumPy
Matplotlib
This might be one of my favorite machine learning algorithms: the Generative Adversarial Network (GAN). It involves two AI models competing against each other, allowing the GAN to effectively supervise itself.
It consists of two submodels: a generator and a discriminator. The generator creates fake samples, while the discriminator determines whether these samples are fake or real.
The discriminator receives an image as input and produces a binary classification (0 or 1) to decide whether the image is real or fake. The generator takes random inputs and turns them into an image; the discriminator then evaluates both the generated images and real images from the training set to determine which are fake.
The key distinction with a conditional GAN is that it gives you more control over the output, letting you specify the type of image you want to generate. With the unconditional GAN we're building here, by contrast, you simply generate random images from your sample population without any control over the image type.
First, we'll train our generator to produce highly convincing fake images of clothing. To achieve this, we need to train our discriminator model to accurately recognize what real clothing looks like in pictures. Once the discriminator becomes proficient at identifying real clothes, we will introduce non-clothing shapes to ensure it correctly classifies them as not being clothes.
At this point, the generator takes a random input vector and uses it to generate its own fake clothing image (a t-shirt, say). This image is then passed to the discriminator, which must decide whether the image is real or fake.
The result of this decision is shared with both the generator and the discriminator, and they adjust their behavior accordingly based on the feedback.
Here, our generator starts with a set of 128 random values and outputs a matrix (image) with the shape of 28 by 28 by 1.
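As a sketch of that idea (the exact layer sizes and filter counts are assumptions, not necessarily the architecture used here), a Keras generator mapping a 128-value noise vector to a 28 by 28 by 1 image could look like this:

```python
import numpy as np
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Dense, Reshape, UpSampling2D, Conv2D

def build_generator():
    return Sequential([
        Input(shape=(128,)),
        # Project the 128-dim noise vector into a small 7x7 feature map
        Dense(7 * 7 * 128, activation="relu"),
        Reshape((7, 7, 128)),
        # Upsample 7x7 -> 14x14 -> 28x28
        UpSampling2D(),
        Conv2D(128, 5, padding="same", activation="relu"),
        UpSampling2D(),
        Conv2D(128, 5, padding="same", activation="relu"),
        # Single-channel output in [0, 1], matching 28x28x1 grayscale images
        Conv2D(1, 4, padding="same", activation="sigmoid"),
    ])

generator = build_generator()
noise = np.random.randn(4, 128)          # batch of 4 random input vectors
fake_images = generator.predict(noise)   # shape: (4, 28, 28, 1)
```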
Conversely, our discriminator performs the opposite task. It takes the generator's output—an image with the shape of 28 by 28 by 1—and produces a single value between zero and one to determine whether the image is real or fake.
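A matching discriminator sketch (again, the layer choices are illustrative assumptions) takes the 28 by 28 by 1 image and squeezes it down to a single sigmoid output:

```python
import numpy as np
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Conv2D, LeakyReLU, Dropout, Flatten, Dense

def build_discriminator():
    return Sequential([
        Input(shape=(28, 28, 1)),
        # Convolutional feature extraction on the input image
        Conv2D(32, 5),
        LeakyReLU(0.2),
        Dropout(0.4),
        Conv2D(64, 5),
        LeakyReLU(0.2),
        Dropout(0.4),
        Flatten(),
        # Single sigmoid unit: probability that the image is real (1) vs fake (0)
        Dense(1, activation="sigmoid"),
    ])

discriminator = build_discriminator()
images = np.random.rand(4, 28, 28, 1).astype("float32")
scores = discriminator.predict(images)   # shape: (4, 1), values in (0, 1)
```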
At the start, the images produced by the generator don't look great, as you can see here. However, after training, the generator will improve significantly, eventually creating more accurate and realistic visualizations of fashion items.
Training GANs can be challenging because you need to strike a balance between how quickly the discriminator learns and how fast the generator improves.
You can see d_loss and g_loss balancing each other: we don't want one to drop while the other climbs rapidly. We want both to stay steady and stable over the long term.
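One common way to express this tug-of-war is a custom training step: the discriminator is pushed to score real images as 1 and fakes as 0, while the generator is pushed to make the discriminator output 1 for its fakes. The sketch below assumes a `generator` and `discriminator` like the ones above; the learning rates are assumptions (a slower discriminator is one common way to keep the two in balance):

```python
import tensorflow as tf

cross_entropy = tf.keras.losses.BinaryCrossentropy()
g_opt = tf.keras.optimizers.Adam(1e-4)  # assumed learning rates, not tuned values
d_opt = tf.keras.optimizers.Adam(1e-5)

def train_step(generator, discriminator, real_images, latent_dim=128):
    batch = tf.shape(real_images)[0]
    noise = tf.random.normal((batch, latent_dim))

    # Train the discriminator: real images -> 1, generated images -> 0
    with tf.GradientTape() as d_tape:
        fake_images = generator(noise, training=True)
        real_pred = discriminator(real_images, training=True)
        fake_pred = discriminator(fake_images, training=True)
        d_loss = (cross_entropy(tf.ones_like(real_pred), real_pred) +
                  cross_entropy(tf.zeros_like(fake_pred), fake_pred))
    d_grads = d_tape.gradient(d_loss, discriminator.trainable_variables)
    d_opt.apply_gradients(zip(d_grads, discriminator.trainable_variables))

    # Train the generator: try to make the discriminator output 1 for fakes
    with tf.GradientTape() as g_tape:
        fake_images = generator(noise, training=True)
        fake_pred = discriminator(fake_images, training=True)
        g_loss = cross_entropy(tf.ones_like(fake_pred), fake_pred)
    g_grads = g_tape.gradient(g_loss, generator.trainable_variables)
    g_opt.apply_gradients(zip(g_grads, generator.trainable_variables))

    return float(d_loss), float(g_loss)
```

Feeding the two returned losses into a history list each epoch is what produces the d_loss/g_loss curves discussed below.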
Using pyplot, I can evaluate the performance of my trained model. The plot shows the loss trends for both the discriminator (d_loss) and the generator (g_loss). The fluctuation in loss values is expected, especially in the early stages of training. The generator's loss shows a gradual stabilization over time, while the discriminator's loss remains more volatile. This indicates that the generator is learning to create more realistic data, making it progressively harder for the discriminator to differentiate between real and fake data. This balance is crucial for the model's overall performance.
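The plot itself takes only a few lines of pyplot. Here the loss values are placeholders standing in for the lists collected during training, and the key names `d_loss`/`g_loss` are assumptions matching the labels above:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import matplotlib.pyplot as plt

# Placeholder loss curves; in practice these come from the training loop
history = {
    "d_loss": [0.9, 0.7, 0.8, 0.65, 0.75, 0.7],
    "g_loss": [1.5, 1.2, 1.0, 0.95, 0.9, 0.88],
}

plt.figure(figsize=(8, 4))
plt.plot(history["d_loss"], label="d_loss")
plt.plot(history["g_loss"], label="g_loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.title("GAN training loss")
plt.legend()
plt.savefig("gan_loss.png")
```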
Because we would need around 2000 epochs to train the model properly, I used a pretrained model to generate the images.
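Sampling from a trained generator is then just a matter of feeding it fresh noise vectors and tiling the results. In the sketch below, a freshly built (untrained) generator stands in so the snippet runs end to end; in practice you would load the pretrained weights instead, e.g. with `tf.keras.models.load_model("generator.h5")`, where the filename is an assumption:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import tensorflow as tf

# Stand-in for the pretrained generator; in practice, load it from disk:
#   generator = tf.keras.models.load_model("generator.h5")  # assumed filename
generator = tf.keras.Sequential([
    tf.keras.Input(shape=(128,)),
    tf.keras.layers.Dense(28 * 28, activation="sigmoid"),
    tf.keras.layers.Reshape((28, 28, 1)),
])

# Sample 16 random 128-dim vectors and generate one image per vector
noise = np.random.randn(16, 128)
images = generator.predict(noise)

# Tile the 16 generated images in a 4x4 grid
fig, axes = plt.subplots(4, 4, figsize=(6, 6))
for img, ax in zip(images, axes.ravel()):
    ax.imshow(img.squeeze(), cmap="gray")
    ax.axis("off")
fig.savefig("generated_fashion.png")
```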