Hey, today is #MindblowingMonday 🤯!
A day to share with you amazing things from every corner of Computer Science.
Today I want to talk about Generative Adversarial Networks.
But let's begin with some eye candy.
Take a look at this mind-blowing 2-minute video and, if you like it, read on: I'll tell you a couple of things about it.
Generative Adversarial Networks (GANs) have taken the machine learning world by surprise with their uncanny ability to generate hyper-realistic images of human faces, cars, landscapes, and a lot of other stuff, as you just saw.
Want to know how they work?
There are many variants, but the core idea is to have 2️⃣ neural networks:
- a generator network
- a discriminator network
Both networks are connected in a sort of adversarial game, where each is trying to outperform the other.
The discriminator is a regular neural network whose job is to determine whether a given sample (say, an image of a face) is real or generated.
Its architecture depends on the kind of data, as usual, e.g., lots of convolutions and pooling for images.
The generator is a decoder network whose job is to transform an input of random values into whatever you want to generate.
For images, for example, you'll have transposed convolution (a.k.a. "deconvolution") layers and upsampling, i.e., the "reverse" of an image classification network.
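To make the "reverse" idea concrete, here is a small sketch that only tracks image sizes, not actual pixels. The kernel size 4, stride 2, padding 1, and the 4x4 starting size are illustrative assumptions, not something this thread prescribes. The two formulas are the standard ones for a convolution and a transposed convolution without dilation or output padding.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial size after a strided convolution (discriminator side)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, padding):
    """Spatial size after a transposed convolution (generator side)."""
    return (size - 1) * stride - 2 * padding + kernel


def trace(size, layer_fn, n_layers, kernel=4, stride=2, padding=1):
    """Apply the same layer n_layers times and record every intermediate size."""
    sizes = [size]
    for _ in range(n_layers):
        size = layer_fn(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


# Assumed toy setup: the generator grows a 4x4 map, the discriminator shrinks a 64x64 image.
generator_sizes = trace(4, conv_transpose_out, n_layers=4)
discriminator_sizes = trace(64, conv_out, n_layers=4)
print(generator_sizes)      # [4, 8, 16, 32, 64]
print(discriminator_sizes)  # [64, 32, 16, 8, 4]
```

With these settings each generator layer doubles the resolution and each discriminator layer halves it, so one stack is the mirror image of the other.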
All the magic happens in the training.
You train the discriminator by alternately showing it real and generated images and minimizing a classification loss (e.g., binary cross-entropy).
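Here is a minimal sketch of that discriminator update, assuming a deliberately tiny setup: samples are single numbers instead of images, and the discriminator is just D(x) = sigmoid(w*x + c). The sample values, the learning rate, and all function names are made up for illustration.

```python
import math


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one prediction (label 1 = real, 0 = generated)."""
    return -(label * math.log(prob + eps) + (1 - label) * math.log(1 - prob + eps))


def discriminator_loss(w, c, real_batch, fake_batch):
    """Average BCE over real samples (label 1) and generated samples (label 0)."""
    losses = [bce(sigmoid(w * x + c), 1.0) for x in real_batch]
    losses += [bce(sigmoid(w * x + c), 0.0) for x in fake_batch]
    return sum(losses) / len(losses)


def discriminator_step(w, c, real_batch, fake_batch, lr=0.1):
    """One gradient-descent step on the BCE loss of D(x) = sigmoid(w*x + c)."""
    labelled = [(x, 1.0) for x in real_batch] + [(x, 0.0) for x in fake_batch]
    # For BCE on a sigmoid output, d(loss)/d(logit) = prediction - label.
    grad_w = sum((sigmoid(w * x + c) - y) * x for x, y in labelled) / len(labelled)
    grad_c = sum((sigmoid(w * x + c) - y) for x, y in labelled) / len(labelled)
    return w - lr * grad_w, c - lr * grad_c


real = [3.8, 4.1, 4.3, 3.9]   # made-up "real" samples
fake = [0.2, -0.1, 0.4, 0.0]  # made-up "generated" samples
w, c = 0.0, 0.0               # untrained discriminator: always answers 0.5
loss_before = discriminator_loss(w, c, real, fake)
for _ in range(50):
    w, c = discriminator_step(w, c, real, fake)
loss_after = discriminator_loss(w, c, real, fake)
print(loss_before, loss_after)
```

The untrained discriminator starts at a loss of ln 2 (pure guessing), and the loss drops as it learns to tell the two groups apart.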
The generator is trained to "fool" the discriminator. But this is not easy, so the trick is to let it "see" inside the discriminator: the gradient of the discriminator's output is propagated back into the generator, telling it how to change its samples to look more real.
It's like showing you my brain while you perform a magic trick, so you can understand how best to fool me.
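A rough sketch of that "seeing inside" step, under toy assumptions: the generator is G(z) = a*z + b, the discriminator is D(x) = sigmoid(w*x + c) and stays frozen, and the generator minimizes -log D(G(z)). All parameter values and names here are hypothetical.

```python
import math


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def discriminate(x, w, c):
    """Toy discriminator: probability that x is real."""
    return sigmoid(w * x + c)


def generate(z, a, b):
    """Toy generator: maps a noise value z to a sample."""
    return a * z + b


def generator_step(a, b, w, c, noise_batch, lr=0.1):
    """One gradient step on the generator loss -log D(G(z)); D is not updated."""
    grad_a = grad_b = 0.0
    for z in noise_batch:
        x = generate(z, a, b)
        d = discriminate(x, w, c)
        # Chain rule through the discriminator: d(-log D(x))/dx = -(1 - D(x)) * w
        grad_x = -(1.0 - d) * w
        grad_a += grad_x * z  # dx/da = z
        grad_b += grad_x      # dx/db = 1
    n = len(noise_batch)
    return a - lr * grad_a / n, b - lr * grad_b / n


w, c = 2.0, -1.0  # frozen discriminator parameters (made up)
a, b = 1.0, 0.0   # initial generator parameters (made up)
noise = [0.5]
before = discriminate(generate(noise[0], a, b), w, c)
a, b = generator_step(a, b, w, c, noise)
after = discriminate(generate(noise[0], a, b), w, c)
print(before, after)
```

Notice that the generator's update uses the discriminator's weight `w`: that is the "brain" it gets to look at. After one step, the same noise value produces a sample the discriminator rates as more likely real.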
This is the basic idea, but the devil is in the details. Two common problems with GANs are:
1️⃣ The discriminator learns much faster, so the generator never gets a chance to catch up.
2️⃣ The generator gets complacent and just produces the same few good examples over and over (this is known as mode collapse).
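One mechanism behind the first problem can be sketched with a bit of calculus. This is an illustration of one contributing factor, not a full explanation. With the original minimax generator loss log(1 - D(G(z))), the gradient with respect to the discriminator's logit shrinks toward zero once the discriminator confidently rejects a sample. A commonly used alternative, -log D(G(z)), keeps the gradient large in that same situation. The logit values below are made up.

```python
import math


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def saturating_grad(logit):
    """d/d(logit) of log(1 - D), the original minimax generator loss."""
    return -sigmoid(logit)


def non_saturating_grad(logit):
    """d/d(logit) of -log(D), a commonly used alternative generator loss."""
    return -(1.0 - sigmoid(logit))


# A more negative logit means the discriminator is more confident the sample is generated.
for logit in (0.0, -2.0, -6.0):
    print(logit, sigmoid(logit), saturating_grad(logit), non_saturating_grad(logit))
```

At a logit of -6 the discriminator is almost certain the sample is generated: the first gradient is close to zero, so the generator barely learns, while the second stays close to -1.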
Finally, beyond the technical challenges, the possibility of suddenly creating very realistic content opens a can of worms of ethical issues such as disinformation.
But technology itself is neither good nor bad; it is just a tool. What we do with it is up to us.
As usual, if you like this topic, have any questions, or just want to discuss, reply in this thread or @ me any time. I'll be listening.
This thread is available in plain format here, forever:
apiad.net/tweetstorms/mi…
Stay curious:
- <apiad.net/to/#gan-video>
- <deepgenerativemodels.github.io>
- <manning.com/books/gans-in-…>
- <arxiv.org/abs/1710.07035>
- <github.com/nightrome/real…>
