Another day to share with you amazing things from every corner of Computer Science.
Today I want to talk about Generative Adversarial Networks 👇
🎬 But let's begin with some eye candy.
Take a look at this mind-blowing 2-minute video and, if you like it, then read on, I'll tell you a couple of things about it...
Generative Adversarial Networks (GANs) have taken the machine learning world by surprise with their uncanny ability to generate hyper-realistic examples of human faces, cars, landscapes, and a lot of other stuff, as you just saw.
Want to know how they work? 👇
There are many variants, but the core idea is to have 2️⃣ neural networks:
- a generator network
- a discriminator network
Both networks are connected in a sort of adversarial game, where each is trying to outperform the other.
The discriminator is a regular neural network whose job is to determine if a specific sample (say, an image of a face) is real or generated.
This network's architecture depends on the classification task, as usual, e.g., lots of convolutions and pooling for images.
The generator network is a decoder network, whose job is to transform an input of random values into whatever you want to generate.
In images, for example, you'll have deconvolution layers and upsampling, i.e., the "reverse" of an image classification network.
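To make this concrete, here's a minimal sketch of what the two networks could look like, in PyTorch, assuming 64x64 grayscale images and a 100-dimensional noise vector (all sizes are made up for illustration; a real GAN would be deeper and more carefully tuned):

```python
import torch.nn as nn

class Discriminator(nn.Module):
    """Convolutions + downsampling -> a single real/fake logit."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1),   # 64x64 -> 32x32
            nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),  # 32x32 -> 16x16
            nn.LeakyReLU(0.2),
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, 1),                 # score: is this sample real?
        )

    def forward(self, x):
        return self.net(x)


class Generator(nn.Module):
    """Random noise -> transposed convolutions ("deconvolutions") -> an image."""
    def __init__(self, noise_dim=100):
        super().__init__()
        self.project = nn.Linear(noise_dim, 64 * 16 * 16)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),  # 16x16 -> 32x32
            nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),   # 32x32 -> 64x64
            nn.Tanh(),                                           # pixel values in [-1, 1]
        )

    def forward(self, z):
        x = self.project(z).view(-1, 64, 16, 16)
        return self.net(x)
```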
You train the discriminator by alternately showing it real and generated images and minimizing some classification loss (e.g., binary cross-entropy).
The generator is trained to try and "fool" the discriminator. But this is not easy, so the trick is to let it "see" the discriminator's loss function, i.e., to train the generator by backpropagating through the discriminator itself.
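Putting it together, a minimal sketch of the adversarial training loop might look like this (assuming the Generator and Discriminator classes sketched above; the random tensors stand in for a real dataset):

```python
import torch
import torch.nn.functional as F

generator, discriminator = Generator(), Discriminator()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

# Stand-in for a DataLoader of real 64x64 grayscale images.
real_batches = [torch.randn(16, 1, 64, 64) for _ in range(10)]

for real in real_batches:
    batch = real.size(0)
    z = torch.randn(batch, 100)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # 1) Train the discriminator: real images should score 1, generated ones 0.
    fake = generator(z).detach()  # don't backpropagate into the generator here
    d_loss = (F.binary_cross_entropy_with_logits(discriminator(real), ones) +
              F.binary_cross_entropy_with_logits(discriminator(fake), zeros))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # 2) Train the generator: its loss is the discriminator's verdict on its samples,
    #    so gradients flow *through* the discriminator back into the generator.
    g_loss = F.binary_cross_entropy_with_logits(discriminator(generator(z)), ones)
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```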
💡 It's like showing you my brain while you perform a magic trick, so you can figure out how best to fool me.
This is the basic idea, but the devil is in the details. Two common problems with GANs are:
1️⃣ The discriminator learns much faster, so the generator never gets a chance to catch up.
2️⃣ The generator gets complacent and just produces the same good examples over and over (the infamous mode collapse).
🤔 Finally, beyond the technical challenges, the possibility of suddenly creating very realistic content opens a can of worms of ethical issues, such as disinformation.
But technology itself is neither good nor bad; it is just a tool. What we do with it is on us.
As usual, if you like this topic, have any questions, or just want to discuss, reply in this thread or @ me any time. I'll be listening.
Despite their impressive capabilities, all LLMs, including OpenAI o1, are still fundamentally limited by design constraints that make them incapable of true, open-ended reasoning.
Let's break it down. 🧵 (1/5)
Reason 1: Stochastic Sampling.
LLMs rely on probabilities to pick the next token. Even when you fix the temperature, randomness is still built into the language modeling paradigm.
But logic is anything but random.
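Here's a toy illustration (not any particular model) of that built-in randomness: even with the temperature fixed at a low value, the next token is still drawn from a distribution.

```python
import numpy as np

# Hypothetical logits for 4 candidate next tokens.
logits = np.array([2.0, 1.0, 0.5, -1.0])
temperature = 0.7

# Softmax with temperature turns scores into a probability distribution.
probs = np.exp(logits / temperature)
probs /= probs.sum()

# The next token is still *sampled*; only temperature -> 0 collapses
# this into a deterministic argmax.
next_token = np.random.choice(len(logits), p=probs)
print(probs.round(3), next_token)
```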
Reason 2: Bounded Computation.
Each token processed requires a fixed amount of computation. This means the total computational budget is determined by the length of the sequence.
But we know some problems (including logical reasoning) require an exponential amount of computation.
Clustering is a process for discovering relationships between objects by placing them into groups according to how similar they are to each other.
❓ Given a set of objects, is there a natural, unbiased way to cluster them?
Meet the ugly duckling theorem.
🧵 1 of 20
Say we have three objects, two swans 🦢🦢 and an ugly duckling 🦆.
Obviously, the natural way to cluster them is by placing the two swans together and the duckling in a different group, right?
Well, it depends on which features you choose to look at.
2 of 20
If we cluster them by colour, sure, but if we cluster them by size, maybe not.
So how about we consider *all the possible* features? Wouldn't that give us the most "natural" clustering?
As a start, let's say there are N boolean predicates that we can evaluate...
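To give you a taste of where this is going, here's a small sketch that makes the argument concrete for our three birds: take two made-up base predicates, enumerate *every* boolean feature you can build from them, and count how many features each pair of objects shares. (The object encodings are invented for illustration.)

```python
from itertools import product

# Hypothetical base predicates: (is_white, is_large).
objects = {
    "swan_1":   (1, 1),
    "swan_2":   (1, 0),
    "duckling": (0, 0),
}

n = 2
patterns = list(product([0, 1], repeat=n))               # all 2^n predicate combinations
features = list(product([0, 1], repeat=len(patterns)))   # all 2^(2^n) boolean features, as truth tables

def shared(a, b):
    """Count the features on which objects a and b take the same value."""
    return sum(f[patterns.index(a)] == f[patterns.index(b)] for f in features)

names = list(objects)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        print(names[i], names[j], shared(objects[names[i]], objects[names[j]]))

# Every pair shares exactly the same number of features (8 out of 16):
# under *all* possible features, the duckling is no less similar to a swan
# than the swans are to each other.
```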
I've spent the last couple of years disrupting traditional software companies with machine learning and data science ideas directly out of my group's core research.
I've found that most issues arise from three critical areas.
Here's what I've learned...
🧵 1/24
Most of the obstacles I've seen can be grouped into one of the following three categories:
1️⃣ the language
2️⃣ the development process
3️⃣ the expected results
Let's tackle them one by one.
2/24
1️⃣ The majority of clashes between academia and industry are due to a language barrier.
We talk about experiments, models, and hypotheses. They talk about functionality, business rules, and user experience.
AutoML is a growing subfield of machine learning that aims to automate some of the most boring and time-consuming parts of designing, training, and deploying a machine learning pipeline.
Here are 10 open-source AutoML tools you can start using today: 👇
auto-sklearn
Probably the most popular AutoML system, it sits on top of everyone's favourite ML framework, scikit-learn, and gives you a black-box AutoML wrapper that abstracts away most of scikit-learn's estimators.
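As a taste, here's roughly what using it looks like (a minimal sketch; parameters and the exact API may differ across versions):

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
import autosklearn.classification

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Let auto-sklearn search over scikit-learn pipelines for a fixed time budget (seconds).
automl = autosklearn.classification.AutoSklearnClassifier(time_left_for_this_task=300)
automl.fit(X_train, y_train)

print(accuracy_score(y_test, automl.predict(X_test)))
```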
Auto-WEKA
Another well-known AutoML framework, based on another popular and well-loved machine learning framework, WEKA. Although the project is no longer in active development, it is still used by the community.
It's probably the most important theoretical question in computer science, yet it sounds weirdly abstract.
But deep down, it has a very intuitive explanation.
If you have heard of this and want to learn a bit more, read on...
🧵👇
Computer Science is all about finding clever ways to solve difficult problems.
We have found clever algorithms for a bunch of them: sorting stuff, finding shortest paths, solving equations, simulating physics...
But some problems seem to be way too hard 👇
One example is the Travelling Salesman problem.
Find a cycle that starts in your city, visits all the major cities in your country, and returns home with the least fuel cost.
This is the kind of problem we expect computers to solve easily, right? That's what computers are for!
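Here's the catch, in code: a brute-force sketch on a tiny made-up cost matrix works fine, but the number of cycles it checks grows like (n-1)!, which explodes long before you reach a real country's map.

```python
from itertools import permutations

# Hypothetical fuel costs between 5 cities (city 0 is home).
cost = [
    [0, 2, 9, 10, 7],
    [2, 0, 6,  4, 3],
    [9, 6, 0,  8, 5],
    [10, 4, 8, 0, 6],
    [7, 3, 5,  6, 0],
]
n, home = len(cost), 0

def tour_cost(order):
    """Total cost of visiting the cities in `order`, starting and ending at home."""
    stops = (home, *order, home)
    return sum(cost[a][b] for a, b in zip(stops, stops[1:]))

# Try every possible ordering of the remaining cities: (n-1)! candidates.
best = min(permutations(range(1, n)), key=tour_cost)
print(best, tour_cost(best))
```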