You want to build a face recognition system for your office, but getting many pictures from your coworkers is not an option.
Also, having to retrain the model for every new employee seems like a burden.
How do we solve this?
Grab your ☕️ and let's do the thing!👇
To solve a standard classification problem, you collect many images representing the different classes you want to classify.
You label the images and train a classification model.
This is all good, but sometimes getting a lot of images is not an option.
[2 / 13]
A face recognition system is one example: getting many images for every person we want to support is impractical.
Another example is a signature verification system: we want a model capable of verifying signatures it has never seen during training.
[3 / 13]
Clearly, we can't solve these problems with a standard classification model, so here is a different solution:
▫️ A Siamese Neural Network.
Let's talk a little bit more about this beauty 😎
[4 / 13]
The trunk of a Siamese Network contains two identical subnetworks (I call these "twins.")
These subnetworks share the same architecture and the same weights: a mirror of each other.
The attached picture is my attempt to draw this thing.
[5 / 13]
The goal of the twin subnetworks is to produce a feature vector for each one of the given images.
Then we have a layer that will compute the distance from one feature vector to the other.
Let's say the distance is zero. What does that mean?
[6 / 13]
If the distance is zero, the network considers image 1 and image 2 to be identical!
A shorter distance means that both images have more features in common.
A larger distance means that both images have fewer features in common.
Makes sense, right?
[7 / 13]
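The comparison step above can be sketched in a few lines. This is a toy illustration, not a full network: the embeddings stand in for the twins' outputs, and the threshold is a made-up value you'd tune on a validation set.

```python
import math

def euclidean_distance(a, b):
    """Distance between two feature vectors produced by the twin subnetworks."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def is_match(a, b, threshold=0.5):
    """Declare a match when the embeddings are close enough.

    The threshold here is a hypothetical value; in practice you tune it
    on a validation set of known matching / non-matching pairs.
    """
    return euclidean_distance(a, b) <= threshold

# Toy embeddings standing in for the twins' outputs.
reference = [0.1, 0.9, 0.3]
live = [0.12, 0.88, 0.31]   # same person, slightly different photo
stranger = [0.9, 0.1, 0.7]  # different person
```

A short distance (here, 0.03) means a match; a large one (1.2) means different people.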
Let's get back to our face recognition system.
We could take a live picture of a person and compare it with a saved picture (a reference image.)
Our Siamese network will determine how far away both images are. If they are close, we have a match!
Beautiful, right?
[8 / 13]
Here is the main takeaway of this story:
👉 Our Siamese network is not learning to classify the input image into specific classes.
Instead, it is learning a similarity function that we can use to compare any two images.
[9 / 13]
This means that we don't need to retrain the network to add more faces (or signatures in the case of signature verification.)
If we learn a good similarity function, we can compare any two domain images to determine whether they are the same.
[10 / 13]
It may be evident at this point, but a massive advantage of Siamese Networks is that we don't need many images to solve the problem:
One-shot Learning gets away with a single image!
[11 / 13]
I'm currently doing a lot of work with Siamese networks, and the results so far have been fantastic!
I'll write more about that as I collect more information. Taking these things out of papers and into production systems is trickier than it may seem.
[12 / 13]
Speaking of tricky things, if you are looking for some deep dives into machine learning and how to make it work in real life, follow me.
This stuff is fun, but it's even better if we do it together!
[13 / 13]
🦕
Great question! No, you don't necessarily need multiple pictures of the same person. Remember: you aren't trying to classify a person. You are trying to create a similarity function.
You train the network on labeled pairs of images: (A, A, 1) for two images of the same person, and (A, B, 0) for images of different people.
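Building those (image 1, image 2, label) pairs from a labeled dataset can be sketched like this. It's a simplified version: real pipelines shuffle and balance the pairs, and the filenames below are just placeholders.

```python
import itertools
import random

def make_pairs(images_by_person, seed=42):
    """Build (image_1, image_2, label) pairs for training a Siamese network.

    `images_by_person` maps an identity to its list of images.
    Label 1 means "same identity" (A, A, 1); label 0 means
    "different identities" (A, B, 0).
    """
    rng = random.Random(seed)
    pairs = []
    # Positive pairs: every combination of two images of the same person.
    for person, images in images_by_person.items():
        for a, b in itertools.combinations(images, 2):
            pairs.append((a, b, 1))
    # Negative pairs: one image each from two different people.
    for p1, p2 in itertools.combinations(list(images_by_person), 2):
        a = rng.choice(images_by_person[p1])
        b = rng.choice(images_by_person[p2])
        pairs.append((a, b, 0))
    return pairs

# Hypothetical dataset: two people, two photos each.
dataset = {
    "alice": ["alice_1.jpg", "alice_2.jpg"],
    "bob": ["bob_1.jpg", "bob_2.jpg"],
}
pairs = make_pairs(dataset)
```

Adding a new person later just means adding new entries to the reference set; the trained similarity function doesn't change.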
Here are the best 10 machine learning threads I posted in February.
They go all the way from beginner-friendly content to a broader dive into specific machine learning concepts and techniques.
I'd love to hear which one is your favorite!
🧵👇
Having to pick only 10 threads is painful. I always struggle to decide what should stay out of the list.
This, however, is a great incentive when I'm writing the content: I have to compete against myself to make sure what I write ends up being part of the list!
[2 / 13]
[Thread 1]
An explanation of three of the most important metrics we use: accuracy, precision, and recall.
More specifically, this thread shows what happens when we focus on the wrong metric using an imbalanced classification problem.
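As a quick illustration of that failure mode (the numbers below are made up): on a heavily imbalanced dataset, a model that always predicts the majority class gets great accuracy while being completely useless.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that are correct."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred):
    """Of everything predicted positive, how much really is positive."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fp) if tp + fp else 0.0

def recall(y_true, y_pred):
    """Of everything truly positive, how much we actually caught."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if tp + fn else 0.0

# Imbalanced data: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
# A lazy model that always predicts the majority class.
y_pred = [0] * 100
```

This model scores 95% accuracy but 0% recall: it never catches a single positive case.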
For the first time yesterday, I set up a project using a Development Container in Visual Studio Code and it immediately hit me:
✨ This is the way going forward! 🤯
If you haven't used this yet, here are some thoughts.
👇
The basic idea: you can run your entire development environment inside a container.
Every time you open your project, @code prepares and runs your container.
[2 / 7]
There are several advantages to this:
First of all, your entire team will run exactly the same environment, regardless of their preferred operating system, folder structure, existing libraries, etc.
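If you want to try it, here is a minimal `devcontainer.json` sketch. The base image, extension, and post-create command are just example choices; adjust them to your project.

```json
{
  "name": "my-ml-project",
  "image": "mcr.microsoft.com/devcontainers/python:3.11",
  "customizations": {
    "vscode": {
      "extensions": ["ms-python.python"]
    }
  },
  "postCreateCommand": "pip install -r requirements.txt"
}
```

Drop this file into a `.devcontainer` folder at the root of your repository, and @code will offer to reopen the project inside the container.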
Imagine your favorite creator on Twitter starts offering the following:
1. A weekly newsletter
2. Deep dives into your favorite topics
3. A look behind the scenes
4. Live discussion invitations
5. Unfiltered exclusive content
$4.99/mo
Would you subscribe?
@AlejandroPiad and @yudivian I know what your vote would be, but let’s watch these results and see what the broader community thinks.
In my experience 1,000 answers is usually enough to capture the overall sentiment of my audience.
Imagine you have a ton of data, but most of it isn't labeled. Even worse: labeling is very expensive. 😑
How can we get past this problem?
Let's talk about a different—and pretty cool—way to train a machine learning model.
☕️👇
Let's say we want to classify videos in terms of maturity level. We have millions of them, but only a few have labels.
Labeling a video takes a long time (you have to watch it in full!) We also don't know how many videos we need to build a good model.
[2 / 9]
In a traditional supervised approach, we don't have a choice: we need to spend the time and come up with a large dataset of labeled videos to train our model.
But this isn't always an option.
In some cases, this may be the end of the project. 😟
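The thread hasn't named the alternative yet, but one common family of techniques for this setup is semi-supervised learning. Here is a minimal self-training (pseudo-labeling) sketch on toy 1-D data: train on the few labels you have, confidently label some unlabeled points, and retrain. Every value below is hypothetical.

```python
def fit_threshold(xs, ys):
    """A toy 1-D 'classifier': the midpoint between the two class means."""
    m0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    m1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (m0 + m1) / 2

def self_train(labeled_x, labeled_y, unlabeled_x, margin=1.0, rounds=2):
    """Repeatedly pseudo-label the unlabeled points the model is most
    confident about (far from the threshold) and retrain on them."""
    xs, ys = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(rounds):
        t = fit_threshold(xs, ys)
        # Only trust predictions far from the decision boundary.
        confident = [x for x in pool if abs(x - t) >= margin]
        for x in confident:
            xs.append(x)
            ys.append(1 if x > t else 0)
        pool = [x for x in pool if abs(x - t) < margin]
    return fit_threshold(xs, ys)

# Two labeled examples, four unlabeled ones.
threshold = self_train([0.0, 10.0], [0, 1], [1.0, 2.0, 8.0, 9.0])
```

The idea scales: the cheap labels come from the model's own confident predictions instead of hours of human labeling.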