Santiago

@svpino

Jun 21 • 9 tweets • 3 min read

The "Hello World!" of machine learning: classifying handwritten digits.

But everyone solves this problem the same way.

Here is a different, non-boring approach that you haven't seen before.

1 of 9

I used Contrastive Learning to solve this problem.

Nobody gets away with listing MNIST in their portfolio unless they use a different, exciting approach.

Contrastive Learning is just that.

2 of 9

Here is the high-level idea:

1. Create a neural network that turns a picture of a digit into an embedding (a vector of numbers).

2. Embeddings belonging to the same digit should be close together.

3. Embeddings belonging to different digits should be far apart.

3 of 9

Whenever we receive a new picture:

1. Use the network to create the embedding.

2. Compare it to every digit's template embedding.

3. The correct answer is the digit whose template embedding is the most similar to the one we created.
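The three steps above can be sketched with NumPy. This is a minimal illustration, not the thread's code: the thread does not say how a digit's template embedding is built, so `build_templates` assumes it is the mean of that digit's training embeddings, and "most similar" is assumed to mean smallest Euclidean distance. Both function names are hypothetical.

```python
import numpy as np

def build_templates(embeddings, labels, num_classes=10):
    """Assumption: a digit's template is the mean of its embeddings.

    Every class in range(num_classes) must have at least one sample.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    return np.stack([embeddings[labels == digit].mean(axis=0)
                     for digit in range(num_classes)])

def classify(embedding, templates):
    """Return the index of the template closest to the new embedding."""
    embedding = np.asarray(embedding, dtype=float)
    distances = np.linalg.norm(templates - embedding, axis=1)
    return int(np.argmin(distances))

# Toy 2-D embeddings for two classes, standing in for the network's output.
embeddings = np.array([[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]])
labels = np.array([0, 0, 1, 1])
templates = build_templates(embeddings, labels, num_classes=2)
print(classify([0.8, 0.8], templates))  # prints 1
```

With this design, supporting a new class only requires adding one more row to `templates`; the embedding network itself is not retrained.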

4 of 9

To make this happen:

1. We need a network with two inputs that share one embedding model.
2. Feed it two images at a time.
3. Compute the distance between the two resulting embeddings.

The loss function will help us minimize the distance between images of the same digit and push images of different digits apart.
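Training on two images at a time means the dataset has to be turned into pairs. The thread does not show how its pairs are built, so the following is only one plausible sketch: for every image it samples one partner with the same digit and one with a different digit. The helper name `make_pairs` and the label convention (1 for "same digit", 0 for "different") are assumptions.

```python
import numpy as np

def make_pairs(labels, rng):
    """Return (left, right, same): index pairs plus a 1/0 'same digit' flag.

    Needs at least two distinct classes. A positive pair may match an
    image with itself, which is harmless for this illustration.
    """
    labels = np.asarray(labels)
    classes = np.unique(labels)
    by_class = {c: np.flatnonzero(labels == c) for c in classes}

    left, right, same = [], [], []
    for i, label in enumerate(labels):
        # Positive pair: another image showing the same digit.
        left.append(i)
        right.append(rng.choice(by_class[label]))
        same.append(1)

        # Negative pair: an image of a randomly chosen different digit.
        other = rng.choice(classes[classes != label])
        left.append(i)
        right.append(rng.choice(by_class[other]))
        same.append(0)

    return np.array(left), np.array(right), np.array(same)
```

Calling `make_pairs(y_train, np.random.default_rng(0))` would give index arrays that select the two image batches, with `same` as the training target.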

5 of 9

Look at the attached picture.

• Each input expects an image
• The "model" layer computes the embeddings
• The "lambda" layer computes the distance.

We want the distance to be small if both images show the same digit. If they show different digits, we want the distance to be large.
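Since the attached picture is not reproduced here, this NumPy sketch mirrors the structure it describes: one shared embedding function applied to both inputs, followed by a distance. The embedding is a stand-in (a single dense layer with `tanh`, using random weights), not the thread's actual "model" layer, and the distance is assumed to be Euclidean because the thread does not name one.

```python
import numpy as np

def embed(images, weights):
    """Toy stand-in for the shared "model" layer: flatten, project, squash."""
    flat = images.reshape(len(images), -1)
    return np.tanh(flat @ weights)

def siamese_distance(images_a, images_b, weights):
    """Stand-in for the "lambda" layer: distance between the two embeddings.

    Both inputs go through the same weights, which is what makes the
    network "Siamese".
    """
    diff = embed(images_a, weights) - embed(images_b, weights)
    return np.linalg.norm(diff, axis=1)

rng = np.random.default_rng(0)
weights = rng.normal(scale=0.05, size=(28 * 28, 16))  # untrained, illustrative
images_a = rng.random((4, 28, 28))
images_b = rng.random((4, 28, 28))

print(siamese_distance(images_a, images_a, weights))  # identical inputs: zeros
print(siamese_distance(images_a, images_b, weights))  # different inputs: positive
```

Training adjusts `weights` so that these distances become small for matching digits and large otherwise.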

6 of 9

There's a name for this: "Siamese network."

Take a look at the attached code. It shows the Keras implementation of this model. I'm using @ylecun's Contrastive Loss.

The entire code is here:
deepnote.com/@santiago-vald…
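The attached code is not reproduced here, so as a rough guide, this is the usual margin-based form of contrastive loss written in NumPy rather than Keras. It is a sketch under assumptions: `same` is 1 for pairs showing the same digit and 0 otherwise, and `margin` defaults to 1.0. The linked notebook may use the opposite label convention or a different margin.

```python
import numpy as np

def contrastive_loss(distances, same, margin=1.0):
    """Mean contrastive loss over a batch of pair distances.

    Same-digit pairs are penalized by their squared distance.
    Different-digit pairs are penalized only while they are closer
    than `margin`.
    """
    distances = np.asarray(distances, dtype=float)
    same = np.asarray(same, dtype=float)
    pull = same * distances ** 2
    push = (1.0 - same) * np.maximum(margin - distances, 0.0) ** 2
    return float(np.mean(0.5 * (pull + push)))

# A same-digit pair at distance 0 and a different-digit pair beyond the margin.
print(contrastive_loss([0.0, 2.0], [1, 0]))  # prints 0.0
# A same-digit pair at distance 1 and a different-digit pair at distance 0.5.
print(contrastive_loss([1.0, 0.5], [1, 0]))  # prints 0.3125
```

The margin keeps the loss from rewarding ever-larger distances: once a different-digit pair is at least `margin` apart, it no longer contributes.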

7 of 9

I trained the model for 15 epochs. It reached 92% accuracy on the test set.

Not ground-breaking results, but it's an excellent way to showcase Contrastive Learning.

Attached you can see 10 of the results, including 2 mistakes.

8 of 9

Siamese Networks are handy in real-life applications. I've used them a ton.

For classification problems, they are ideal if you only have a few samples of each class.

9 of 9
