The ability to reuse the knowledge of one model and adapt it to solve a different problem is one of the most consequential breakthroughs in machine learning.
Grab your ☕️ and let's talk about this.
🧵👇
A deep learning model is like a Lego set: many pieces connected together, forming one tall structure.
These pieces are layers, and each layer has a responsibility.
Although we don't know exactly what role every layer plays, we know that the closer a layer is to the output, the more specific its job becomes.
The best way to understand what I mean is through an example: a model that will process car images.
The top layer (closer to the input) may focus on low-level details of the image. For example, it could focus on extracting edges from the input image.
The next layer may focus on using these edges to form lines.
The layer after that may use the lines to form shapes.
Then, a new layer may focus on specific car components, and the closer you get to the output, the more specific each layer will be.
The output layer will simply decide whether there's a car in the input image.
This is a simplification, but it hopefully illustrates the idea.
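Here is a rough sketch of that layered structure in Keras. The layer sizes and the comments are just the intuition from above — real models are bigger, and learned roles are never this clean-cut:

```python
from tensorflow import keras
from tensorflow.keras import layers

# A toy "car or no car?" classifier mirroring the Lego analogy.
car_model = keras.Sequential([
    keras.Input(shape=(224, 224, 3)),          # the raw car image
    layers.Conv2D(32, 3, activation="relu"),   # early layers: edges
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),   # then lines and shapes
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),  # later layers: car parts
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),     # output: car or no car?
])
```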
Imagine that we trained our car model with 100k images over a few days. A lot of work!
Now it's time to move on and build a new model. But this time we want a model specific to trucks.
There are a couple of problems with this second model:
1. We might not have that many truck images, so the results might not be good.
2. It would be a shame to waste all the work we did on the car model, because cars and trucks are really similar.
Here is where the magic happens!
Let's get the official definition out of the way: we call this thing you are about to learn "transfer learning."
Transfer learning is a method where we can reuse a model that was developed for one task as the starting point for another task.
Basically, we will reuse most of the stuff our car model learned and transfer it to our truck model.
Think about it: edges, lines, colors, textures... a lot of the knowledge is the same. We will only need to train the new model to recognize the things specific to trucks!
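In Keras, grabbing the reusable part could look something like this. `car_model` is the (now trained) model from the sketch above — in practice you would probably load it from disk instead:

```python
from tensorflow import keras

# Keep everything except the final, car-specific layer.
base = keras.Model(
    inputs=car_model.inputs,
    outputs=car_model.layers[-2].output,  # cut off the car-only head
)
```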
We can disconnect the bottom layers from the car model (the ones closer to the output) because we know they learned things specific to cars.
In their place, we will connect fresh new layers that we will train with truck images.
This way, we can reuse a lot of knowledge and focus on training a small portion of the model on what's specific to the new problem.
Because the hard part is already learned, we won't need nearly as much data, nor as much training time as the car model took!
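Continuing the sketch: freeze the reused layers, bolt on a fresh head, and train only the new part. `base` comes from the previous snippet, and `truck_images` / `truck_labels` are hypothetical arrays standing in for our (much smaller) truck dataset:

```python
from tensorflow import keras
from tensorflow.keras import layers

base.trainable = False  # keep the car model's knowledge intact

truck_model = keras.Sequential([
    base,                                    # everything we reuse
    layers.Dense(64, activation="relu"),     # fresh truck-specific layers
    layers.Dense(1, activation="sigmoid"),   # output: truck or no truck?
])

truck_model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
truck_model.fit(truck_images, truck_labels, epochs=5)
```

Setting `trainable = False` is the key move: it protects the reused knowledge while the new layers learn the truck-specific bits.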
Transfer learning is the reason you and I can build a state-of-the-art computer vision model without collecting thousands of pictures or spending a fortune on infrastructure.
We stand on the shoulders of giants!
Hey! We can do this shit together!
Stay tuned. More threads like this coming your way!