Supervised Learning is probably the most common class of problems that we have all heard about.
We start with a dataset of examples and their corresponding labels (or answers).
Then we teach a model the mapping between those examples and their corresponding labels.
[2 / 19]
The goal of these problems is for a model to generalize from the examples that it sees to later answer similar questions.
There are two main types of Supervised Learning:
▫️ Classification → We predict a class label
▫️ Regression → We predict a numerical label
[3 / 19]
A Supervised Learning Classification example:
Given a dataset with pictures of dogs and their corresponding breed, build a model that determines the breed of a new picture of a dog.
Notice how the goal is to predict a class label (the breed of the dog).
[4 / 19]
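To make the classification idea concrete, here is a minimal sketch. Real image classification needs far more machinery; the two numeric features (weight and height) and the scikit-learn nearest-neighbor model below are illustrative assumptions, not how you'd actually handle pictures.

```python
# Hypothetical toy features standing in for "pictures of dogs":
# [weight_kg, height_cm]. Labels are breed names (the class labels).
from sklearn.neighbors import KNeighborsClassifier

X = [[30, 55], [32, 58], [8, 25], [7, 23]]
y = ["labrador", "labrador", "dachshund", "dachshund"]

# Learn the mapping between examples and their labels.
model = KNeighborsClassifier(n_neighbors=1)
model.fit(X, y)

# Predict the class label for a new, unseen dog.
prediction = model.predict([[31, 56]])[0]
```

The model generalizes from the examples it saw to answer a similar question about a dog it has never seen.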
A Supervised Learning Regression example:
Given the characteristics of a group of houses and their market value, build a model that determines the value of a new house.
Notice how the goal is to predict a numerical label (the value of the house).
[5 / 19]
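The house example can be sketched the same way. The features (square meters and bedrooms) and the values are made up for illustration; the point is that the label we predict is now a number, not a class.

```python
# Hypothetical house features: [square_meters, bedrooms],
# with market values in thousands as the numerical labels.
from sklearn.linear_model import LinearRegression

X = [[50, 1], [80, 2], [120, 3], [200, 4]]
y = [100, 160, 240, 400]

model = LinearRegression()
model.fit(X, y)

# Predict the value of a new house from its characteristics.
value = model.predict([[100, 2]])[0]
```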
Unsupervised Learning is about finding relationships in data.
There are no labels involved in this process. We aren't directly teaching the algorithm through labeled examples. We are expecting it to learn from the data itself.
[6 / 19]
An example of Unsupervised Learning:
Given a list of prospective customers, group them into different segments so your marketing department can reach out to them.
Here the algorithm will determine different groups for your customers based on existing relationships.
[7 / 19]
Clustering is the most common example of Unsupervised Learning.
You have probably heard of k-Means as one of the most popular clustering algorithms. Here, "k" represents the number of clusters we want to find.
[8 / 19]
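A quick k-Means sketch tying this back to the customer-segmentation example. The customer features here (age, yearly spend) and the choice of k = 2 are illustrative assumptions; there are no labels anywhere, only the data itself.

```python
# Unlabeled, made-up customer features: [age, yearly_spend].
from sklearn.cluster import KMeans

X = [[22, 100], [25, 120], [60, 900], [62, 950]]

# "k" is the number of clusters we want to find: here, k = 2.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
```

The algorithm assigns each customer to one of the two groups based purely on the relationships it finds in the data.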
Reinforcement Learning is pretty cool:
An agent interacts with the environment collecting rewards. Based on those observations, the agent learns which actions will optimize the outcome (either maximizing rewards or minimizing penalties).
[9 / 19]
An example of Reinforcement Learning:
A robot learning its way from point A to point B in a warehouse by walking and exploring the different paths between the two locations.
Every time the robot gets stuck, it is penalized. When it reaches the goal, it is rewarded.
[10 / 19]
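The robot example can be sketched with tabular Q-learning on a tiny 1-D "warehouse corridor." Every number here (states, rewards, learning rate) is an illustrative assumption; the real problem would be far richer.

```python
# Q-learning sketch: an agent walks a corridor of states 0..4,
# gets a small penalty for each step (wandering), and a reward
# for reaching state 4 (point B). It learns which action to take.
import random

random.seed(0)

n_states, actions = 5, [-1, +1]          # move left or right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.1    # learning rate, discount, exploration

for _ in range(200):                     # training episodes
    s = 0
    while s != n_states - 1:
        if random.random() < epsilon:    # explore occasionally
            a = random.choice(actions)
        else:                            # otherwise act greedily
            a = max(actions, key=lambda x: Q[(s, x)])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else -0.1
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, x)] for x in actions) - Q[(s, a)])
        s = s2

# After training, the greedy action in every non-terminal state
# should be "move right" (toward the goal).
policy = {s: max(actions, key=lambda x: Q[(s, x)]) for s in range(n_states - 1)}
```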
But of course, AlphaZero (Chess) and AlphaGo (Go) are probably two of the most popular Reinforcement Learning implementations.
DeepMind is the company behind all of this research. Check out their website for some really cool articles.
[11 / 19]
In Semi-Supervised Learning, we get a lot of data but only a few labels. Sometimes, even the labels we have are not completely correct.
The goal is to build a solution that takes advantage of all the data we have, including the unlabeled data.
[12 / 19]
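One common way to use the unlabeled data is self-training (pseudo-labeling): train on the few labels you have, label the rest with the model's own predictions, then retrain on everything. This sketch uses made-up 1-D data and a nearest-neighbor model purely for illustration; it is one possible approach, not the only one.

```python
# Self-training sketch: lots of data, only a few labels.
from sklearn.neighbors import KNeighborsClassifier

X_labeled = [[0.0], [0.2], [4.0], [4.2]]
y_labeled = [0, 0, 1, 1]
X_unlabeled = [[0.1], [4.1], [3.9]]      # no labels for these

model = KNeighborsClassifier(n_neighbors=1)
model.fit(X_labeled, y_labeled)

# Pseudo-label the unlabeled examples, then retrain on all the data.
pseudo = list(model.predict(X_unlabeled))
model.fit(X_labeled + X_unlabeled, y_labeled + pseudo)

prediction = model.predict([[3.8]])[0]
```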
A few days ago I posted a thread about Active Learning, a semi-supervised approach.
Check it out if you are looking for more information about one possible way to approach this problem.
Let's talk about how you can build your first machine learning solution.
(And let's make sure we piss off half the industry in the process.)
Grab that ☕️, and let's go! 🧵
Contrary to popular belief, your first attempt at deploying machine learning should not use TensorFlow, PyTorch, Scikit-Learn, or any other fancy machine learning framework or library.
Your first solution should be a bunch of if-then-else conditions.
Regular ol' conditions make for a great MVP solution to a wannabe machine learning system.
Pair those conditions with a human, and you have your first system in production!
Conditions handle what they can. Humans handle the rest.
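A sketch of what that looks like in practice. The domain (support tickets), the rules, and the keywords are all illustrative assumptions; the pattern is the point.

```python
# Rules-first "MVP": conditions handle what they can,
# everything else is routed to a human.

def classify(ticket: str) -> str:
    text = ticket.lower()
    # Conditions handle what they can...
    if "refund" in text or "money back" in text:
        return "billing"
    if "password" in text or "can't log in" in text:
        return "account"
    # ...humans handle the rest.
    return "needs-human-review"

result = classify("I want my money back")
fallback = classify("The app feels slow lately")
```

When this system is in production, the human-reviewed cases double as labeled training data for the model you build later.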
I use Google Spreadsheets because it's in the cloud, and it's convenient for me. I don't have Microsoft Office installed, and as long as spreadsheets aren't crazy large, Google has what I need.
Here are the best 10 machine learning threads I posted in February.
They range from beginner-friendly content to deeper dives into specific machine learning concepts and techniques.
I'd love to hear which one is your favorite!
🧵👇
Having to pick only 10 threads is painful. I always struggle to decide what should stay out of the list.
This, however, is a great incentive when I'm writing the content: I have to compete against myself to make sure what I write ends up being part of the list!
[2 / 13]
[Thread 1]
An explanation about three of the most important metrics we use: accuracy, precision, and recall.
More specifically, this thread shows what happens when we focus on the wrong metric using an imbalanced classification problem.
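The core of that point fits in a few lines. With made-up numbers (95 negatives, 5 positives), a model that always predicts the majority class looks great on accuracy while completely failing on recall:

```python
# Imbalanced problem: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100                  # "always predict negative"

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall on the positive class: how many positives did we catch?
true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_pos / sum(t == 1 for t in y_true)
```

95% accuracy, 0% recall: the model never catches a single positive case.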
For the first time yesterday, I set up a project using a Development Container in Visual Studio Code and it immediately hit me:
✨ This is the way going forward! 🤯
If you haven't used this yet, here are some thoughts.
👇
The basic idea: you can run your entire development environment inside a container.
Every time you open your project, @code prepares and runs your container.
[2 / 7]
There are several advantages to this:
First of all, your entire team will run exactly the same environment, regardless of their preferred operating system, folder structure, existing libraries, etc.
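As a rough sketch, a minimal `.devcontainer/devcontainer.json` might look like this. The project name, base image, and extension ID are illustrative assumptions; swap in whatever your project actually needs.

```json
{
  "name": "my-project",
  "image": "mcr.microsoft.com/devcontainers/python:3.11",
  "customizations": {
    "vscode": {
      "extensions": ["ms-python.python"]
    }
  }
}
```

Commit this file to the repo, and everyone who opens the project gets the same container, regardless of their local setup.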