Santiago (@svpino)
12 Jan, 7 tweets, 3 min read
Here is an interesting problem:

You trained a model to classify pictures of 100 different animal species. It does a good job at it.

But when you show it a picture of a species that wasn't part of the training set, the results are obviously wrong.

How do you work around this?
One suggestion is to add a catch-all class for everything that isn't one of the 100 species. This is also known as a "negative" class, and it helps with this problem, assuming you are able to collect images of unknown objects.

I've also found the advantages of this negative class to diminish as more random objects are thrown in there.
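
A minimal sketch of what a "negative" class means at the labeling level. The class names, the `unknown` label, and both helper functions are made up for illustration; a real dataset would have 100 species plus the extra class.

```python
# Stand-in for the 100 known species (hypothetical names).
KNOWN_SPECIES = ["cat", "dog", "horse"]
NEGATIVE_CLASS = "unknown"  # hypothetical name for the catch-all class


def build_label_index(known_classes, negative_class=NEGATIVE_CLASS):
    """Map each class name to an integer id; the negative class gets the last id."""
    classes = list(known_classes) + [negative_class]
    return {name: idx for idx, name in enumerate(classes)}


def label_for(name, label_index, negative_class=NEGATIVE_CLASS):
    """Return the class id for a sample; anything outside the known set is negative."""
    return label_index.get(name, label_index[negative_class])


label_index = build_label_index(KNOWN_SPECIES)
dog_id = label_for("dog", label_index)          # a known species keeps its own id
toaster_id = label_for("toaster", label_index)  # an unrelated object falls into "unknown"
```

The model is then trained with one extra output, so it has somewhere to put images that match none of the species.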

It turns out that knowing what you don't know is a tough problem to solve in machine learning.

You'd expect the confidence score returned by the model to be very low for unknown objects. This is, unfortunately, not necessarily the case.

Intuitively, looking at the confidence score seems like a sensible approach. In practice, it is less so.

It's very common for a classification model to return a very high score for unknown classes, especially when these classes share common patterns.
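
One reason this happens, assuming the model ends in a softmax layer (an assumption, though a common setup for classifiers): softmax has to spread 100% of the probability across the known classes, so there is no way for it to answer "none of the above." The logits below are invented for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that always sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three known classes, produced for a picture
# of a species the model never saw. One score happens to be larger
# than the others, perhaps because of a shared visual pattern.
unknown_logits = [6.0, 1.0, 0.5]

probs = softmax(unknown_logits)
top_confidence = max(probs)  # close to 1 even though every class is wrong
```

The gap between the logits is what drives the score, not whether the input resembles anything in the training set.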

Remember, the problem is not whether you can improve your model. The problem is about understanding that your prediction is completely wrong even when you get a high confidence score from your model.

Unfortunately, this is not an "overfitting" issue.

The model has never been trained with the class that we are trying to infer. However, the model still offers an incorrect prediction.

This is a promising approach.

Notice that the heart of it is to properly detect out-of-distribution queries to our model.

Still not sure how this could be implemented in practice, but I'll dive deeper and report back any findings.
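
As a rough illustration of what detecting out-of-distribution queries could look like (this is one simple, well-known idea, not necessarily the approach referred to above): compare a query's feature vector to the centroid of each known class, and reject it when it is far from all of them. The 2-D features, the class names, and the threshold are all assumptions; in practice the features might come from an intermediate layer of the classifier, and the threshold would have to be tuned on held-out data.

```python
import math


def centroid(vectors):
    """Average a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_centroids(features_by_class):
    """Compute one centroid per known class."""
    return {label: centroid(vecs) for label, vecs in features_by_class.items()}


def predict_or_reject(feature, centroids, threshold):
    """Return the nearest class, or None if the query is too far from every class."""
    label, dist = min(
        ((name, distance(feature, c)) for name, c in centroids.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= threshold else None


# Toy 2-D features for two known classes (made-up numbers).
train = {
    "cat": [[0.0, 0.0], [0.2, 0.1], [0.1, 0.2]],
    "dog": [[5.0, 5.0], [5.1, 4.9], [4.9, 5.2]],
}
centroids = fit_centroids(train)

near_cat = predict_or_reject([0.1, 0.0], centroids, threshold=1.0)   # close to the cat cluster
far_away = predict_or_reject([20.0, -3.0], centroids, threshold=1.0)  # far from everything
```

Unlike the softmax score, the distance is not forced to favor one of the known classes, so a query can be flagged as unknown instead of being assigned a label.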
