Santiago
22 May, 21 tweets, 4 min read
I'll blow your mind with a technique you aren't using yet.

Sometimes, you want your system to do exactly the opposite of what your machine learning model thinks you should do.

Let me convince you. ↓
I'm going to start with a nice problem:

Imagine a model that looks at a picture of an electrical transformer and predicts whether it's about to break or not.

Don't worry about how the model does this. We are going to focus on the results instead.
There are 4 possible results for this model:

1. It predicts a bad unit as bad.
2. It predicts a bad unit as good.
3. It predicts a good unit as bad.
4. It predicts a good unit as good.

#2 and #3 are the mistakes the model makes.
Assuming we run 100 units through the model, we can organize the results in a matrix:

• The rows represent the "actual" condition of the transformer.

• The columns represent the "prediction" of the model.

We call this a "Confusion Matrix."
This is how we can read this confusion matrix:

• 60 bad units were predicted as bad.
• 3 bad units were predicted as good.
• 7 good units were predicted as bad.
• 30 good units were predicted as good.
We have a name for each one of these values:

1. True positives (TP) = 60
2. False negatives (FN) = 3
3. False positives (FP) = 7
4. True negatives (TN) = 30

In this example, "POSITIVE" corresponds to a bad unit.
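The four counts above can be tallied in a few lines of Python, using the numbers from this example (positive = bad unit):

```python
# Confusion-matrix counts from the example above.
# Convention: "positive" means a bad (about-to-break) unit.
tp = 60  # bad units predicted as bad
fn = 3   # bad units predicted as good (missed failures)
fp = 7   # good units predicted as bad (unnecessary inspections)
tn = 30  # good units predicted as good

total = tp + fn + fp + tn       # 100 units
accuracy = (tp + tn) / total    # 90 correct out of 100 -> 0.9
```

Note that accuracy alone (90%) hides the split between the two kinds of mistakes, which is exactly what matters next.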
Our model made 10 mistakes:

• 7 false positives
• 3 false negatives

At first glance, it may seem that we have a problem with the false positives.

But here is where things start getting interesting.
What happens if we need to send a technician to inspect every transformer that our model thinks is about to break?

Let's say that it takes 2 hours to inspect the transformer, and the technician charges $100/hr.

Every false positive will cost us $200 (2 hours x $100/hr)!
We have 7 false positives.

$200 x 7 = $1,400.

Out of the 100 samples we ran, we'll incur $1,400 in false-positive costs if we follow the model's recommendations.

Let's now take a look at the false negatives.
Imagine that if we miss a bad transformer, the unit breaks, so there will be an outage, and we'll need to scramble to restore service to that area.

The average cost of fixing this mess is $1,000.

We got 3 false negatives.

$1,000 x 3 = $3,000.
Following the model's recommendations, our total cost would be:

• False-positive costs: $1,400
• False-negative costs: $3,000

$1,400 + $3,000 = $4,400.
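The cost tally above is just a dot product of mistake counts and per-mistake costs (using the $200 and $1,000 figures assumed in this example):

```python
# Total cost of following the model's recommendations on the 100 samples.
FP_COST = 200    # false positive: 2 hours x $100/hr technician visit
FN_COST = 1_000  # false negative: average cost of an outage

fp, fn = 7, 3
total_cost = fp * FP_COST + fn * FN_COST  # 1,400 + 3,000 = 4,400
```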

Now it's time to do some magic and optimize this.
Before we get to the fun part, keep this in mind:

In this example, false negatives are more expensive than false positives.

We want our model to minimize false negatives.

How can we do this?
Here is a really cool approach.

Let's assume that our model assigns a probability to each prediction.

So whenever it says "this unit is bad...", it also returns "...with 65% probability."

We can use that!
Let's illustrate this with an example.

Our model returns:

• Prediction: Good.
• Probability: 55%.

The probability of the opposite outcome is 1 - 0.55 = 0.45. Therefore:

• Prediction: Bad
• Probability: 45%

Let's combine this with the costs.
We only care about the mistakes because if the model gets it right, the cost is $0.

Before trusting the result, we will compute the potential cost of each possible mistake:

• Model predicts "Good," but the unit is "Bad."
• Model predicts "Bad," but the unit is "Good."
Potential Mistake 1: The model predicts "Good," but the unit is "Bad."

This would be a False Negative.

Probability of this mistake: 45% (the probability that the unit is actually "Bad")
Cost: $1,000

Potential cost of returning "Good": 0.45 * $1,000 = $450.
Potential Mistake 2: The model predicts "Bad," but the unit is "Good."

This would be a False Positive.

Probability of this mistake: 55% (the probability that the unit is actually "Good")
Cost: $200

Potential cost of returning "Bad": 0.55 * $200 = $110.
Think about this:

• The model predicted that the unit is "Good."

• If we trust it, our expected cost of being wrong is $450.

• If we do the opposite, our expected cost of being wrong is only $110.

Our best bet, in this case, is to do the opposite of what the model says!
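Here is a minimal sketch of that decision layer, assuming the model exposes the probability that a unit is bad (the costs are the ones from this example; the function name is just for illustration):

```python
# A sketch of a cost-sensitive decision layer (example costs from this thread).
FP_COST = 200    # inspecting a unit that was actually fine (2 hrs x $100/hr)
FN_COST = 1_000  # missing a unit that was about to break

def decide(p_bad: float) -> str:
    """Return the label with the lower expected cost of being wrong.

    p_bad is the model's probability that the unit is bad.
    """
    expected_cost_good = p_bad * FN_COST        # answer "Good", unit is bad
    expected_cost_bad = (1 - p_bad) * FP_COST   # answer "Bad", unit is good
    return "Bad" if expected_cost_bad < expected_cost_good else "Good"
```

With the example above (p_bad = 0.45), the expected cost of answering "Good" is $450 versus $110 for "Bad", so the layer overrides the model. In general, it sends a technician whenever p_bad exceeds the break-even point $200 / ($200 + $1,000) ≈ 0.17.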
This technique has a name:

"Cost Sensitivity."

Adding a cost-sensitive layer on top of the mistakes of a model is a great way to analyze and optimize your predictions.

Isn't this beautiful?
Every week, I post 2 or 3 threads like this, breaking down machine learning concepts and giving you ideas on applying them in real-life situations.

If you find this helpful, follow me @svpino so we can do this thing together!


Did Thread Reader help you today?

Support us! We are indie developers!


This site is made by just two indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3/month or $30/year) and get exclusive features!

Become Premium

Too expensive? Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal Become our Patreon

Thank you for your support!

Follow Us on Twitter!

:(