Santiago
19 Apr, 12 tweets, 3 min read
Is 10 twice as bad as 5? Sometimes it is, but sometimes it's even worse.

This is the question I always ask myself when deciding how to penalize my models.

Read on for more details and a couple of examples:

↓ 1/11
When we are training a machine learning model, we need to compute how different our predictions are from the expected results.

For example, if we predict a house's price as $150,000, but the correct answer is $200,000, our "error" is $50,000.

↓ 2/11
There are multiple ways we can compute this error, but two common choices are:

• RMSE — Root Mean Squared Error
• MAE — Mean Absolute Error

Each of these has different properties that shine depending on the problem you want to solve.

↓ 3/11
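
If you want to see both metrics in code, here's a minimal NumPy sketch (this isn't from the thread — the function names are just for illustration):

import numpy as np

def rmse(y_true, y_pred):
    # Square the differences, average them, then take the square root.
    return np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

def mae(y_true, y_pred):
    # Average the absolute differences.
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))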
Remember that the optimizer uses this error to adjust the model, so we want to set up the right incentives for our model to learn.

Here, I'd like to focus on one important difference between these two metrics, so you can always remember how to use them.

↓ 4/11
When comparing RMSE and MAE, remember the "squared" portion of the first one.

It means that we are "squaring" the difference between the prediction and the expected value.

Why is this relevant?

↓ 5/11
Squaring the difference "penalizes" larger errors more heavily.

Let's go back to the introduction of this thread where I asked whether 10 was twice as bad as 5, and let's see what happens with one example.

↓ 6/11
If we expect a prediction of 2 but get 10, and we are using RMSE, the squared error will be (2 - 10)² = 64.

However, if we get a 5, the squared error will be (2 - 5)² = 9.

And 64 is much larger than twice 9 (which would only be 18)!

↓ 7/11
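
Here's a quick sketch to check those numbers yourself (the values mirror the example above):

target, far_prediction, close_prediction = 2, 10, 5

print((target - far_prediction) ** 2)    # 64 — squared error for the prediction of 10
print((target - close_prediction) ** 2)  # 9  — squared error for the prediction of 5
print(abs(target - far_prediction))      # 8  — absolute error for the prediction of 10
print(abs(target - close_prediction))    # 3  — absolute error for the prediction of 5

Under squaring, the prediction that's 8 away costs roughly seven times as much as the one that's 3 away. Under absolute error, it costs less than three times as much.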
MAE doesn't have the same property: the error increases proportionally with the difference between predictions and target values.

Understanding this is important to decide which metric is better for each case.

Let's see a couple of examples.

↓ 8/11
Predicting a house's price is a good example where being $10,000 off is simply twice as bad as being $5,000 off.

We don't necessarily need to rely on RMSE here, and MAE may be all we need.

↓ 9/11
But predicting the pressure of a tank may work differently: while 5 psi off may be within the expected range, 10 psi off may be a complete disaster.

Here, being 10 psi off is much worse than twice as bad as being 5 psi off, so RMSE may be a better approach.

↓ 10/11
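
If you're training a Keras model, for example, switching between the two is a one-line change. This is just a sketch (the architecture is a placeholder), and note that minimizing MSE also minimizes RMSE, since the square root doesn't change where the minimum is:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])

# "mse" penalizes large errors quadratically; swap in "mae"
# if you want errors to count proportionally instead.
model.compile(optimizer="adam", loss="mse", metrics=["mae"])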
Keep in mind that there is more nuance to MAE and RMSE.

The way they penalize differences is not the only criterion you should look at, but I have always found this a good, easy way to understand how they may help.

↓ 11/11
If you found this thread helpful, follow me @svpino for weekly posts touching on machine learning and how to use it to build real-life systems.

I always try to hide the math and come up with easy ways to explain boring concepts. If that's your thing, stay tuned for more.
