Prashant
Oct 20, 2021 · 14 tweets · 4 min read
MAPE (Mean Absolute Percentage Error) is another metric used to evaluate model performance in machine learning.

The formula looks a tad bit complex but it isn't.

Let's try to break it down. ↓
• To start with, Mean Absolute Error (MAE) is a metric that shows how far predictions are from the target values, on average.
• To get the absolute error for one observation, we subtract the predicted value from the target value and drop the sign.

• We can sum the absolute errors and divide by the number of observations to get the mean absolute error.
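The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the thread's own code: the function name and the sedan prices are made up for the example.

```python
def mean_absolute_error(targets, predictions):
    """Average of |target - prediction| over all observations."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    # Subtract the prediction from the target and drop the sign.
    absolute_errors = [abs(t - p) for t, p in zip(targets, predictions)]
    # Sum the absolute errors and divide by the number of observations.
    return sum(absolute_errors) / len(absolute_errors)


# Hypothetical sedan prices (in dollars) and model predictions.
sedan_prices = [50_000, 48_000, 52_000]
sedan_predictions = [45_000, 53_000, 57_000]

print(mean_absolute_error(sedan_prices, sedan_predictions))  # 5000.0
```

Every prediction here misses by exactly 5k, some too low and some too high, so the MAE comes out at 5k.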
• This is useful when, say, we are predicting the prices of sedans.

• Mean Absolute Error gives us the average error across all the sedans.

• If the MAE is 5k, our predictions on sedans are off by $5k on average. An individual miss could be in either direction, too low or too high.
• This gives us a good estimate, in terms of price, of how wrong we are.

• But what if our data also contains SUVs?

• This creates a problem: SUVs are generally priced higher, so the errors on SUVs could also be higher than the 5k we see on sedans.
• Now we can't really say that an error of around 8k on an SUV is worse than an error of 5k on a sedan.

• If the SUV costs $80,000 and the sedan costs $50,000, both errors are the same in percentage terms (10%).
• So in scenarios where the target ranges vary, calculating the percentage error can be useful.

• It gives us normalised, scale-independent error values, unlike absolute errors.
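To make the SUV vs sedan comparison concrete, here is a small sketch. The helper name and the two predictions (88,000 and 45,000) are assumptions chosen to reproduce the 8k and 5k errors from the example.

```python
def absolute_percentage_error(target, prediction):
    """Absolute error as a percentage of the target value."""
    return 100 * abs(target - prediction) / abs(target)


suv_ape = absolute_percentage_error(80_000, 88_000)    # off by 8k
sedan_ape = absolute_percentage_error(50_000, 45_000)  # off by 5k

print(suv_ape, sedan_ape)  # 10.0 10.0
```

The absolute errors differ (8k vs 5k), but relative to the price of each car they are the same size.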
• To convert absolute errors to absolute percentage errors, we divide each absolute error by its target value.

• Sum those ratios.

• Then divide by the number of observations to get the mean, and multiply by 100 to express it as a percentage.
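Written out as a formula, the steps above look like this. The symbols are my own labels, not the thread's: y_i is the target value, ŷ_i is the prediction and n is the number of observations.

```latex
\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|
```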
• This gives us the Mean Absolute Percentage Error and a fair scale for comparing errors.

• MAPE is widely used in Time-Series Forecasting.

• But MAPE also has its downsides:
1. One limitation is that MAPE behaves poorly when the target (actual) values are small.

If the values lie in a low range like (0, 2, 10, 5), even small absolute errors turn into large percentage errors, sometimes well over 100%.
2. MAPE is asymmetric: it penalises over-predictions more heavily than under-predictions.

For non-negative predictions below the target, the error is capped at 100%, while for over-predictions there is no limit: it could be 200%, 400%, 800% and so on...
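A quick sketch of that asymmetry, assuming a target of 100 and non-negative predictions. The helper and the numbers are illustrative only.

```python
def absolute_percentage_error(target, prediction):
    """Absolute error as a percentage of the target value."""
    return 100 * abs(target - prediction) / abs(target)


target = 100

# Under-predictions: the worst case (predicting 0) is a 100% error.
under = [absolute_percentage_error(target, p) for p in (75, 50, 0)]

# Over-predictions: the error keeps growing with the prediction.
over = [absolute_percentage_error(target, p) for p in (300, 500, 900)]

print(under)  # [25.0, 50.0, 100.0]
print(over)   # [200.0, 400.0, 800.0]
```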
3. There is the problem of dividing by zero.

If a target value is 0, the percentage error for that observation involves a division by zero, so MAPE cannot be calculated for it.
Here is a Python function for the Mean Absolute Percentage Error.
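The function in the original tweet was shared as an image, which did not survive the unroll. What follows is a minimal sketch of what such a function could look like, not the author's code. The name, the choice to raise an error on zero targets and the sample prices are all assumptions.

```python
def mean_absolute_percentage_error(targets, predictions):
    """Mean of |target - prediction| / |target|, as a percentage."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    if len(targets) == 0:
        raise ValueError("at least one observation is required")
    # Assumption: refuse zero targets instead of returning inf or nan.
    if any(t == 0 for t in targets):
        raise ValueError("MAPE is undefined when a target value is 0")

    # Divide each absolute error by its target value.
    ratios = [abs(t - p) / abs(t) for t, p in zip(targets, predictions)]
    # Take the mean and express it as a percentage.
    return 100 * sum(ratios) / len(ratios)


# Hypothetical SUV and sedan prices with their predictions.
car_prices = [80_000, 50_000]
car_predictions = [88_000, 45_000]

print(mean_absolute_percentage_error(car_prices, car_predictions))
```

Both cars are off by about 10% of their price, so the result is about 10. Raising an error on a zero target is only one option; depending on the use case, dropping those observations or switching to a different metric may be more appropriate.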
Overall, MAPE is a very useful and widely used metric.

Make sure to use it in conjunction with other metrics to avoid biased or mistaken conclusions.

Thanks for reading!
