Bias vs. variance in 13 charts.

πŸ§΅πŸ‘‡
Here is a sample 2-dimensional dataset.

(We're only showing the training data here.)

πŸ‘‡
The red line represents a model.

Let's call it "Model A."

A very simple model. Just a straight line.

πŸ‘‡
Here we have a much more complex model.

Let's call this one "Model B."

πŸ‘‡
Let's compute Model A's error.

We can add up all the yellow distances to get this error (we usually sum the squares of the distances so that positive and negative values don't cancel each other out.)

The error is high.
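Continuing the toy sketch from before, the sum of squared distances could look like this:

```python
# Sum of squared distances between the model's predictions and the targets.
# Squaring keeps positive and negative residuals from cancelling each other out.
def squared_error(model, x, y):
    residuals = y - model(x)      # the yellow distances
    return np.sum(residuals ** 2)

train_error_a = squared_error(model_a, x, y)  # large: the straight line misses the curve
```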
Let's now compute Model B's error.

Well, this error is pretty much zero!

Model B is very complex and fits the training data perfectly.
Apparently, Model B is much better than Model A, right?

Well, not necessarily. Let's introduce a validation dataset (blue dots) and compute the error of each model again.

Here is Model A's error. The model performs consistently badly on the validation data.

πŸ‘‡
When computing Model B's error, we realize that it is not zero anymore.

The model performs much worse on the validation data than on the training data.

This is not a consistent model. This is not good.

Neither Model A nor Model B is good.

πŸ‘‡
We say that Model A shows *high bias* and *low variance*.

A straight line doesn't have enough expressiveness to fit the data.

We say that this model *underfits* the data.

πŸ‘‡
We say that Model B shows *high variance* and *low bias*.

The model has too much expressiveness and "memorizes" the training data (instead of generalizing.)

We say that this model *overfits* the data.

πŸ‘‡
What we want is Model C.

A model that balances bias and variance properly, so it can generalize and make good predictions on unseen data.

Remember this: The bias vs. variance tradeoff is a constant battle you have to fight.

πŸ‘‡
Finally, let's look at how the bias-variance tradeoff plays out as we increase our model's complexity.

Let's represent the error of the model on the training set as we vary its complexity.

πŸ‘‡
Let's now do the same with the error on the validation data (a dataset that the model didn't see while training.)

See what happened here?

The more complex the model becomes, the worse it does on the validation set.

πŸ‘‡
Breaking it down into three sections:

▫️Green: We are underfitting.
▫️Yellow: We are overfitting.
▫️Orange: Just right.

We want to be in the middle section. That's the right balance of bias vs. variance.
A thread about preventing overfitting:

β€’ β€’ β€’

Missing some Tweet in this thread? You can try to force a refresh
γ€€

Keep Current with Santiago πŸŽƒ

Santiago πŸŽƒ Profile picture

Stay in touch and get notified when new unrolls are available from this author!

Read all threads

This Thread may be Removed Anytime!

PDF

Twitter may remove this content at anytime! Save it as PDF for later use!

Try unrolling a thread yourself!

how to unroll video
  1. Follow @ThreadReaderApp to mention us!

  2. From a Twitter thread mention us with a keyword "unroll"
@threadreaderapp unroll

Practice here first or read more on our help page!

