(We are only representing the training data here.)
👇
The red line represents a model.
Let's call it "Model A."
A very simple model. Just a straight line.
👇
Here we have a much more complex model.
Let's call this one "Model B."
👇
Let's compute Model A's error.
We can add up all the yellow distances to come up with this error (we usually sum the squares of the distances so positive and negative values don't cancel each other out.)
The error is high.
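In code, that error computation is just a sum of squared differences. Here is a minimal sketch in Python; the data points, slope, and intercept are made up for illustration:

```python
import numpy as np

# Made-up training points and a made-up straight line y = w*x + b
# standing in for "Model A."
x_train = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_train = np.array([1.2, 1.9, 3.7, 3.2, 5.1])

def sum_of_squared_errors(y_true, y_pred):
    # Squaring each distance keeps positive and negative errors from canceling out.
    return float(np.sum((y_true - y_pred) ** 2))

predictions = 0.9 * x_train + 0.3  # hypothetical slope and intercept
print(sum_of_squared_errors(y_train, predictions))
```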
Let's now compute Model B's error.
Well, this error is pretty much zero!
Model B is very complex and fits the training data perfectly.
Apparently, Model B is much better than Model A, right?
Well, not necessarily. Let's introduce a validation dataset (blue dots) and compute the error of each model again.
Here is Model A's error. The model performs consistently badly on the validation data.
👇
When computing Model B's error, we realize that it is not zero anymore.
The model performs much worse on the validation data than on the training data.
This is not a consistent model. This is not good.
Neither Model A nor B is good.
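Here is a rough sketch of this whole experiment using NumPy polynomial fits as stand-ins: degree 1 plays the role of Model A, and a very high degree plays the role of Model B. The data is synthetic; none of these numbers come from the thread's charts.

```python
import numpy as np

rng = np.random.default_rng(42)

def make_data(n):
    # Synthetic data: a gentle curve plus noise.
    x = np.linspace(0, 3, n)
    return x, np.sin(x) + rng.normal(scale=0.1, size=n)

x_train, y_train = make_data(10)
x_val, y_val = make_data(10)  # validation data the models never saw while fitting

def squared_error(coeffs, x, y):
    return float(np.sum((np.polyval(coeffs, x) - y) ** 2))

model_a = np.polyfit(x_train, y_train, deg=1)  # a straight line (underfits)
model_b = np.polyfit(x_train, y_train, deg=9)  # hits every training point (overfits)

for name, model in [("Model A", model_a), ("Model B", model_b)]:
    print(name,
          "train:", round(squared_error(model, x_train, y_train), 4),
          "validation:", round(squared_error(model, x_val, y_val), 4))
```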
👇
We say that Model A shows *high bias* and *low variance*.
A straight line doesn't have enough expressiveness to fit the data.
We say that this model *underfits* the data.
👇
We say that Model B shows *high variance* and *low bias*.
The model has too much expressiveness and "memorizes" the training data (instead of generalizing.)
We say that this model *overfits* the data.
👇
What we want is Model C.
A model that properly balances bias and variance in a way that's able to generalize and give good predictions for unseen data.
Remember this: The bias vs. variance tradeoff is a constant battle you have to fight.
👇
Finally, let's look at how the bias vs. variance tradeoff plays out as we increase our models' complexity.
Let's represent the error of the model on the training set as we vary its complexity.
👇
Let's now do the same with the error on the validation data (a dataset that the model didn't see while training.)
See what happened here?
Past a certain point, the more complex the model becomes, the worse it does on the validation set.
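You can sketch that sweep with the same kind of synthetic setup as before, using polynomial degree as a stand-in for "complexity" (the exact numbers are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 3, 15)
y_train = np.sin(x_train) + rng.normal(scale=0.1, size=x_train.size)
x_val = np.linspace(0.1, 2.9, 15)
y_val = np.sin(x_val) + rng.normal(scale=0.1, size=x_val.size)

for degree in range(1, 13):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_err = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    val_err = float(np.mean((np.polyval(coeffs, x_val) - y_val) ** 2))
    # Training error keeps shrinking; validation error eventually turns back up.
    print(f"degree {degree:2d}  train {train_err:.4f}  validation {val_err:.4f}")
```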
👇
Breaking it down into three sections:
⚫️ Green: We are underfitting.
⚫️ Yellow: We are overfitting.
⚫️ Orange: Just right.
We want to be in the middle section. That's the right balance of bias vs. variance.
Here are 7 ways you can deal with overfitting in Deep Learning neural networks.
🧵👇
A quick reminder:
When your model makes good predictions on the same data that was used to train it but shows poor results with data it hasn't seen before, we say that the model is overfitting.
The model in the picture is overfitting.
👇
1️⃣ Train your model on more data
The more data you feed the model, the more likely it will start generalizing (instead of memorizing the training set.)
Look at the relationship between dataset size and error.
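The attached chart isn't reproduced here, but you can get a feel for the effect with a sketch like this: the same flexible model (an arbitrary degree-8 polynomial on synthetic data) trained on bigger and bigger samples:

```python
import numpy as np

rng = np.random.default_rng(1)

def noisy_curve(n):
    x = rng.uniform(0, 3, n)
    return x, np.sin(x) + rng.normal(scale=0.2, size=n)

x_val, y_val = noisy_curve(200)  # held-out data for measuring generalization

for n in (10, 20, 50, 100, 500):
    x_train, y_train = noisy_curve(n)
    coeffs = np.polyfit(x_train, y_train, deg=8)
    val_err = float(np.mean((np.polyval(coeffs, x_val) - y_val) ** 2))
    # More training data -> less room to memorize noise -> better generalization.
    print(f"{n:4d} training points -> validation error {val_err:.4f}")
```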
This is a thread about one of the most powerful tools that makes it possible for knuckleheads like me to achieve state-of-the-art Deep Learning results on our laptops.
🧵👇
Deep Learning is all about "Deep" Neural Networks.
"Deep" means a lot of complexity. You can translate this to "We Need Very Complex Neural Networks." See the attached example.
The more complex a network is, the slower it is to train, and the more data we need to train it.
👇
To get state-of-the-art results when classifying images, we can use a network like ResNet50, for example.
It takes around 14 days to train this network with the "imagenet" dataset (1,300,000+ images.)
14 days!
That's assuming that you have a decent (very expensive) GPU.
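For reference, here is roughly what grabbing that network looks like in Keras with the weights already trained on "imagenet" (assuming TensorFlow is installed). This is a sketch of the setup, not the author's exact code:

```python
from tensorflow.keras.applications import ResNet50

# Download ResNet50 with weights pretrained on the "imagenet" dataset,
# instead of spending ~14 days (and an expensive GPU) training it from scratch.
model = ResNet50(weights="imagenet")
model.summary()
```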
When I heard about Duck Typing for the first time, I had to laugh.
But Python 🐍 has surprised me before, and this time was no exception.
This is another short thread 🧵 that will change the way you write code.
👇
Here is the idea behind Duck Typing:
β«οΈIf it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.
Taking this to Python's world, the functionality of an object is more important than its type. If the object quacks, then it's a duck.
👇
Duck Typing is possible in dynamic languages (Hello, JavaScript fans!)
Look at the attached example. Notice how "Playground" doesn't care about the specific type of the supplied item. Instead, it assumes that the item supports the bounce() method.
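The attached code isn't reproduced here, but a sketch in that spirit could look like this. Ball and Kangaroo are made-up classes; Playground and bounce() are the names the thread mentions:

```python
class Ball:
    def bounce(self):
        return "The ball bounces!"

class Kangaroo:
    def bounce(self):
        return "The kangaroo hops!"

class Playground:
    def play(self, item):
        # No isinstance() checks: anything that supports bounce() is welcome.
        print(item.bounce())

playground = Playground()
playground.play(Ball())      # works
playground.play(Kangaroo())  # also works; it "quacks" the right way
```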