◆3 things from me
◆2 things from other people
◆2 from the community
🧵🧵
This week, I wrote about what to consider when choosing a machine learning model for a particular problem, early stopping (one of the most powerful regularization techniques), and what to know about the learning rate.
Below are the corresponding threads!
1. What to know about a model selection process...
1. @rasbt shared 170 deep learning videos that he recorded in 2021. They are not from this week, but I only got to know about them this week. Thanks to @alfcnz for retweeting them...
1. @Nvidia GTC 2021: Lots of updates from NVIDIA, which is on a mission to design powerful deep learning accelerators.
I watched the keynote. It's great. There is a lot of exciting news and many updates, from Omniverse and NVIDIA Drive, to broader access to pre-trained vision and language models, to Jarvis, an accurate conversational AI bot...
Give it a watch! It's a great event!!
2. Gradients are not all you need
This paper from @Luke_Metz discusses the potential chaos of using gradient-based optimization algorithms. Most optimizers compute the gradients of the loss with respect to the weights in order to minimize the loss function.
That usually works (though there is no theoretical guarantee that it always will).
The paper highlights issues with gradients and argues that, in some settings, they are not all you need. Thanks to @rasbt for sharing this.
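As a quick refresher on what that gradient-based update looks like in practice, here is a minimal sketch of plain gradient descent on a toy quadratic loss. The loss function, starting point, and learning rate are all illustrative assumptions of mine, not anything from the paper.

```python
# Toy example: minimize L(w) = (w - 3)^2 with plain gradient descent.
# The loss, its gradient, and the learning rate are illustrative assumptions.
def loss(w):
    return (w - 3.0) ** 2

def grad(w):
    return 2.0 * (w - 3.0)  # dL/dw

w = 0.0            # initial weight
learning_rate = 0.1

for step in range(50):
    w -= learning_rate * grad(w)  # step against the gradient to reduce the loss

print(w)  # converges toward 3.0, the minimizer of the loss
```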
The illustration below shows early stopping, one of the simplest and most effective regularization techniques used in training neural networks.
A thread on the idea behind early stopping, why it works, and why you should always use it...🧵
Usually, during training, the training loss will decrease gradually, and if everything goes well on the validation side, the validation loss will decrease too.
When the validation loss hits a local minimum, it will start to increase again, which is a signal of overfitting.
How can we stop the training right before the validation loss rises again, or before the validation accuracy starts decreasing?
That's the motivation for early stopping.
With early stopping, we can stop the training when there is no improvement in the validation metrics, for example when the validation loss has not improved for a number of consecutive epochs.
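If you happen to train with Keras, early stopping is available as a built-in callback. Here is a minimal sketch: the data, model, and patience value are placeholder assumptions, and only the `EarlyStopping` callback itself is the point.

```python
import numpy as np
import tensorflow as tf

# Placeholder data and model, used only to make the example runnable.
x_train, y_train = np.random.rand(1000, 20), np.random.randint(0, 2, size=(1000,))
x_val, y_val = np.random.rand(200, 20), np.random.randint(0, 2, size=(200,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Stop training once the validation loss has not improved for 5 consecutive
# epochs, and roll back to the weights from the best epoch.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,
    restore_best_weights=True,
)

model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=100,  # an upper bound; early stopping will usually end training sooner
    callbacks=[early_stopping],
)
```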
◆3 threads from me
◆3 threads from others
◆2 news items from the ML community
3 POSTS FROM ME
This week, I explained Tom Mitchell's classic definition of machine learning, discussed why it is hard to train neural networks, and shared some recipes for training and debugging neural networks.
Here is the meaning of Tom's definition of machine learning
Neural networks are hard to train. The deeper they get, the more likely they are to suffer from unstable gradients.
A thread 🧵🧵
Gradients can either explode or vanish, and neither of those is a good thing for the training of our network.
The vanishing gradient problem results in the network taking too long to train (learning becomes very slow), while the exploding gradient problem makes the weight updates so large that training becomes unstable.
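To get an intuition for why depth causes this, here is a tiny sketch: backpropagation multiplies gradient factors layer by layer, so a per-layer factor slightly below 1 shrinks the gradient toward zero while a factor slightly above 1 blows it up. The factors 0.9 and 1.1 below are arbitrary assumptions chosen only to show the effect.

```python
# Toy illustration: repeatedly multiplying a gradient by a per-layer factor,
# roughly what backpropagation does through a deep stack of layers.
depth = 100
grad_vanishing = 1.0
grad_exploding = 1.0

for _ in range(depth):
    grad_vanishing *= 0.9   # factor < 1 at every layer -> gradient shrinks
    grad_exploding *= 1.1   # factor > 1 at every layer -> gradient grows

print(f"after {depth} layers, factor 0.9 gives {grad_vanishing:.2e}")  # ~2.66e-05
print(f"after {depth} layers, factor 1.1 gives {grad_exploding:.2e}")  # ~1.38e+04
```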