When designing your neural network, you first want to focus on your training loss.
Overfit the heck out of your data and get that loss as low as you can!
Only after that should you start regularizing and focusing on your validation loss.
☕️🧵👇
Always try to overfit first.
Getting here is a good thing: you know your model is working as it should!
If you can't get your model to overfit, there's probably something wrong with your configuration.
How do you overfit? Pick a model that's large enough for the data.
Large enough means it has enough parameters (layers, filters, nodes) to memorize your data.
You can also try to overfit just a portion of your dataset. Fewer samples will be easier to overfit.
A quick summary of some of the things you can try to get your model to overfit (sketch after the list):
▫️ Try a more complex model
▫️ Decrease the amount of data
▫️ Don't use any regularization
▫️ Don't use data augmentation
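Here's a minimal sketch of that exercise in Keras. Everything in it (the data shapes, layer sizes, and epoch count) is a made-up placeholder; the point is the shape of the exercise: oversized model, small slice of data, zero regularization.

```python
import numpy as np
import tensorflow as tf

# Hypothetical placeholder data: 1,000 samples, 20 features, 10 classes.
x_train = np.random.rand(1000, 20).astype("float32")
y_train = np.random.randint(0, 10, size=(1000,))

# Fewer samples are easier to memorize, so start with a small slice.
x_small, y_small = x_train[:128], y_train[:128]

# Deliberately oversized model: no dropout, no weight decay, nothing.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Train long enough to memorize the slice. Training accuracy should
# approach 100%; if it doesn't, something in the setup is broken.
model.fit(x_small, y_small, epochs=200, verbose=0)
print(model.evaluate(x_small, y_small, verbose=0))
```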
At this point, you should be laser-focused on getting that training loss down.
Is it not going down? Keep looking, because something is probably wrong. Better to fix it now than suffer infinite pain later.
Alright, so as soon as your training loss is as low as it could possibly get, it's time to look at your validation loss.
You want to trade off some training loss to decrease your validation loss.
Are you only using a portion of the data? Time to use it all.
Is your model too large? Make it smaller (get rid of some filters, layers, and/or nodes).
Start regularizing it step by step until you get where you want.
Make sure you don't dump the entire book of regularization techniques all at once! Small steps.
A quick summary of the things you can try to get rid of the overfitting (see the sketch after this list):
▫️ Simplify the model (fewer parameters)
▫️ Start using the entire dataset
▫️ Add data augmentation
▫️ Add L1/L2 regularization
▫️ Add some dropout
▫️ Use early stopping
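A sketch of what that progression could look like in Keras, reusing the placeholder data from the earlier snippet. The dropout rate, weight-decay strength, and patience are assumptions you'd tune one step at a time, not recommended values.

```python
import numpy as np
import tensorflow as tf

# Same placeholder data as before: 1,000 samples, 20 features, 10 classes.
x_train = np.random.rand(1000, 20).astype("float32")
y_train = np.random.randint(0, 10, size=(1000,))

# Step 1: a simpler model, with L2 weight decay on the hidden layer.
# Step 2: some dropout.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(
        128,
        activation="relu",
        kernel_regularizer=tf.keras.regularizers.l2(1e-4),
    ),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Step 3: early stopping. Quit when the validation loss stops
# improving, and roll back to the best weights seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10, restore_best_weights=True
)

# Back to the entire dataset, with a validation split to watch.
model.fit(
    x_train,
    y_train,
    validation_split=0.2,
    epochs=200,
    callbacks=[early_stop],
    verbose=0,
)
```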
In summary, this dance is a three-step approach:
1. Go big and overfit.
2. Pare it back until it's good enough.
3. Follow me so you don't miss the tricks.
🦕
Anything that regularizes your model is fair game at this point. The only caveat: go step by step, and never throw the kitchen sink at it all at once.
Let's talk about how you can build your first machine learning solution.
(And let's make sure we piss off half the industry in the process.)
Grab that ☕️, and let's go! 🧵
Contrary to popular belief, your first attempt at deploying machine learning should not use TensorFlow, PyTorch, Scikit-Learn, or any other fancy machine learning framework or library.
Your first solution should be a bunch of if-then-else conditions.
Regular ol' conditions make for a great MVP of a wannabe machine learning system.
Pair those conditions with a human, and you have your first system in production!
Conditions handle what they can. Humans handle the rest.
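Here's a sketch of that pattern. The domain (support tickets), the rules, and the human_review_queue are all hypothetical, made up to illustrate the shape of the system:

```python
from typing import Optional

# Hypothetical example: triaging support tickets into "refund" or
# "bug". Conditions handle the obvious cases; humans handle the rest.

human_review_queue = []

def classify_ticket(text: str) -> Optional[str]:
    """Return a label when the rules apply, or None when they don't."""
    lowered = text.lower()
    if "refund" in lowered or "money back" in lowered:
        return "refund"
    if "crash" in lowered or "error" in lowered:
        return "bug"
    return None  # The rules can't decide this one.

def handle_ticket(text: str) -> str:
    label = classify_ticket(text)
    if label is None:
        # Route anything the rules can't handle to a person.
        human_review_queue.append(text)
        return "sent-to-human"
    return label

print(handle_ticket("I want my money back"))        # refund
print(handle_ticket("The app crashes on login"))    # bug
print(handle_ticket("How do I change my avatar?"))  # sent-to-human
```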
I use Google Sheets because it's in the cloud and convenient for me. I don't have Microsoft Office installed, and as long as my spreadsheets aren't crazy large, Google Sheets has everything I need.