Santiago
4 Mar, 11 tweets, 3 min read
When designing your neural network, you first want to focus on your training loss.

Overfit the heck out of your data and get that loss as low as you can!

Only after that should you start regularizing and focusing on your validation loss.

☕️🧵👇
Always try to overfit first.

Getting here is a good thing: you know your model is working as it should!

If you can't get your model to overfit, there's probably something wrong with your configuration.
How do you overfit? Pick a model that's large enough for the data.

Large enough means it has enough parameters (layers, filters, nodes) to memorize your data.

You can also try to overfit a portion of your dataset. Fewer samples will be easier to overfit.
A quick summary of some of the things you can try to get your model to overfit:

▫️ Try a more complex model
▫️ Decrease the amount of data
▫️ Don't use any regularization
▫️ Don't use data augmentation
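The checklist above can be seen in miniature with plain numpy (a toy sketch, not the author's code): a degree-(n-1) polynomial has one parameter per training point, which is exactly "large enough to memorize."

```python
import numpy as np

# "Large enough to memorize": as many parameters as samples, no
# regularization, no augmentation -> training loss collapses to ~zero.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 8)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(8)  # noisy labels

coeffs = np.polyfit(x, y, deg=len(x) - 1)  # one parameter per sample
train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
print(f"training MSE: {train_mse:.2e}")  # essentially zero: memorized
```

If the loss here refused to go to zero, that would be the "something wrong with your configuration" signal from the thread.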
At this point, you should be laser-focused on getting that training loss down.

Is it not going down? Keep looking because there may be something wrong. Better fix it now or suffer infinite pain later.
Alright, so as soon as your training loss is as low as it could possibly get, it's time to look at your validation loss.

You want to trade off some training loss to decrease your validation loss.

Are you only using a portion of the data? Time to use it all.
Is your model too large? Make it smaller (get rid of some filters, layers, and/or nodes).

Start regularizing it step by step until you get where you want.

Make sure you don't dump the entire book of regularization techniques all at once! Small steps.
A quick summary of the things you can try to get rid of the overfitting:

▫️ Simplify the model (fewer parameters)
▫️ Start using the entire dataset
▫️ Add data augmentation
▫️ L2 and L1 regularization
▫️ Add some Dropout
▫️ Use Early stopping
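One item from that list, L2 regularization, is easy to watch in action (a numpy sketch under invented numbers, not the thread's code): as the penalty grows, you deliberately give back training loss.

```python
import numpy as np

# Ridge regression on over-parameterized polynomial features.
# Increasing the L2 penalty trades training loss away, step by step.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 8)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(8)
X = np.vander(x, 8)  # degree-7 features: enough capacity to memorize

def ridge_fit(X, y, lam):
    # Closed-form L2-regularized least squares: (X'X + lam*I)^-1 X'y
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

mse = {}
for lam in (0.0, 1e-3, 1e-1):
    w = ridge_fit(X, y, lam)
    mse[lam] = np.mean((X @ w - y) ** 2)
    print(f"lambda={lam:g}  training MSE={mse[lam]:.4f}")
```

That's the "Change. Measure. Repeat." loop: bump lambda a little, look at both losses, decide the next step.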
In summary, this dance is a three-step approach:

1. Go big, overfit.
2. Pare it back until it's good enough.
3. Follow me, so you don't miss the tricks.

🦕
Absolutely.

Anything that regularizes your model is fair at this point. The only caveat is to go step by step, and never try to throw the kitchen sink at it at once.

Change. Measure. Change. Measure. Repeat.

When you are trying to overfit, data augmentation is the opposite of what you want.

It "augments" your data with more variants that will make it harder for the model to overfit.

Turn it off, and only back on when you need to get rid of overfitting.
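You can see why augmentation fights overfitting with the same toy setup (a numpy sketch, numbers invented): keep the capacity fixed, add jittered variants of each sample, and the training loss can no longer collapse.

```python
import numpy as np

# Fixed capacity, with and without "augmented" jittered variants.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 8)
y = np.sin(2 * np.pi * x)

def train_mse(x, y):
    coeffs = np.polyfit(x, y, deg=7)  # fixed capacity: 8 parameters
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

plain_mse = train_mse(x, y)  # 8 points, 8 parameters: memorized

# "Augment": a jittered variant of every point, same model capacity.
x_aug = np.concatenate([x, x + 0.02 * rng.standard_normal(8)])
y_aug = np.concatenate([y, y + 0.10 * rng.standard_normal(8)])
aug_mse = train_mse(x_aug, y_aug)

print(f"no augmentation: {plain_mse:.2e}, with augmentation: {aug_mse:.2e}")
```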

More from @svpino

3 Mar
The more you grow, the more you realize that the language you use doesn't matter at all.

JavaScript, Python, or whatever you use represents exactly $0 of your take-home pay every month.

The value you produce using these languages is the remaining 100%.
I’ve never had a conversation with a client that cared about a specific language, other than those wanting to build on top of an existing codebase.
Every line of code is a liability.

Corollary: The best code is the one nobody wrote.
3 Mar
The two questions related to neural networks that I hear most often:

▫️ How many layers should I use?
▫️ How many neurons per layer should I use?

There are some rules of thumb that I'll share with you after you get your ☕️ ready.

🧵👇
First, let's get this out of the way:

A neural network with a single hidden layer can model any function regardless of how complex it is (assuming it has enough neurons.)

Check the "Universal Approximation Theorem" if you don't believe me.
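A tiny hand-built case makes the theorem concrete (a sketch, not from the thread): a single hidden ReLU layer with just two units computes |x| exactly, because |x| = relu(x) + relu(-x).

```python
import numpy as np

# A single-hidden-layer network computing |x| exactly, no training:
# |x| = relu(x) + relu(-x). Two hidden units are enough here.
def relu(z):
    return np.maximum(z, 0.0)

W1 = np.array([[1.0], [-1.0]])  # hidden layer: 2 units, 1 input
w2 = np.array([1.0, 1.0])       # output layer weights

x = np.linspace(-1.0, 1.0, 101).reshape(-1, 1)
y_hat = relu(x @ W1.T) @ w2

print(np.allclose(y_hat, np.abs(x).ravel()))  # True
```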

So, if we can do it all with a single layer, why bother adding more layers?

Well, it turns out that a neural network with a single layer will overfit really quickly.

The more neurons you add to it, the better it will become at memorizing stuff.

That is bad news.

2 Mar
Let's talk about how you can build your first machine learning solution.

(And let's make sure we piss off half the industry in the process.)

Grab that ☕️, and let's go! 🧵
Contrary to popular belief, your first attempt at deploying machine learning should not use TensorFlow, PyTorch, Scikit-Learn, or any other fancy machine learning framework or library.

Your first solution should be a bunch of if-then-else conditions.
Regular, ol' conditions make for a great MVP solution to a machine learning wannabe system.

Pair those conditions with a human, and you have your first system in production!

Conditions handle what they can. Humans handle the rest.
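A sketch of that split (function name, keywords, and categories are invented for illustration): plain conditions handle the confident cases, and everything else gets routed to a human.

```python
# Rule-based "model v0": conditions handle what they can,
# humans handle the rest.
def classify_ticket(text: str) -> str:
    text = text.lower()
    if "refund" in text or "charged" in text:
        return "billing"
    if "password" in text or "login" in text:
        return "account"
    return "needs_human_review"  # humans handle the rest

print(classify_ticket("I was charged twice"))         # billing
print(classify_ticket("tell me about your roadmap"))  # needs_human_review
```

Every ticket the rules can't handle is also a labeled example for the real model you build later.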
2 Mar
If you want to start with Machine Learning and need some guidance, I want to give you access to my entire course for $10. Today only.

And if you don't like it, you pay $0. But I promise you'll love it!

Thanks to the 100+ of you who already bought it!

👉 gumroad.com/l/kBjbC/50000
If you can’t afford this, reply below explaining how you think this will help you. I’ll give away 10 copies for free.
Thanks to everyone who has taken advantage of this offer so far!

There are still a few more hours left.

If starting with machine learning feels overwhelming, then this is for you.

gumroad.com/l/kBjbC/50000
2 Mar
Some hard skills that I use every day as a Machine Learning Engineer:

▫️ A whole lot of Python
▫️ TensorFlow, Keras, Scikit-learn
▫️ AWS SageMaker
▫️ Jupyter
▫️ SQL
▫️ Probabilities, Statistics
▫️ Google Spreadsheets (seriously!)
▫️ Software Engineering
General notions of linear algebra are useful, especially when you want to understand how certain things happen behind the scenes.

That being said, I don't consider myself an expert and it's not part of the day-to-day.

You could also use Excel.

I use Google Spreadsheets because it's in the cloud, and it's convenient for me. I don't have Microsoft Office installed, and as long as spreadsheets aren't crazy large, Google has what I need.

1 Mar
Let's talk about learning problems in machine learning:

▫️ Supervised Learning
▫️ Unsupervised Learning
▫️ Reinforcement Learning

And some hybrid approaches:

▫️ Semi-Supervised Learning
▫️ Self-Supervised Learning
▫️ Multi-Instance Learning

Grab your ☕️, and let's do this👇
Supervised Learning is probably the most common class of problems that we have all heard about.

We start with a dataset of examples and their corresponding labels (or answers.)

Then we teach a model the mapping between those examples and the corresponding label.

[2 / 19]
The goal of these problems is for a model to generalize from the examples that it sees to later answer similar questions.

There are two main types of Supervised Learning:

▫️ Classification → We predict a class label
▫️ Regression → We predict a numerical label

[3 / 19]
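The two types differ only in the kind of label (a toy illustration, data invented): the same inputs can carry a class label or a numerical one, and even a 1-nearest-neighbor "model" handles both by copying the closest example's label.

```python
# Same inputs, two kinds of supervised labels.
houses = [(50, 1), (80, 2), (120, 3)]  # (square meters, bedrooms)

class_labels = ["small", "small", "large"]   # classification target
price_labels = [100_000, 160_000, 250_000]   # regression target

def nearest_label(query, examples, labels):
    # 1-nearest-neighbor: return the label of the closest example.
    dists = [sum((a - b) ** 2 for a, b in zip(query, ex)) for ex in examples]
    return labels[dists.index(min(dists))]

print(nearest_label((85, 2), houses, class_labels))  # small
print(nearest_label((85, 2), houses, price_labels))  # 160000
```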
