Santiago
3 Mar, 12 tweets, 3 min read
The two questions related to neural networks that I hear most often:

▫️ How many layers should I use?
▫️ How many neurons per layer should I use?

There are some rules of thumb that I'll share with you after you get your ☕️ ready.

🧵👇
First, let's get this out of the way:

A neural network with a single hidden layer can approximate any continuous function, no matter how complex it is (assuming it has enough neurons.)

Check the "Universal Approximation Theorem" if you don't believe me.

So, if we can do it all with a single layer, why bother adding more layers?

Well, it turns out that a neural network with a single layer will overfit really quickly.

The more neurons you add to it, the better it will become at memorizing stuff.

That is bad news.

Do you know one way to fight this overfitting? Adding more data. A ton of it.

Do you think you have that much data? I didn't think so.

Theoretically, yes, it's possible to learn everything with a single layer. Practically, a bad idea.

So let's add more layers instead.

Deeper networks learn features at different levels of abstraction. This way, they can learn richer structures.

Let's take images of bunnies 🐇 and a network with 3 layers and see what may happen (keep reading...)

Here is the hypothetical learning that could take place:

▫️ The 1st layer learns to recognize edges.

▫️ The 2nd layer learns shapes using the edges learned by the 1st layer.

▫️ The 3rd layer learns to recognize ears and noses using the shapes from the 2nd layer.
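
A minimal sketch of what such a 3-hidden-layer network could look like in Keras (the image size and layer widths are made up for illustration):

```python
# Hypothetical 3-hidden-layer classifier for small bunny images.
# Sizes are arbitrary; the point is the depth, not the exact numbers.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(64, 64, 3)),   # 64x64 RGB images (assumed)
    tf.keras.layers.Dense(128, activation="relu"),      # 1st layer: low-level patterns (edges)
    tf.keras.layers.Dense(64, activation="relu"),       # 2nd layer: combinations of them (shapes)
    tf.keras.layers.Dense(32, activation="relu"),       # 3rd layer: higher-level parts (ears, noses)
    tf.keras.layers.Dense(1, activation="sigmoid"),     # bunny / not bunny
])
model.summary()
```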

In summary, more layers help a network generalize, as opposed to memorizing.

And there's more: because this structure is more efficient at learning, we don't need a huge number of neurons.

That's good news!

Another small detail:

The more parameters we have, the slower it will be to train the network.

A deeper network will help us keep the number of parameters manageable.

So going back to your original questions, here is a rule of thumb:

▫️ Prioritize deep and narrow networks over wide and shallow.

In other words: More layers with fewer neurons over fewer layers with more neurons.

But of course, it ultimately depends on your problem.
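
To put rough numbers on it, here's a small sketch (plain Python, arbitrary layer sizes) comparing the parameter count of a wide, shallow stack against a deeper, narrower one:

```python
# Rough comparison (layer sizes are arbitrary): wide-and-shallow vs deep-and-narrow,
# counting weights + biases for fully connected layers.

def param_count(layer_sizes, n_inputs=100):
    total, prev = 0, n_inputs
    for n in layer_sizes:
        total += prev * n + n          # weights + biases for this layer
        prev = n
    return total

print(param_count([2048, 1]))          # 1 wide hidden layer    -> 208,897 parameters
print(param_count([64, 64, 64, 1]))    # 3 narrow hidden layers -> 14,849 parameters
```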

If these threads help you find sanity among so many weights and biases, give me a follow and retweet this thing.

Do you know how many people are still asking the same questions? Let's settle it for them!

🦕
Parameters refer to the weights and biases of the network. The more neurons we have, the more parameters we will have to tune.

Imagine that we have 10 inputs, a hidden layer with 16 nodes (neurons), and an output layer with 3 nodes.

Here is how we compute the number of parameters:

Weights: 10 * 16 + 16 * 3 = 208
Biases: 16 + 3 = 19

Parameters: 208 + 19 = 227
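
If you'd rather let Keras do the arithmetic, a quick sketch of that exact 10 → 16 → 3 network reports the same 227 parameters:

```python
# Sketch (assumes TensorFlow/Keras): 10 inputs, a 16-node hidden layer, a 3-node output layer.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(10,)),  # 10*16 + 16 = 176 params
    tf.keras.layers.Dense(3),                                         # 16*3  + 3  = 51 params
])
model.summary()   # Total params: 227
```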

• • •


More from @svpino

4 Mar
When designing your neural network, you first want to focus on your training loss.

Overfit the heck out of your data and get that loss as low as you can!

Only after that should you start regularizing and focusing on your validation loss.

☕️🧵👇
Always try to overfit first.

Getting here is a good thing: you know your model is working as it should!

If you can't get your model to overfit, there's probably something wrong with your configuration.
How do you overfit? Pick a model that's large enough for the data.

Large enough means it has enough parameters (layers, filters, nodes) to memorize your data.

You can also try to overfit a portion of your dataset. Fewer samples will be easier to overfit.
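
A minimal sketch of that sanity check (assuming Keras; the data here is random stand-in data, so swap in a small slice of your own):

```python
# Overfitting sanity check (assumes TensorFlow/Keras; random stand-in data).
# If the training loss won't go near zero even here, something in the setup is off.
import numpy as np
import tensorflow as tf

x_small = np.random.rand(100, 20)                  # pretend: 100 samples, 20 features
y_small = np.random.randint(0, 2, size=(100, 1))   # pretend: binary labels

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(x_small, y_small, epochs=300, batch_size=16, verbose=0)

print(model.evaluate(x_small, y_small, verbose=0))  # training loss should be close to 0
```
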
Read 11 tweets
3 Mar
The more you grow, the more you realize that the language you use doesn't matter at all.

JavaScript, Python, or whatever you use represents exactly $0 of your take-home pay every month.

The value you produce using these languages is the remaining 100%.
I’ve never had a conversation with a client that cared about a specific language, other than those wanting to build on top of an existing codebase.
Every line of code is a liability.

Corollary: The best code is the one nobody wrote.
Read 4 tweets
2 Mar
Let's talk about how you can build your first machine learning solution.

(And let's make sure we piss off half the industry in the process.)

Grab that ☕️, and let's go! 🧵
Contrary to popular belief, your first attempt at deploying machine learning should not use TensorFlow, PyTorch, Scikit-Learn, or any other fancy machine learning framework or library.

Your first solution should be a bunch of if-then-else conditions.
Regular ol' conditions make for a great MVP of a machine learning wannabe system.

Pair those conditions with a human, and you have your first system in production!

Conditions handle what they can. Humans handle the rest.
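
A toy sketch of that setup (every field name and threshold here is made up): explicit rules handle the clear-cut cases, and anything ambiguous goes to a person.

```python
# Toy rules-first "MVP" (all field names and thresholds are hypothetical).
# Conditions handle what they can; humans handle the rest.

def review_transaction(tx: dict) -> str:
    if tx["amount"] < 10:
        return "approve"            # tiny amounts: auto-approve
    if tx["amount"] > 10_000 and tx["country"] != tx["home_country"]:
        return "reject"             # large amount from an unexpected country
    return "human_review"           # everything else goes to a human

print(review_transaction({"amount": 5, "country": "US", "home_country": "US"}))       # approve
print(review_transaction({"amount": 50_000, "country": "XX", "home_country": "US"}))  # reject
print(review_transaction({"amount": 500, "country": "US", "home_country": "US"}))     # human_review
```
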
Read 16 tweets
2 Mar
If you want to start with Machine Learning and need some guidance, I want to give you access to my entire course for $10. Today only.

And if you don't like it, you pay $0. But I promise you'll love it!

Thanks to the 100+ of you who already bought it!

👉 gumroad.com/l/kBjbC/50000
If you can’t afford this, reply below explaining how you think this will help you. I’ll give away 10 copies for free.
Thanks to everyone that has taken advantage of this offer so far!

There are still a few more hours left.

If starting with machine learning feels overwhelming, then this is for you.

gumroad.com/l/kBjbC/50000
Read 4 tweets
2 Mar
Some hard skills that I use every day as a Machine Learning Engineer:

▫️ A whole lot of Python
▫️ TensorFlow, Keras, Scikit-learn
▫️ AWS SageMaker
▫️ Jupyter
▫️ SQL
▫️ Probabilities, Statistics
▫️ Google Spreadsheets (seriously!)
▫️ Software Engineering
General notions of linear algebra are useful, especially when you want to understand how certain things happen behind the scenes.

That being said, I don't consider myself an expert and it's not part of the day-to-day.

You could also use Excel.

I use Google Spreadsheets because it's in the cloud, and it's convenient for me. I don't have Microsoft Office installed, and as long as spreadsheets aren't crazy large, Google has what I need.

Read 6 tweets
1 Mar
Let's talk about learning problems in machine learning:

▫️ Supervised Learning
▫️ Unsupervised Learning
▫️ Reinforcement Learning

And some hybrid approaches:

▫️ Semi-Supervised Learning
▫️ Self-Supervised Learning
▫️ Multi-Instance Learning

Grab your ☕️, and let's do this👇
Supervised Learning is probably the most common class of problems that we have all heard about.

We start with a dataset of examples and their corresponding labels (or answers.)

Then we teach a model the mapping between those examples and their corresponding labels.

[2 / 19]
The goal of these problems is for a model to generalize from the examples that it sees to later answer similar questions.

There are two main types of Supervised Learning:

▫️ Classification → We predict a class label
▫️ Regression → We predict a numerical label

[3 / 19]
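
A quick sketch of that split, using scikit-learn (which shows up elsewhere in these threads) and its built-in toy datasets:

```python
# Classification vs. regression with scikit-learn toy datasets.
from sklearn.datasets import load_iris, load_diabetes
from sklearn.linear_model import LinearRegression, LogisticRegression

# Classification: predict a class label (which iris species?)
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X[:3]))    # class labels, e.g. [0 0 0]

# Regression: predict a numerical label (disease progression score)
X, y = load_diabetes(return_X_y=True)
reg = LinearRegression().fit(X, y)
print(reg.predict(X[:3]))    # continuous values
```
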
Read 19 tweets
