Hi,

The number of layers depends on the size of the dataset, and there is no way to know the right number in advance; somewhere between 1 and 10 is a reasonable range for an initial trial.
Nothing in deep networks is clearly predefined. It's all experimenting, experimenting, and experimenting.

ReLU activation is always a good starting point. You can later try other non-saturating activations like SELU, ELU, etc., but avoid using sigmoid or tanh in hidden layers.
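To see why saturating activations are a problem, here is a small NumPy sketch (my own illustration, not from the thread): the sigmoid's gradient collapses toward zero for large inputs, while ReLU keeps a gradient of 1 on the positive side.

```python
import numpy as np

def relu(x):
    # ReLU passes positive values through unchanged; its gradient is 1 there.
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of sigmoid: s(x) * (1 - s(x)); shrinks toward 0 for large |x|.
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.array([-10.0, 0.0, 10.0])
print(relu(x))          # large positive inputs pass through unchanged
print(sigmoid_grad(x))  # near-zero at both ends -> vanishing gradients
```

Stack a few layers of near-zero gradients and almost nothing flows back to the early layers; that is the saturation the thread warns about.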
The common pattern in convolutional layers is to double the number of filters from layer to layer, like 16, 32, 64... but again, this is not guaranteed to work well. Filter sizes are usually 3x3 or 5x5.

The pooling size is usually 2x2.
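Putting those rules of thumb together: with 'same'-padded 3x3 convolutions the spatial size is unchanged, and each 2x2 pooling halves it. A tiny sketch (my own illustration, not from the thread) of how a 32x32 input shrinks while filters double:

```python
# Rule-of-thumb ConvNet plan: double the filters per block, halve the
# spatial size with 2x2 pooling. 'Same'-padded 3x3 convs keep the size.
def plan_blocks(input_size, filters=(16, 32, 64)):
    size = input_size
    plan = []
    for f in filters:
        # conv 3x3, padding='same' -> spatial size unchanged
        # 2x2 max pooling          -> spatial size halved
        size //= 2
        plan.append((f, size))
    return plan

print(plan_blocks(32))  # [(16, 16), (32, 8), (64, 4)]
```

The feature maps get spatially smaller but channel-wise richer as you go deeper, which is the usual shape of a small ConvNet.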
Nowadays, unless you want to do some self-experimentation to learn how one hyperparameter interacts with another, it's safer to use existing architectures (like ResNet) than to build your own from scratch.
Most new architectures build on existing ones to improve their performance, reduce their size, or introduce small tweaks.

As you can imagine, that's all experimentation.
That is at least part of why many things in deep learning work even though we don't clearly understand why and how they work.

Ex: We know batch normalization helps networks train faster and can lead to better performance, but we don't really understand what it means that it 'reduces the internal covariate shift'.
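Whatever the explanation, the operation itself is simple: normalize each feature over the batch, then rescale. A minimal NumPy sketch of the training-time core (my own illustration; the learnable gamma/beta are left at their usual defaults of 1 and 0):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature (column) to zero mean and unit variance
    # over the batch dimension.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # Learnable scale and shift (gamma, beta) start at 1 and 0.
    gamma, beta = 1.0, 0.0
    return gamma * x_hat + beta

batch = np.array([[1.0, 100.0],
                  [2.0, 200.0],
                  [3.0, 300.0]])
out = batch_norm(batch)
print(out.mean(axis=0))  # roughly [0, 0]
print(out.std(axis=0))   # roughly [1, 1]
```

Note how the two features start on wildly different scales but come out comparable, which is one intuition for why it smooths optimization.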
Your question is very broad, so my response may not be enough.

But if all you want is to build something that works, take an existing model that works and fine-tune it on your dataset. That said, you still have to experiment, experiment, and experiment.

More from @Jeande_d

2 Dec
The image below shows a typical architecture of Convolutional Neural Networks, a.k.a. ConvNets.

ConvNets are a type of neural network architecture mostly used in image recognition tasks.

More about ConvNets 🧵🧵

Image credit: CS 231n
Today, it's the norm to use ConvNets for most recognition tasks such as image classification and object detection.

Although vision transformers are increasingly topping most benchmarks in the research community, ConvNets probably still have a long way to go.
To motivate why ConvNets are so powerful, let's think about what happens when we use fully connected networks for image data.
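A quick back-of-the-envelope calculation (my own numbers, for illustration) shows the problem: flattening even a modest image into one fully connected layer already costs tens of millions of weights, while a conv layer's weight count is independent of the image size.

```python
# Weight count of a single fully connected layer on a flattened image
# versus a conv layer with 64 filters of size 3x3 over 3 input channels.
h, w, c = 224, 224, 3       # a typical input image
hidden_units = 1000

fc_params = (h * w * c) * hidden_units   # one weight per pixel per unit
conv_params = 64 * (3 * 3 * c)           # filters are shared spatially

print(fc_params)    # 150528000 -> ~150 million weights, biases excluded
print(conv_params)  # 1728 -> reused at every spatial position
```

Weight sharing and local connectivity are exactly what make ConvNets tractable on image data where fully connected networks are not.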
29 Nov
How to think about precision and recall:

Precision: What is the percentage of positive predictions that are actually positive?

Recall: What is the percentage of actual positives that were predicted correctly?

🧵🧵
The fewer false positives, the higher the precision. Vice-versa.

The fewer false negatives, the higher the recall. Vice-versa.
How do you increase precision? Reduce false positives.

It can depend on the problem, but generally, that might mean fixing the labels of those negative samples (the ones being predicted as positives) or adding more of them to the training data.
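In counts, with tp/fp/fn as the usual confusion-matrix entries, the two definitions above become (a small sketch with my own variable names):

```python
def precision(tp, fp):
    # Of everything predicted positive, what fraction was actually positive?
    return tp / (tp + fp)

def recall(tp, fn):
    # Of everything actually positive, what fraction did we catch?
    return tp / (tp + fn)

# Example: 8 true positives, 2 false positives, 4 false negatives.
print(precision(8, 2))  # 0.8
print(recall(8, 4))     # ~0.667
```

You can read the trade-off directly from the denominators: shrinking fp raises precision, shrinking fn raises recall.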
28 Nov
Machine Learning Weekly Highlights 💡

Made of:

◆3 things from me
◆2 from others, and
◆1 from the community
This week, I explored different object detection libraries, wrote about hyperparameter optimization methods, and updated the introduction to machine learning in my complete, free online ML book.

I also reached 6000 followers 🎉. Thank you for your support again!
21 Nov
Machine Learning Weekly Highlights 💡

Made of:

◆2 things from me
◆2 from other creators
◆2+1 from the community

A thread 🧵
This week, I wrote about activation functions and why they are important components of neural networks.

Yesterday, I also wrote about image classification, one of the most important computer vision tasks.
#1

Here is the thread about activation functions

20 Nov
Image classification is one of the most common & important computer vision tasks.

In image classification, we are mainly identifying the category of a given image.

Let's talk more about this important task 🧵🧵
Image classification is about recognizing the specific category of an image among different possible categories.

Take an example: Given an image of a car, can you make a computer program recognize that the image shows a car?
One might ask why we even need to make computers recognize images, and that would be a fair question.

Humans have an innate perception system. Identifying or recognizing the objects seems to be a trivial task for us.

But for computers, it's a different story. Why is that?
17 Nov
Activation functions are one of the most important components of any typical neural network.

What exactly are activation functions, and why do we need to inject them into the neural network?

A thread 🧵🧵
Activation functions are basically mathematical functions used to introduce non-linearities into the network.

Without an activation function, the neural network would behave like a linear classifier/regressor.
Or simply put, it would only be able to solve linear problems: those where the relationship between input and output is easy to map because they change in proportion to each other.

Let me explain what I mean by that...
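Meanwhile, the claim itself is easy to verify numerically: two stacked linear layers with no activation in between collapse into a single linear layer. A small NumPy check (my own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # first "layer" weights
W2 = rng.normal(size=(2, 4))   # second "layer" weights
x = rng.normal(size=(3,))      # an input vector

# Two stacked linear layers with no activation in between...
two_layers = W2 @ (W1 @ x)
# ...equal a single linear layer whose weights are W2 @ W1.
one_layer = (W2 @ W1) @ x

print(np.allclose(two_layers, one_layer))  # True
```

No matter how many linear layers you stack, the result is still one linear map; inserting a non-linearity like ReLU between the layers is what breaks this collapse.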
