A lot of machine learning is pretty dry and boring, but understanding how autoencoders work feels different.
This is a thread about autoencoders, what they can do, and a pretty cool example.
↓ 1/10
Autoencoders are lossy data-compression algorithms built out of neural networks.
One network (the encoder) compresses the original input into a compact intermediate representation, and a second network (the decoder) reverses the process to reconstruct an approximation of the original input (rough sketch below).
↓ 2/10
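Here's roughly what that looks like in code. This is a minimal sketch using Keras; the layer sizes, the 32-dimensional bottleneck, and the MNIST-style 784-dimensional input are my own illustrative assumptions, not details from the thread.

```python
from tensorflow import keras
from tensorflow.keras import layers

input_dim = 784   # e.g. a flattened 28x28 grayscale image (illustrative)
latent_dim = 32   # size of the compressed intermediate representation

# Encoder: compresses the input down to the intermediate representation.
encoder = keras.Sequential([
    keras.Input(shape=(input_dim,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(latent_dim, activation="relu"),
])

# Decoder: reverses the process, reconstructing an approximation of the input.
decoder = keras.Sequential([
    keras.Input(shape=(latent_dim,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(input_dim, activation="sigmoid"),
])

# The autoencoder chains the two and is trained to reproduce its own input.
autoencoder = keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

# Training pairs each input with itself as the target, e.g.:
# autoencoder.fit(x_train, x_train, epochs=20, batch_size=256)
```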
The encoding process "generalizes" the input data.
I like to think of the encoder as the Summarizer in Chief of the network: its whole job is to represent each input as compactly as possible, so the decoder can do a decent job of reconstructing the original data.
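To make the "Summarizer in Chief" idea concrete, here's how the models from the sketch above would be used once trained. The random batch is just a placeholder for real data, and the shapes assume the illustrative 784 → 32 setup from the previous block.

```python
import numpy as np

# Stand-in data; in practice this would be real inputs scaled to [0, 1].
x = np.random.rand(16, input_dim).astype("float32")

codes = encoder.predict(x)       # shape (16, 32): the compact "summary"
x_hat = decoder.predict(codes)   # shape (16, 784): lossy reconstruction

print(codes.shape, x_hat.shape)
```

Note the trade-off: each 32-number code is roughly 24x smaller than its input, which is exactly why the reconstruction can only ever be approximate.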