Santiago
Mar 9, 2021 · 16 tweets · 5 min read
Here is an underrated machine learning technique that will give you important information about your data and model.

Let's talk about learning curves.

Grab your ☕️ and let's do this thing!

🧵👇
Start by creating a model. Something simple. You are still exploring what works and what doesn't, so don't get fancy yet.
We are now going to plot the loss (model error) vs. the training dataset size. This will help us answer the following questions:

▫️ Do we need more data?
▫️ Do we have a bias problem?
▫️ Do we have a variance problem?
▫️ What's the ideal picture?
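
Here is a minimal sketch of how those curves can be computed. Everything in it is an assumption made for illustration: the synthetic data, the one-variable linear model, and the helper names (`fit_line`, `mse`, `learning_curve`, `make_data`) are not from the thread. Libraries such as scikit-learn ship a ready-made `learning_curve` helper; this version sticks to the standard library so each step is visible.

```python
import random

def fit_line(points):
    """Least-squares fit of y = slope*x + intercept, a deliberately simple model."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = cov_xy / var_x if var_x else 0.0
    return slope, mean_y - slope * mean_x

def mse(model, points):
    """Mean squared error of the fitted line on a set of (x, y) points."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in points) / len(points)

def learning_curve(train, val, sizes):
    """For each training size, fit on that many points and record both errors."""
    curve = []
    for n in sizes:
        subset = train[:n]
        model = fit_line(subset)
        curve.append((n, mse(model, subset), mse(model, val)))
    return curve

def make_data(n, rng):
    """Hypothetical data: y = 2x + 1 plus Gaussian noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0, 0.5)))
    return data

rng = random.Random(0)
train, val = make_data(200, rng), make_data(100, rng)
curve = learning_curve(train, val, [5, 10, 20, 50, 100, 200])
for n, train_err, val_err in curve:
    print(f"n={n:3d}  train={train_err:.3f}  val={val_err:.3f}")
```

Plot the second and third columns against the first and you have the two curves discussed in the rest of the thread.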

▫️ Do we need more data?

As you increase the training set size, if the training and validation curves converge towards each other and stop improving, more data is unlikely to help.

If there's still room for them to close the gap, then more data should help.
This one should be self-explanatory: if our errors stopped improving after adding more data, it's unlikely that more of it will do any good.

But if we still see the loss improving, more data should help push it even lower.
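
One rough way to turn that rule into a check, assuming you have the validation errors in order of increasing training size. The function name and the 2% threshold are hypothetical; pick a tolerance that makes sense for your own loss.

```python
def more_data_may_help(val_errors, min_rel_improvement=0.02):
    """Return True if the last increase in training size still reduced
    the validation error by more than `min_rel_improvement` (relative)."""
    if len(val_errors) < 2:
        raise ValueError("need at least two points on the curve")
    prev, last = val_errors[-2], val_errors[-1]
    if prev <= 0:
        return False  # nothing left to improve
    return (prev - last) / prev > min_rel_improvement

print(more_data_may_help([1.0, 0.6, 0.4]))   # still dropping
print(more_data_may_help([1.0, 0.5, 0.5]))   # flat
```

This only looks at the last step of the curve, so a noisy curve can fool it. Treat it as a first hint, not a verdict.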

▫️ Do we have a bias problem?

If the training error is too high, we have a high bias problem.

If the validation error is too high, something is wrong, but the validation error alone doesn't tell us whether the bias is low or high. The training error does.
A high bias indicates that our model is not powerful enough to learn the data. This is why our training error is high.

If the training error is low, that's a good thing: our model can fit the data.
High validation error indicates that our model is not performing well on the validation data, but it doesn't say why.

To find out whether bias is the cause, we need to look at the training error:

▫️ Low training error: low bias (the problem is variance)
▫️ High training error: high bias

▫️ Do we have a variance problem?

If there's a big gap between the training error and the validation error, we have high variance.

A low training error paired with a much higher validation error is the typical sign.
High variance indicates that the model fits the training data too well (probably memorizing it).

When testing with the validation set, we should see a big gap indicating that the model did great with the training set, but sucked with the validation set.
A couple more important points:

▫️ High bias + low variance: we are underfitting.
▫️ High variance + low bias: we are overfitting.
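
The rules so far can be sketched as a small helper that reads the two errors at the largest training size. The name `diagnose` and both thresholds are assumptions for illustration: what counts as "too high" or "a big gap" depends entirely on your problem.

```python
def diagnose(train_error, val_error, acceptable_error, max_gap):
    """Classify a model from its final training and validation errors.

    acceptable_error: training error above this counts as high bias.
    max_gap: a validation-minus-training gap above this counts as high variance.
    """
    high_bias = train_error > acceptable_error
    high_variance = (val_error - train_error) > max_gap
    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "underfitting (high bias, low variance)"
    if high_variance:
        return "overfitting (high variance, low bias)"
    return "good fit (low bias, low variance)"

print(diagnose(0.30, 0.32, acceptable_error=0.10, max_gap=0.05))
print(diagnose(0.02, 0.25, acceptable_error=0.10, max_gap=0.05))
print(diagnose(0.04, 0.06, acceptable_error=0.10, max_gap=0.05))
```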

▫️ What's the ideal picture?

These are the curves you should be hoping to get.

The training and validation errors both converge to a low value.
Here is another chart that does an excellent job at explaining bias and variance.

You want low bias + low variance, but keep in mind there's always a tradeoff between them: you need to find a good enough balance for your specific use case.
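
For reference, the tradeoff has a standard formal statement for squared error. The notation is an assumption, not something from the thread: the data follows $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$, $\hat{f}$ is the model fitted on a random training set, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\left[\big(y - \hat{f}(x)\big)^2\right]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\left[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

The noise term is a floor that no model can get below, which is why "low error" in the ideal picture never means zero.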
If these threads help, then make sure to follow me, and you won't be disappointed.

And for even more in-depth machine learning stories, make sure you head over to digest.underfitted.io. The first issue is coming this Friday!

Here is a quick guide that will help you deal with overfitting and underfitting:



Either error or score will work to create good learning curves.

They will simply work as opposites: You always want to maximize score and minimize error.
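
A tiny sketch of that relationship, assuming the score is something like accuracy that lives between 0 and 1 (the function name is hypothetical, and other scores may need a different conversion):

```python
def error_from_score(score):
    """Convert a score in [0, 1], such as accuracy, into an error rate."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("expected a score between 0 and 1")
    return 1.0 - score

scores = [0.70, 0.85, 0.90]            # rising score curve
errors = [error_from_score(s) for s in scores]
print(errors)                          # the same curve, flipped and falling
```

A curve that rises when plotted as a score falls when plotted as an error; the diagnosis is the same either way.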
