Santiago

@svpino

15 Mar, 8 tweets, 2 min read

You aren't doing yourself any favors if you aren't throwing away your validation data regularly.

It's painful, I know, but you are looking for trouble if you don't do it.

Let's talk about what happens with your data and your model.

Grab the ☕️, and let's do this thing. 🧵👇

Every machine learning tutorial teaches you about splitting your dataset.

They either go with train/test or train/validation/test. Nomenclature doesn't matter here. You just need to understand how each one of these is used.
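Here is a minimal sketch of a train/validation/test split, using only the Python standard library. The function name `split_dataset`, the 70/15/15 proportions, and the fixed seed are assumptions made for illustration, not something the thread prescribes.

```python
import random

def split_dataset(examples, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle a copy of `examples` and cut it into train/validation/test lists.

    Hypothetical helper: the fractions and the seed are illustrative defaults.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)

    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)

    test = shuffled[:n_test]                 # held back until the very end
    val = shuffled[n_test:n_test + n_val]    # used to check the model while iterating
    train = shuffled[n_test + n_val:]        # the only data the weights are fit on
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Every example lands in exactly one of the three lists, which is the property that matters here.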

Here is a thread about this:

https://twitter.com/svpino/status/1359827674097672195?s=20

Let's think of a neural network and focus on the train set for a second.

We use this set to train our model. The data in this set is what the network uses to adjust its weights.

And, of course, the model will get really good at solving this set.

The validation set does what its name says: we use it to compute our loss and maybe accuracy, so we know how good the model is.

Keep in mind the model is never trained on the validation set. No weights are optimized because of the data in this set.

At least not directly.
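To make the distinction concrete, here is a small sketch that uses a one-variable linear model in place of a neural network (an assumption made to keep the code short). The names `mse` and `train_linear`, the learning rate, and the toy data are all hypothetical. Gradients come only from the train set; the validation set is only ever read to compute a loss.

```python
def mse(w, b, data):
    """Mean squared error of the line y = w*x + b on a list of (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train_linear(train, val, epochs=500, lr=0.1):
    """Fit w and b with gradient descent on `train`; only measure on `val`."""
    w, b = 0.0, 0.0
    val_history = []
    for _ in range(epochs):
        # Gradients are computed from the train set only.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
        grad_b = sum(2 * (w * x + b - y) for x, y in train) / len(train)
        w -= lr * grad_w
        b -= lr * grad_b
        # The validation set is read, never used to update w or b.
        val_history.append(mse(w, b, val))
    return w, b, val_history

# Toy data drawn from y = 2x + 1, with disjoint train and validation points.
train = [(x / 10, 2 * (x / 10) + 1) for x in range(0, 20, 2)]
val = [(x / 10, 2 * (x / 10) + 1) for x in range(1, 20, 2)]

w, b, val_history = train_linear(train, val)
print(f"w={w:.3f} b={b:.3f} final validation loss={val_history[-1]:.6f}")
```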

Here is the problem: we look at the model results on the validation set and change the model.

Yes, we may not be adjusting weights directly based on the data, but we are adjusting them indirectly by updating the model based on the results.

Every time you update your model based on your validation set results, you are leaking information and overfitting to that data.

It only takes a few rounds for the results on that set to be completely useless.
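A small simulation can show the effect. In this sketch the labels are coin flips, so there is nothing real to learn, and each "round" tries a random linear classifier and keeps it only if the validation accuracy improves. Everything here is an assumption made for illustration: the random-search loop stands in for a person tweaking a model, and the sizes (50 validation examples, 200 rounds, 10 features) are arbitrary.

```python
import random

rng = random.Random(0)
N_FEATURES = 10

def make_examples(n):
    """Random features with coin-flip labels: the features carry no signal."""
    return [([rng.gauss(0, 1) for _ in range(N_FEATURES)], rng.random() < 0.5)
            for _ in range(n)]

def accuracy(weights, examples):
    """Fraction of examples where sign(weights . features) matches the label."""
    hits = sum((sum(w * x for w, x in zip(weights, xs)) > 0) == label
               for xs, label in examples)
    return hits / len(examples)

val = make_examples(50)      # the set we keep peeking at
fresh = make_examples(5000)  # data the selection process never saw

# Each round proposes a change and keeps it only if validation improves.
best_weights, best_val_acc = None, 0.0
for _ in range(200):
    candidate = [rng.gauss(0, 1) for _ in range(N_FEATURES)]
    acc = accuracy(candidate, val)
    if acc > best_val_acc:
        best_weights, best_val_acc = candidate, acc

fresh_acc = accuracy(best_weights, fresh)
print(f"validation accuracy of the chosen model: {best_val_acc:.2f}")
print(f"accuracy on fresh data:                  {fresh_acc:.2f}")
```

The chosen model looks better than chance on the validation set only because it was picked for doing well there. On fresh data it falls back to roughly 50%.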

How do you avoid this? Ideally, you swap out that validation set as frequently as you can.

You can take the used validation data, merge it into the train data, and bring in a fresh set to validate against.
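One way to read that rotation, as a sketch: retire the used validation set into the train data and validate on a batch the model and its author have not seen yet. The helper `rotate_validation` and the idea that `fresh_batch` is newly collected data are assumptions for illustration.

```python
def rotate_validation(train, used_val, fresh_batch):
    """Retire the used validation set into train and validate on fresh data.

    Hypothetical helper: `fresh_batch` is assumed to be data that has never
    influenced any modeling decision.
    """
    new_train = list(train) + list(used_val)
    new_val = list(fresh_batch)
    return new_train, new_val

train_ids = [0, 1, 2, 3, 4, 5]
val_ids = [6, 7]
newly_collected_ids = [8, 9]

train_ids, val_ids = rotate_validation(train_ids, val_ids, newly_collected_ids)
print(train_ids, val_ids)
```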

Unfortunately, there's no real substitute for a fresh validation dataset to test your model.

Yes, there are different cross-validation techniques, but they won't fully solve the issue.
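For reference, here is a minimal sketch of the simplest of those techniques, k-fold cross-validation, written as a plain index generator. The name `k_fold_indices` is hypothetical, and the sketch assumes the data has already been shuffled. Note that every fold still comes from the same pool of data you have been tuning against, which is why this doesn't replace fresh data.

```python
def k_fold_indices(n_examples, k):
    """Yield (train_idx, val_idx) pairs; each example is in validation exactly once.

    Assumes the examples were shuffled beforehand, since folds are contiguous.
    """
    indices = list(range(n_examples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_examples // k + (1 if i < n_examples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validation:", val_idx)
```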

Bottom line: yeah, you probably need more data.
