Santiago

Follow @svpino

15 Oct, 23 tweets, 4 min read

If you haven't looked into machine learning yet, you'd better start now.

I started looking seriously into machine learning around spring of 2015.

The field was very different back then.

Just to give you an idea, the two most popular deep learning frameworks didn't exist:

• TensorFlow was released at the end of 2015
• PyTorch in 2016

In just 5-6 years we have gone from "read my paper... it's cool" to "holy shit, look what my phone is doing!"

Machine learning has turned the industry upside down.

We have gone from "that's impossible" to "of course we can!" in record time.

I know there's a lot of hype out there. Bullshitters bullshit.

But I don't care about that. I did an inventory of my life and found out that machine learning controls:

• The information I consume
• The things I buy
• The videos I watch
• The games I play

And more.

The hype machine is humming because machine learning is delivering.

Hype has to stay ahead of what's possible. The more we accomplish, the more we'll hype the possibilities.

More hype is a good indication of what's happening.

Let's do an exercise.

Look at everything we have done over the last 5 years, and tell me what you think will happen in the next 5.

Where's the limit, assuming there's one?

I'm beyond bullish about the future!

There's a problem, though.

The demand for qualified people is out of this world.

I tried to find relevant statistics, but all I got was a 2017 article estimating 300,000 AI practitioners and professionals worldwide.

That seems low. I'll multiply it by 10 just for kicks.

Let's assume there are 3 million AI researchers and practitioners worldwide.

In comparison, "the Internet" estimated 24 million developers worldwide in 2019!

That's 8 times more developers... and remember I already multiplied the 2017 estimate by 10!

If you are a software developer, you are probably aware that there's a huge demand for your talent.

Well... there are 24 million of you.

Do you see now why companies are willing to pay small fortunes for machine learning engineers?

But there's even more...

A lot of the firepower we have in the field is doing research. This is great, and one of the main reasons we have made a ton of progress.

But this opens a really big gap: who turns research papers into actual work benefiting people?

There are some wacky estimates online about the number of machine learning engineers worldwide who have the skills to implement enterprise-level machine learning solutions.

I got depressed just by looking at the number. I won't repeat it here.

But it is astonishingly low.

I'm sure you get the point.

We need people and we need them yesterday.

More importantly, you need to understand that there's a place here for everyone, regardless of your current background, skillset, curriculum, experience, and whatever else you can come up with.

The First Generation had it rough.

They had to come up with the math, write it on paper, and build every tool we have today.

That was a tough time. But times have changed significantly.

I'm not surprised when people recommend a mile-long prerequisite list to anyone interested in machine learning.

Maybe you have seen something like this?

Calculus, Linear Algebra, Probability & Statistics, Python, R, This and That Theory™, etc.

Sounds familiar?

Assuming *everyone* who wants to contribute has to master the same laundry list is just not realistic.

All of those prerequisites are valuable and well-intentioned, but they vary a lot depending on your focus.

Some of them are even becoming less relevant every day.

Today we have tools we didn't have 5 - 10 years ago.

These tools abstract a lot of the hard things we had to know before. This is good. This opens the field to more people.

Takeaway: I'm 110% sure there's a place for you here.

But what's the rush?

If you are a software developer today, you are already working on cool projects and enjoying the market's demand for your skills.

Why should you consider looking into machine learning?

I have three reasons, in no particular order.

First, as I hopefully convinced you already, if you think your skills are in demand, wait until you add machine learning to your list.

Augmenting your skills will give you a shitton of optionality.

The second reason was what pushed me towards machine learning.

Many of the problems you'll be solving are hard, scary, unexplored, and with unlimited potential impact.

When machine learning works, it feels like magic.

I never got that feeling before.

The third reason is that I think you don't have a choice.

I believe there will be a time when *most* of the software we build will incorporate machine learning in one way or another.

This doesn't mean that "software will all be machine learning."

I remember when "the web will kill desktop software" was the craziest thing ever said.

If your definition of "killing" is "completely eradicating," then no, the web didn't kill desktop software.

But if you get past that technicality... holy shit!

I believe machine learning has a similar role to play.

From how we build software all the way to what we build, I see machine learning becoming the heart of the future.

Today it's kind of a cool thing that's only one part of bigger things.

Tomorrow?

Why would you wait for the future to catch up with you when you have a chance to build that future?

If at this point, anything here made you *remotely* interested, please reach out. I'm happy to answer your questions and help you get started.

This is me → @svpino. Stay tuned.

More from @svpino

Santiago

@svpino

12 Oct

A big part of my work is to build computer vision models to recognize things.

It's usually ordinary stuff: An antenna, a fire extinguisher, a bag, a ladder.

Here is a trick I use to solve some of these problems.

The good news about having to recognize everyday objects:

There are a ton of pre-trained models that help with that. You can start with one of these models and get decent results out of the box.

This is important. I'll come back to it in a second.

Many of the use cases that I tackle are about using machine learning to "augment" the people doing the work.

Let's say you have a team looking at drone footage to find squirrels. Eight hours every day looking at images.

This sucks. I can help with that.

Read 19 tweets

Santiago

@svpino

11 Oct

Last week I trained a machine learning model using 100% of the data.

Then I used the model to predict the labels on the same dataset I used to train it.

I'm not kidding. Hear me out: ↓

Does this sound crazy?

Yes.

Would I be losing my shit if I heard that somebody did this?

Yes.

So what's going on?

I have a dataset with a single numerical feature and a binary target.

I need to know the threshold that best separates the positive samples from the negative ones.

I don't want a model to make predictions; I just need to know the threshold.
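The preview cuts off before showing how the threshold is found, so here is only a rough sketch of one way to do it, not necessarily the approach from the thread. It tries the midpoints between consecutive distinct feature values and keeps the cut with the highest accuracy on the full dataset. The function name `best_threshold`, the sample data, and the assumption that positive samples tend to have larger values are all made up for illustration.

```python
def best_threshold(values, labels):
    """Return (threshold, accuracy) for the rule: value >= threshold means positive.

    Assumption for this sketch: positives tend to have the larger values.
    """
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))

    # Candidate cuts: one below the minimum, the midpoints between
    # consecutive distinct values, and one above the maximum.
    candidates = [distinct[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    candidates.append(distinct[-1] + 1.0)

    best = None
    for t in candidates:
        # Count samples whose predicted label matches the true label.
        correct = sum((v >= t) == bool(y) for v, y in pairs)
        accuracy = correct / len(pairs)
        # Strict comparison keeps the lowest threshold among ties.
        if best is None or accuracy > best[1]:
            best = (t, accuracy)
    return best


if __name__ == "__main__":
    feature = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]   # hypothetical data
    target = [0, 0, 0, 1, 1, 1]
    print(best_threshold(feature, target))
```

Since the goal here is to describe the data rather than to predict on unseen samples, scoring the cut on the same data it was chosen from is the whole point, which is why using 100% of the data is reasonable in this case.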

Read 10 tweets

Santiago

@svpino

8 Oct

I get asked about machine learning all the time.

Here are my answers to some of these questions: ↓

Q: Where do I start?

Start by learning how to program.

Take your time. Usually, a solid year of Python experience will set you up for success.

Kaggle has a great introductory tutorial to get you started with Python.

Q: I already have plenty of Python experience. Now what?

For most people, I recommend the "Machine Learning Crash Course" created by Google or the "Intro to Machine Learning" from Kaggle.

If you are feeling adventurous, take "Machine Learning" from @AndrewYNg on Coursera.

Read 15 tweets

Santiago

@svpino

7 Oct

More data is usually not the way to turn around a mediocre machine learning model.

I've heard too many times that deep learning's silver bullet is throwing more data at a problem.

That hasn't been my experience.

Good Data is better than Big Data.

https://twitter.com/jeande_d/status/1446069996409470980

More data, even with a moderate amount of mislabeled examples, will hurt your model.

https://twitter.com/coo_ooi/status/1446071108734705667

Assuming the data is good, then more data is probably not going to be a problem.

Unfortunately, the quality of data is usually inversely proportional to the amount of it. More data is often mediocre data.

But if your data is good, no harm.

Read 4 tweets

Santiago

@svpino

6 Oct

Which one do you prefer? The code on the left, or the code on the right?

I'd love to hear why.

I always was a “left” kind of programmer.

For quite some time now I’ve been forcing myself to use the right style.

Look at “EAFP vs LBYL”. Pretty interesting arguments.

- LBYL - Look Before You Leap. (Left)

- EAFP - Easier to Ask for Forgiveness than Permission. (Right)
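The screenshot with the two code samples didn't survive in this text, so the snippet below is not the original code. It is a small, made-up illustration of the two styles, using a hypothetical `config` dictionary and a fallback port of 8080.

```python
DEFAULT_PORT = 8080  # hypothetical fallback value for this illustration


def port_lbyl(config):
    # LBYL: check that the key exists before reading it.
    if "port" in config:
        return config["port"]
    return DEFAULT_PORT


def port_eafp(config):
    # EAFP: read the key directly and handle the failure if it is missing.
    try:
        return config["port"]
    except KeyError:
        return DEFAULT_PORT


if __name__ == "__main__":
    print(port_lbyl({"port": 5000}), port_eafp({"port": 5000}))
    print(port_lbyl({}), port_eafp({}))
```

Both functions return the same result here. The difference is in where the "what if it's missing" question gets answered: before the access (LBYL) or after it fails (EAFP).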

Also, I love all of you, but it’s usually a good practice to answer the question using one of the two options instead of going with a third, imaginary option that you feel is better for your imaginary problem.

😋

Read 5 tweets

Santiago

@svpino

1 Oct

A team led by MIT examined 10 of the most-cited datasets used to test machine learning systems.

They found that around 3.4% of the data was inaccurate or mislabeled.

Those are very popular datasets. How about yours?

↓

I've worked with many datasets for image classification.

Unfortunately, mislabeled data is a common problem.

It is hard for people to consistently label visual concepts, especially when the answer is not apparent.

This is a big problem.

Basically, we are evaluating models with images of elephants, expecting them to get classified as "lions."

Your model can't perform well this way.

Read 12 tweets
