Santiago
5 May, 13 tweets, 3 min read
Machine learning education is broken.

If you are preparing for a research position, you are good. If you are looking to get out there and start solving problems, not even close.

Here are some thoughts so you can get ahead.

Most classes, courses, and books cover the same road.

They start with a dataset. They finish with a working model. The focus is always on everything that happens in between.

Dataset → Model.

This is great, but not enough.
Real-life situations rarely start with a dataset, and they never end after you finish building your model.

Applying machine learning successfully is hard.

Here are a few examples that you should keep in mind.
First challenge: Properly framing the problem.

If you don't understand the problem, you can't determine what data you need. If you don't understand the data, you can't build a good model.
I've never seen a company that had their data ready to go.

In fact, most of them don't even have data at all and need you to determine what exactly they should start collecting.

You usually have to go Problem → Potential Solution → Data.
Another fun challenge: Getting the data from the point of origin to a place where you can start using it.

Who's putting together the pipeline to move the data? Who's building processes to clean it and get it ready?
Talk about deploying models, and people roll their eyes.

I've talked to a lot of data scientists who have no idea how to get this done. They struggle to look past a Jupyter notebook, and they have to learn on the job.
Another gap:

Everyone is laser-focused on building models that solve problems on their own, but almost nobody uses them to help other people do their jobs better.

Combining models with humans unlocks a lot of value.
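One common way to combine a model with humans is to let the model handle the cases it is confident about and send the rest to a person. Here is a minimal sketch of that idea. The names `Decision` and `route_prediction` and the 0.9 threshold are assumptions for illustration, not something from a specific library or from this thread.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Decision:
    label: str          # class the model considers most likely
    confidence: float   # probability the model assigned to that class
    needs_review: bool  # True when a human should look at this case


def route_prediction(probabilities: Dict[str, float], threshold: float = 0.9) -> Decision:
    """Pick the top class and flag the case for human review when confidence is low.

    `threshold` is a hypothetical cut-off; in practice it would be tuned
    against how much review capacity the team actually has.
    """
    label, confidence = max(probabilities.items(), key=lambda item: item[1])
    return Decision(label, confidence, needs_review=confidence < threshold)


# Example: one confident prediction, one that goes to a person.
auto = route_prediction({"fraud": 0.97, "legit": 0.03})
manual = route_prediction({"fraud": 0.55, "legit": 0.45})
print(auto)
print(manual)
```

The model still does most of the work here. The person only sees the cases where the model is unsure, which is usually where their judgment adds the most.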
There are many more gaps that we need to fill.

From drift monitoring to automatic retraining processes, bias mitigation, and everything in between.

Heck, even camera selection before building a deep learning model for computer vision is a common issue!
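To make drift monitoring a bit more concrete, here is a rough sketch of one possible check: the Population Stability Index (PSI) between the data a model was trained on and the data it sees in production. This is only one of many ways to measure drift. The function name, the 10-bin default, and the 0.2 alert level are assumptions for illustration (0.2 is a rule of thumb some teams use, not a standard).

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float], live: Sequence[float], bins: int = 10
) -> float:
    """PSI for one numeric feature, using equal-width bins over the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when the reference is constant

    def bucket_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into the edge bins
            counts[idx] += 1
        # Floor each share so empty buckets do not cause log(0) or division by zero.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_shares = bucket_shares(reference)
    live_shares = bucket_shares(live)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, live_shares))


# Example: production values shifted upward relative to the training data.
training = [i / 100 for i in range(1000)]
production = [x + 5 for x in training]

psi = population_stability_index(training, production)
if psi > 0.2:  # hypothetical alert level
    print(f"Possible drift, PSI = {psi:.2f}: time to investigate or retrain")
```

A check like this could run on a schedule and trigger an alert or a retraining job, which is where the automatic retraining piece would come in.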
A few programs out there are starting to venture outside of the "Dataset → Model" approach.

I hope more join the party.

The more we cover, the better we can tackle the problems waiting for us.
If you are getting ready to go out there, going above and beyond to close these gaps represents an incredible opportunity.

Show up with a broader understanding of what it takes to get the work done, and companies will throw a ridiculous amount of money at you.
If you are looking for more information about machine learning in the real world, follow me @svpino, and I'll give you something to think about every week.

We can do this together. One tweet at a time.
You don't make up the data. You collect it.

It's not uncommon to start working on a project and be 1-2 years away from having the data necessary to solve the problem.

Plant the seed today, so you can harvest it when ready.


More from @svpino

6 May
12 machine learning YouTube videos.

On libraries, algorithms, tools, and theory.

1. Jupyter Notebooks:

2. Pandas:

3. Matplotlib:

4. Seaborn:
5. Numpy:

6. Decision Trees:

7. Neural Networks:

8. Scikit-Learn:
1 May
A little over 12 years ago, the police started building a case against me.

That was stressful. They were watching. They wanted to take me off the streets.

Here is the story of how I fled Cuba and came to the United States.

After finishing college, I started taking freelance projects.

That was illegal. The Cuban government didn't allow people to make money working for foreign companies.

If you were lucky, you could get 2 years in jail. They called it "illicit enrichment."
We were a small group of friends. We met at my house every morning.

We paid a foreign national for Internet access. Cubans weren't allowed to buy it, so we had to get creative.

It was a 56 kbps connection shared across 4 computers.
30 Apr
We've all heard the horror stories.

Are you ready for machine learning math? Are you sure you can download a library, go through a course and make things happen?

This is some unsolicited advice.

↓ 1/10
Back then, we had to write custom training loops. Every time.

We all heard horror stories about the complexity of statistics and how ugly linear algebra was. This was a real thing.

The barrier to start with machine learning was high and full of thorns.

↓ 2/10
Today, things are different.

The lack of programming skills is a much bigger hurdle than not understanding how derivatives work.

Wanna have a better chance? Learn to code today. Worry about math tomorrow.

↓ 3/10
28 Apr
Free machine learning education.

Many top universities are making their Machine Learning and Deep Learning programs publicly available. All of this information is now online and free for everyone!

Here are 6 of these programs. Pick one and get started!

Introduction to Deep Learning
MIT Course 6.S191
Alexander Amini and Ava Soleimany

Introductory course on deep learning methods and practical experience using TensorFlow. Covers applications to computer vision, natural language processing, and more.

introtodeeplearning.com
Deep Learning
NYU DS-GA 1008
Yann LeCun and Alfredo Canziani

This course covers the latest techniques in deep learning and representation learning with applications to computer vision, natural language understanding, and speech recognition.

atcold.github.io/pytorch-Deep-L…
27 Apr
$5 only for the next 50 orders.

1,131 people have bought it. 99% 5-star ratings.

Don't like it, and you pay nothing.

Link here → gum.co/kBjbC/five
If you bought it already or aren't interested, like/retweet the original tweet, and you'll be supporting my work as much as if you were paying.

Thank you from the bottom of my heart!
40 left.
27 Apr
Data is the core of machine learning.

It should not surprise you that most of the work you'll have to do is related to capturing, managing, processing, and validating data.

A few recommendations for those who would like to start.

1/7
As you get your feet wet, these are roughly some of the areas that you should cover:

• Data collection
• Data visualization
• Imputation
• Handling outliers
• Encoding
• Normalization and scaling
• Binning and grouping

2/7
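A few of the areas listed above (imputation, handling outliers, normalization and scaling, encoding) can be sketched with nothing but the Python standard library. This is a toy illustration under simple assumptions: missing values are `None`, outliers are clipped to fixed bounds, and all helper names are made up for this example. In a real project you would more likely reach for a library such as pandas or scikit-learn.

```python
from statistics import median
from typing import List, Optional, Sequence


def impute_median(values: Sequence[Optional[float]]) -> List[float]:
    """Imputation: replace missing values (None) with the median of the known ones."""
    known = [v for v in values if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in values]


def clip(values: Sequence[float], low: float, high: float) -> List[float]:
    """Handling outliers: cap every value to the range [low, high]."""
    return [min(max(v, low), high) for v in values]


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scaling: map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values are equal
    return [(v - lo) / span for v in values]


def one_hot(categories: Sequence[str]) -> List[List[int]]:
    """Encoding: one column per distinct category, in sorted order."""
    vocab = sorted(set(categories))
    return [[1 if c == name else 0 for name in vocab] for c in categories]


# Example: an "age" column with a missing value and an implausible outlier.
ages = [22, None, 35, 41, 130]
clean_ages = min_max_scale(clip(impute_median(ages), low=0, high=100))
print(clean_ages)
print(one_hot(["red", "blue", "red"]))
```

The order of the steps matters here: the outlier is clipped before scaling so a single extreme value does not squash everything else toward zero.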
Here is a good, introductory, free course provided by Google:

"Data Preparation and Feature Engineering in ML." — developers.google.com/machine-learni…

It covers the process of collecting, transforming, splitting, and creating datasets that machine learning algorithms can use.

↓ 3/7
