Santiago

Follow @svpino

19 Mar, 23 tweets, 4 min read

Thoughts about starting with machine learning.

☕️🧵👇

Three non-negotiable prerequisites:

1. Software development
2. Algorithms and data structures
3. Communication

If you build a strong foundation on these, you'll be unstoppable.

Learn to build software.

It's hard to make progress with machine learning if you struggle with programming.

Don't listen to those who shit on Algorithms and Data Structures and label them as "useless."

Believe me, if you take the time, these will give you a leg up for what's about to come.

Communication has many different angles.

Start with the following two questions:

- How do I ask the right question?
- How do I provide the right answer?

Yes, you can do machine learning with a lot of different programming languages.

But why would you make it harder on yourself?

Learn Python 🐍.

The right amount of Math at the right time is important.

I promise it's way less scary than what you currently believe.

Start working on problems. Add math as you need it.

While you are learning, don't worry about the code.

Coding is easy. Thinking is darn hard.

Focus on the analysis.

No-code tools are a superpower.

Look into Weka and learn how you can use it to avoid writing code that somebody already wrote.

If you are looking for a course, start with the first one you hear about.

Don't worry about "the best" one.

Google's Machine Learning Crash Course is a great way to start from the beginning.

Hard to beat a good book (or two). These are my all-time favorites:

Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow
💰 amzn.to/2KPuRAo

Deep Learning with Python
💰 amzn.to/3lpNEPn

When starting, focus on understanding the power of representations and getting as good as you can at feature engineering.

Feed garbage to your fancy algorithms and they will give you garbage back. No exceptions.
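
To make the point about representations concrete, here's a minimal sketch using pandas. The table, its column names, and the derived features are all made up for illustration; they don't come from any real dataset.

```python
import pandas as pd

# Hypothetical raw data: the columns and values are invented for illustration.
raw = pd.DataFrame({
    "signup_date": ["2021-01-04", "2021-01-09", "2021-02-14"],
    "total_spent": [120.0, 0.0, 45.5],
    "num_orders": [4, 0, 1],
})

features = pd.DataFrame(index=raw.index)

# A raw date string is hard for a model to use; the day of the week is a clearer signal.
dates = pd.to_datetime(raw["signup_date"])
features["signup_weekday"] = dates.dt.dayofweek  # Monday=0 ... Sunday=6

# Ratios often say more than the two raw columns separately.
# Mask out rows with 0 orders to avoid dividing by zero, then fill those rows with 0.
orders = raw["num_orders"].where(raw["num_orders"] > 0)
features["avg_order_value"] = (raw["total_spent"] / orders).fillna(0.0)

# A simple flag is sometimes the most useful feature of all.
features["has_ordered"] = (raw["num_orders"] > 0).astype(int)

print(features)
```

The same three raw columns now carry a weekday, a per-order average, and an activity flag, which is usually an easier representation for a simple model to learn from.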

4 libraries you will never fully learn, but you'll have to keep trying:

- NumPy
- Pandas
- Scikit-Learn
- Matplotlib

Learn how to use notebooks (Jupyter, Google Colab).

They will be an essential part of your career.

Here is the overall, high-level machine learning process that you'll need to follow:

1. Define the Problem
2. Prepare Data
3. Choose an algorithm
4. Improve Results
5. Present Results

If you want to get more specific, here are some of the steps:

▫️ Analyze the problem
▫️ Gather the data
▫️ Prepare the data
▫️ Choose the right model
▫️ Train the model
▫️ Evaluate the results
▫️ Look for biases
▫️ Tune it
▫️ Deploy the model
▫️ Monitor it
▫️ Retrain it
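
A few of these steps (prepare, choose, train, evaluate, tune) can be sketched with scikit-learn. This is one possible shape, not the only one: the built-in breast cancer dataset, the logistic regression model, and the candidate values for `C` are arbitrary choices made for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Gather the data (a small dataset that ships with scikit-learn).
X, y = load_breast_cancer(return_X_y=True)

# Hold out a test set that is only used for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Prepare the data and choose a model, bundled together so the scaling
# is learned from the training folds only.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Train and tune: try a few regularization strengths with 5-fold cross-validation.
search = GridSearchCV(
    pipeline,
    param_grid={"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
)
search.fit(X_train, y_train)

# Evaluate the results on data the model has never seen.
test_accuracy = search.score(X_test, y_test)
print(f"Best C: {search.best_params_['logisticregression__C']}")
print(f"Test accuracy: {test_accuracy:.3f}")
```

Analyzing the problem, looking for biases, deploying, monitoring, and retraining don't fit in a snippet like this; they depend on the problem and the environment you're working in.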

9 questions you need to keep asking:

1. What problem am I solving?
2. Why do I need to solve it?
3. What data do I have?
4. How is the data biased?
5. How do I transform it?
6. How do I collect more?
7. How do I model this?
8. What does success look like?
9. How do I productize it?

Neural networks are hot. I'd recommend you ignore them at the beginning.

Instead, here is a good list to kick off your learning:

1. Linear regression
2. Logistic regression
3. Decision Trees
4. K-NN
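
All four are available in scikit-learn, so a first comparison takes only a few lines. Treat the sketch below as a starting point under stated assumptions: the synthetic regression data, the iris dataset, and the hyperparameters were picked only to keep it short.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# 1. Linear regression predicts a number. Here: noisy points around y = 3x + 2.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(200, 1))
y = 3 * x[:, 0] + 2 + rng.normal(0, 0.5, size=200)
linear = LinearRegression().fit(x, y)
print(f"slope={linear.coef_[0]:.2f}, intercept={linear.intercept_:.2f}")

# 2-4. The other three predict a class. Compare them on the iris dataset
# using 5-fold cross-validated accuracy.
X_iris, y_iris = load_iris(return_X_y=True)
classifiers = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "K-NN": KNeighborsClassifier(n_neighbors=5),
}
scores = {
    name: cross_val_score(model, X_iris, y_iris, cv=5).mean()
    for name, model in classifiers.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```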

I source good problems from Kaggle.

I'd recommend you start with the Titanic competition.

The House Prices challenge is another great problem where you can learn a ton.

Machine learning doesn't end with a model.

If you can't serve those predictions in a reliable and scalable way, you will have a hard time selling your value.

(Research positions work differently.)

A real problem: we are putting models out there that are horribly biased and shaping society in ways we are just beginning to see.

Ethics is not an optional subject.

Looking for kick-ass online courses?

- Coursera - Machine Learning
- Coursera - Deep Learning Specialization
- MIT 6.S191 Introduction to Deep Learning
- DS-GA 1008 Deep Learning
- UC Berkeley Full Stack Deep Learning
- Cornell Tech CS 5787 Applied Machine Learning

Machine learning is not easy, but it's not impossible.

Take it one day at a time.

I promise it's going to be well worth it.

• • •

More from @svpino

Santiago

@svpino

18 Mar

I can't shut up about neural networks.

What questions do you have?

They aren't necessarily opposite concepts.

"Fully connected" refers to networks composed of layers where every node is connected to every node of the next layer.

Deep networks refer to networks with many layers. They could be fully connected or not.

https://twitter.com/Purni80707264/status/1372641475557740552?s=20
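
A small NumPy sketch may help separate the two ideas. The layer sizes, the ReLU activation, and the random weights are arbitrary assumptions; the point is only that "fully connected" describes how one layer is wired to the next, while "deep" describes how many layers are stacked.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense_layer(inputs, weights, biases):
    """Fully connected layer: every input node feeds every output node."""
    return np.maximum(0, inputs @ weights + biases)  # ReLU activation


# Layer sizes are arbitrary: 4 inputs, three hidden layers of 8 nodes, 2 outputs.
layer_sizes = [4, 8, 8, 8, 2]

# One weight matrix per pair of consecutive layers. Its shape (n_in, n_out)
# holds one weight for each input/output pair: that is the "fully connected" part.
parameters = [
    (rng.normal(size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

# "Deep" just means the signal passes through many layers in sequence.
activations = rng.normal(size=(1, layer_sizes[0]))  # one example with 4 features
for weights, biases in parameters:
    activations = dense_layer(activations, weights, biases)

print(len(parameters), "layers, output shape:", activations.shape)
```

A deep network that isn't fully connected would swap some of these dense layers for something sparser, such as a convolution, where each output only looks at a few of the inputs.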

Especially with deep learning, where you have many layers full of nodes, it's hard to understand the "thinking" of a network because you'll have to reverse-engineer millions of float values and try to make sense of them.

Hard to do.

https://twitter.com/mayor_onyebueke/status/1372641390136602631?s=20

Santiago

@svpino

18 Mar

The ability to reuse the knowledge of one model and adapt it to solve a different problem is one of the most consequential breakthroughs in machine learning.

Grab your ☕️ and let's talk about this.

🧵👇

A deep learning model is like a Lego set, with many pieces connected, forming a long structure.

These pieces are layers, and each layer has a responsibility.

Although we don't know exactly the role of every layer, we know that the closer they get to the output, the more specific they get.

The best way to understand what I mean is through an example: a model that will process car images.

Santiago

@svpino

17 Mar

Here are some of the features that make Python 🐍 a freaking cool language.

🧵👇

1. You can slice and dice arrays very easily.

2. What's even better: negative indexing is really cool, and you can use it to refer to items counting back from the last element in the list.
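
Both features in a few lines. The list and the variable names are just an example.

```python
numbers = [10, 20, 30, 40, 50]

# Slicing: numbers[start:stop] includes start and excludes stop.
first_three = numbers[:3]      # [10, 20, 30]
middle = numbers[1:4]          # [20, 30, 40]
every_other = numbers[::2]     # [10, 30, 50]

# Negative indexing counts back from the end of the list.
last = numbers[-1]             # 50
second_to_last = numbers[-2]   # 40
last_two = numbers[-2:]        # [40, 50]
reversed_copy = numbers[::-1]  # [50, 40, 30, 20, 10]
```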

Santiago

@svpino

16 Mar

5 Python 🐍 package managers that I'm not using anymore:

▫️ conda
▫️ virtualenv
▫️ venv
▫️ pipenv
▫️ poetry

🤷‍♂️

Instead, for several weeks now, I've been using development containers in Visual Studio Code.

Life-changing. Give 'em a try.

Here is a thread I wrote a few weeks back when I started using them:

https://twitter.com/svpino/status/1365778771169726469

An important note: here I’m referring to the “virtual environment” capabilities of these tools. I still need to pip install modules.

But I’ve been isolating environments with the containers instead.

Santiago

@svpino

15 Mar

You aren't doing yourself any favors if you aren't throwing away your validation data regularly.

It's painful, I know, but you are looking for trouble if you don't do it.

Let's talk about what happens with your data and your model.

Grab the ☕️, and let's do this thing. 🧵👇

Every machine learning tutorial teaches you about splitting your dataset.

They either go with train/test or train/validation/test. Nomenclature doesn't matter here. You just need to understand how each one of these is used.

Here is a thread about this:

https://twitter.com/svpino/status/1359827674097672195?s=20

Let's think of a neural network and focus on the train set for a second.

We use this to train our model. The data in this set is what the network uses to adjust its weights.

And, of course, the model will get really good at solving this set.
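
A minimal sketch of a train/validation/test split with scikit-learn's `train_test_split`. The 60/20/20 proportions, the synthetic data, and the variable names are assumptions for illustration, not a recommendation from the thread.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 1,000 examples with 5 features and a binary label.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = rng.integers(0, 2, size=1000)

# First cut: set aside 20% as the test set.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Second cut: 25% of the remaining 80% becomes the validation set (20% overall).
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0
)

# Train set: used to adjust the model's weights.
# Validation set: guides the decisions you make while developing the model.
# Test set: kept aside for the final evaluation.
print(len(X_train), len(X_val), len(X_test))  # 600 200 200
```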

Santiago

@svpino

15 Mar

My recommendation to learn machine learning:

▫️ Machine Learning Crash Course (Google)
▫️ Machine Learning (Coursera)

Take them in order. They are both free. They are both amazing.

(Before you embark on this journey, make sure you feel comfortable writing Python 🐍.)

Not really. Nothing has changed with the fundamentals. The course is as relevant today as it was back in 2010.

https://twitter.com/ttnmhttspr/status/1371375266761560064

Experience will help you make a few sensible choices that you can later test.

https://twitter.com/aleti_sunil/status/1371335131143811073
