When starting, focus on understanding the power of representations and getting as good as you can at feature engineering.
Feed garbage to your fancy algorithms and they will give you garbage back. No exceptions.
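Here is a small sketch of what "the power of representations" can look like. Everything in it is an assumption made up for illustration: the synthetic points, the circle of radius 0.7, and the helper name `best_threshold_accuracy`. The same simple rule (one threshold on one feature) is tried on a raw coordinate and on an engineered feature.

```python
import random

def best_threshold_accuracy(values, labels):
    """Best accuracy of any single-threshold rule on one feature."""
    best = 0.0
    for t in sorted(set(values)):
        # Rule: predict class 1 when value < t.
        acc = sum((v < t) == (y == 1) for v, y in zip(values, labels)) / len(labels)
        # The mirrored rule (predict 1 when value >= t) scores 1 - acc.
        best = max(best, acc, 1.0 - acc)
    return best

# Hypothetical data: points in a square, class 1 if inside a circle of radius 0.7.
rng = random.Random(0)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(500)]
labels = [1 if x * x + y * y < 0.49 else 0 for x, y in points]

raw_x = [x for x, _ in points]                    # raw representation
radius_sq = [x * x + y * y for x, y in points]    # engineered feature

acc_raw = best_threshold_accuracy(raw_x, labels)
acc_engineered = best_threshold_accuracy(radius_sq, labels)
print(f"raw x: {acc_raw:.2f}, x^2 + y^2: {acc_engineered:.2f}")
```

The model did not change between the two runs. Only the representation did, and that is what makes the classes separable.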
4 libraries you will never fully learn, but you'll have to keep trying:
- NumPy
- Pandas
- Scikit-Learn
- Matplotlib
Learn how to use notebooks (Jupyter, Google Colab).
They will be an essential part of your career.
Here is the overall, high-level machine learning process that you'll need to follow:
1. Define the Problem
2. Prepare Data
3. Choose an Algorithm
4. Improve Results
5. Present Results
If you want to get more specific, here are some of the steps:
▫️ Analyze the problem
▫️ Gather the data
▫️ Prepare the data
▫️ Choose the right model
▫️ Train the model
▫️ Evaluate the results
▫️ Look for biases
▫️ Tune it
▫️ Deploy the model
▫️ Monitor it
▫️ Retrain it
9 questions you need to keep asking:
1. What problem am I solving?
2. Why do I need to solve it?
3. What data do I have?
4. How is the data biased?
5. How do I transform it?
6. How do I collect more?
7. How do I model this?
8. What does success look like?
9. How do I productize it?
Neural networks are hot. I'd recommend you ignore them at the beginning.
Instead, here is a good list to kick off your learning:
1. Linear regression
2. Logistic regression
3. Decision Trees
4. K-NN
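To show how approachable the first item on that list is, here is a minimal sketch of one-feature linear regression using the ordinary least squares closed form. The five data points and the function name `fit_line` are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical measurements that roughly follow y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(xs, ys)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

Writing it by hand once makes it much easier to understand what a library is doing for you later.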
I source good problems from Kaggle.
I'd recommend you start with the Titanic competition.
The House Prices competition is another great problem where you can learn a ton.
Machine learning doesn't end with a model.
If you can't serve those predictions in a reliable and scalable way, you will have a hard time selling your value.
(Research positions work differently.)
A real problem: we are putting models out there that are horribly biased and shaping society in ways we are just beginning to see.
Ethics is not an optional subject.
Looking for kick-ass online courses?
- Coursera - Machine Learning
- Coursera - Deep Learning Specialization
- MIT 6.S191 Introduction to Deep Learning
- NYU DS-GA 1008 Deep Learning
- UC Berkeley Full Stack Deep Learning
- Cornell Tech CS 5787 Applied Machine Learning
Machine learning is not easy, but it's not impossible.
Take it one day at a time.
I promise it's going to be well worth it.
Especially with deep learning, where you have many layers full of nodes, it's hard to understand the "thinking" of a network because you have to reverse-engineer millions of float values and try to make sense of them.
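To get a feel for how many float values that can be, here is a quick back-of-the-envelope sketch. The layer sizes are an assumption picked for illustration (a flattened 224x224 RGB image feeding two hidden layers of 1,024 nodes and 10 outputs), not any particular published architecture, and `dense_parameter_count` is a hypothetical helper.

```python
def dense_parameter_count(layer_sizes):
    """Weights plus biases for a fully connected network with these layer widths."""
    return sum(
        n_in * n_out + n_out  # one weight per connection, one bias per output node
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# Assumed layer widths: input, two hidden layers, output.
layer_sizes = [224 * 224 * 3, 1024, 1024, 10]
total = dense_parameter_count(layer_sizes)
print(f"{total:,} trainable floats")
```

Even this small, plain network lands far past a million values, which is why inspecting the weights one by one is not a realistic way to explain a prediction.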
The ability to reuse the knowledge of one model and adapt it to solve a different problem is one of the most consequential breakthroughs in machine learning.
Grab your ☕️ and let's talk about this.
🧵👇
A deep learning model is like a Lego set, with many pieces connected, forming a long structure.
These pieces are layers, and each layer has a responsibility.
Although we don't know the exact role of every layer, we know that the closer they get to the output, the more specific they get.
The best way to understand what I mean is through an example: a model that will process car images.