Santiago

@svpino

8 Nov, 12 tweets, 3 min read

If I were to start building a career in machine learning today, here is where I'd focus:

1. Python from the get go.
2. Learn how to build software.

I'd take my time here and avoid rushing into the "machine learning" specific stuff.

Something interesting happens here: ↓

A lot of people start learning software development because they want to get into machine learning.

Then they realize that machine learning is not what they care about.

This is great: there are many ways to build a successful career in the software industry.

As soon as you're comfortable, here is what I'd tackle next:

3. Machine learning fundamentals
4. Hands-on machine learning

I like to cover these at the same time, instead of one after the other: learn some theory, then apply it right away.

Something to keep in mind:

To get through some of the machine learning theory, you'll need to understand some of the math that makes things possible.

Do you need to be a mathematician? Not even close.

High-school level math should get you through most of what you need at this stage.

The two most important things you can do at this point:

1. Solve a lot of problems.
2. Answer a lot of questions.

This is what "practice" looks like.

Bonus points if you can incorporate some of the things you are learning into your daily work. ← I was able to do this.

There are a few adjacent areas that will round out your skills:

5. How to build a RESTful API?
6. How to containerize it?
7. How to deploy it somewhere?

Basically, we want to go from "the model runs in my notebook" to "holy shit! the model runs in the cloud!"
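
A minimal sketch of the first of those steps (wrapping a model in an HTTP API), using only Python's standard library. Everything named here is an assumption made for illustration: `predict` is a hypothetical stand-in for a trained model, and the `/predict` route, the port, and the JSON shape are arbitrary choices, not something the thread prescribes.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear scorer."""
    weights = [0.5, -0.25, 1.0]
    return sum(w * x for w, x in zip(weights, features))


def handle_payload(body):
    """Turn a raw JSON request body into an (HTTP status, response dict) pair."""
    try:
        features = json.loads(body)["features"]
        return 200, {"prediction": predict(features)}
    except (ValueError, KeyError, TypeError):
        return 400, {"error": 'expected JSON like {"features": [1, 2, 3]}'}


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Only one route in this sketch; anything else is a 404.
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_payload(self.rfile.read(length))
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8000):
    """Start the server; blocks until the process is interrupted."""
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the API locally. Keeping the request logic in `handle_payload`, separate from the HTTP handler, means it can be tested without a running server. Containerizing and deploying then amount to packaging this same script and running it somewhere other than your laptop.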

I've noticed that many people who work for big tech companies have built a "focus on a single area" mentality.

They can afford big teams working on a single problem.

Unfortunately, not everyone works for big tech.

Most companies out there need engineers that can wear many hats.

This means that you'll have to:

1. Build datasets
2. Train models
3. Deploy them
4. Monitor them
5. Maintain them

This is not about "being cheap" or "asking too much." This is the reality out there.

Things I didn't refer to specifically, but you should definitely consider:

1. A strong foundation in Computer Science fundamentals helps.

2. Understanding how to work with both relational and non-relational databases is a must.

3. Invest time in communication. Lots of it.

Last week I posted a few bullets about this, and some people replied with a version of "that will take too long."

Indeed it will. It has taken me a couple of decades and I still know very little.

This is a lifelong journey.

Takes time, but it's incredibly rewarding.

https://twitter.com/realchrisebert/status/1457680106579705871

Beneficial: absolutely!

Required: not unless you want to build a research-focused career.

https://twitter.com/Machine01776819/status/1457735955763322882?s=20

Here is a very good counter-argument to my statement of "not worrying about machine learning at the start."

More from @svpino

2 Nov

Here is the story of one of those hidden issues with machine learning models that books don't tell you about.

This happened in real life: ↓

Imagine you are building a computer vision model.

It goes something like this:

1. Load a dataset of images
2. Train a model with those images
3. Export the final model

Pretty standard stuff.

To make it more specific, let's imagine that you are using OpenCV to load the images from the disk.

Something like the attached screenshot.

Nothing fancy here, right?

29 Oct

A step-by-step guide to your first Computer Vision problem and 10 questions you should answer after that.

No math and no fancy degrees. If you can read Python, you can do this.

↓

If this is your first time looking at this type of problem, my goal is for you to get familiar with some of the high-level ideas.

There will be some hand-waving, but don't worry about that. Focus on the process and the big pieces.

Here is a @DeepnoteHQ notebook with the code and the entire documentation.

You can open it and run it yourself step by step:

deepnote.com/@svpino/MNIST-…

26 Oct

Here is a problem for you to solve:

How many total handshakes will happen between 10 different people, assuming everyone shakes hands with everyone else?

Don't start drawing things on paper. There's a simple way to solve this: ↓

Let's talk about "triangular series" really quick:

Here is an example of one: 1 2 3 4 5.

I know because I can organize these numbers in a triangle like the attached image shows.

Each row has as many points (*'s) as the number it represents.

Triangular series always start with 1. We can use "n" to denote the highest number of the series.

So in our [1 2 3 4 5] example, n = 5.
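
The preview is cut off here, so the following is not necessarily the author's solution. It is a short sketch of how a triangular series answers the handshake question, assuming each pair shakes hands exactly once: the first person shakes 9 hands, the second adds 8 new ones, and so on down to 1. The function names are made up for illustration.

```python
from itertools import combinations


def triangular(n):
    """Sum of the series 1 + 2 + ... + n, using the closed form n(n + 1) / 2."""
    return n * (n + 1) // 2


def handshakes(people):
    """Each new person shakes hands with everyone already counted,
    so the total is the triangular series ending at people - 1."""
    return triangular(people - 1)


# Brute-force check: count every unordered pair of 10 people.
pairs = len(list(combinations(range(10), 2)))

print(triangular(5))   # 15
print(handshakes(10))  # 45
print(pairs)           # 45
```

The brute-force count agrees with the closed form, which is the point of the shortcut: no drawing on paper needed.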

24 Oct

Full-stack Machine Learning Engineers are becoming one of the hottest commodities out there.

A full-stack machine learning engineer is a person who’s capable of working on the design, implementation, deployment, and maintenance of a machine learning system.

Different people expand or contract the term “Full-Stack” at their convenience.

That’s ok. We don’t need a dictionary to talk about this.

Full-stack is when you can work on end-to-end systems.

22 Oct

What's a machine learning pipeline?

Well, it turns out that many different things qualify as "machine learning pipelines."

Here are five of the different "pipelines" you should be aware of: ↓

Our first pipeline: "Data pipeline."

This goes from ingesting the data from its sources to the final destination where we will consume it.

Sometimes, the data pipeline includes transformations of that data. Sometimes it doesn't.

This leads me to the second pipeline.

The second pipeline: "Data transformation pipeline."

"Wait, I thought this was part of the data pipeline?" You are right; sometimes it is. Sometimes it isn't.

Sometimes, you need to separate "general" transformations from use case-specific transformations.
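
One way to picture that separation, as a rough sketch with made-up data: `ingest`, `general_transform`, and `churn_model_transform` are hypothetical names, and the records and the "churn model" use case are invented purely to show where the boundary between general and use case-specific steps might sit.

```python
def ingest():
    """Hypothetical source: raw records exactly as they arrive."""
    return [
        {"age": "34", "city": " Miami "},
        {"age": None, "city": "Boston"},
        {"age": "51", "city": "miami"},
    ]


def general_transform(records):
    """Clean-up every consumer wants: drop incomplete rows, fix types and text."""
    return [
        {"age": int(r["age"]), "city": r["city"].strip().lower()}
        for r in records
        if r["age"] is not None
    ]


def churn_model_transform(records):
    """Use case-specific step for one hypothetical model: scale age to (0, 1]."""
    oldest = max(r["age"] for r in records)
    return [[r["age"] / oldest] for r in records]


clean = general_transform(ingest())      # reusable by any downstream consumer
features = churn_model_transform(clean)  # only meaningful for this one model
```

Here the general step could live inside the data pipeline, while the scaling step belongs to the model that needs it. A second model would reuse `clean` and bring its own transformation.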

19 Oct

One of the most useful things you can learn:

Greedy algorithms, how they work, and how to solve problems using them.

Here is why they are fundamental: ↓

Greedy algorithms:

• Pretty intuitive to understand
• Easy to come up with
• A great way to solve many problems

Optimization is the root of all evil. Many times, a greedy solution is all you need to solve a problem.

At each step, a greedy algorithm always makes the choice that looks best at that moment (the locally optimal choice).

(Unfortunately, this approach is not always guaranteed to converge to the optimal solution. More about this later.)
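
To make both points concrete, here is a small sketch using coin change, a common textbook illustration chosen for this note and not taken from the thread. The function name and the coin sets are assumptions for the example. The greedy rule is "always take the largest coin that still fits."

```python
def greedy_change(amount, coins):
    """Make change by repeatedly taking the largest coin that still fits.

    Assumes the coin set includes 1, so any amount can be completed.
    """
    picked = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            picked.append(coin)
    return picked


# With these coins, the greedy choice happens to be optimal: 6 coins.
print(greedy_change(63, [25, 10, 5, 1]))  # [25, 25, 10, 1, 1, 1]

# With these coins, it is not: greedy uses 3 coins, but 3 + 3 needs only 2.
print(greedy_change(6, [4, 3, 1]))        # [4, 1, 1]
```

The second call shows the caveat: every individual step was the best available at the time, yet the overall answer is worse than the optimal one.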

Here is an example problem where you could use a greedy algorithm:
