Santiago
2 Jan, 6 tweets, 1 min read
5 differences between a machine learning system and the software you are building today:

🧵👇
1. The Team

Usually, a machine learning system needs the involvement of many different disciplines:

- Data Scientists
- Data Engineers
- Machine Learning Engineers

Plus the same roles that a conventional software system needs.

👇
2. The Development Process

Machine learning is a very experimental process. Creating a model requires a lot of exploration, usually not needed in software development.

👇
3. The Testing Process

Testing a machine learning system is much more involved than testing a regular piece of software.

Here are three steps unique to machine learning:

▫️ Data validation
▫️ Testing model updates
▫️ Model validation
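
Here is a minimal sketch of what these steps could look like on a toy dataset. Everything in it is an assumption made for illustration: the `SCHEMA`, the function names, and the 0.8 accuracy bar are hypothetical and do not come from any specific library.

```python
# Hypothetical schema: column name -> (expected type, minimum, maximum).
SCHEMA = {
    "age": (int, 0, 120),
    "income": (float, 0.0, 10_000_000.0),
}


def validate_rows(rows):
    """Data validation: return a list of problems found in the incoming rows."""
    problems = []
    for i, row in enumerate(rows):
        for column, (expected_type, low, high) in SCHEMA.items():
            if column not in row:
                problems.append(f"row {i}: missing '{column}'")
            elif not isinstance(row[column], expected_type):
                problems.append(f"row {i}: '{column}' is not {expected_type.__name__}")
            elif not low <= row[column] <= high:
                problems.append(f"row {i}: '{column}' is out of range")
    return problems


def accuracy(predict, examples):
    """Fraction of (features, label) pairs that predict() gets right."""
    correct = sum(1 for features, label in examples if predict(features) == label)
    return correct / len(examples)


def validate_model(candidate, current, examples, min_accuracy=0.8):
    """Model validation for an update: the candidate must clear an absolute
    bar and must not be worse than the model currently in production."""
    candidate_accuracy = accuracy(candidate, examples)
    return (candidate_accuracy >= min_accuracy
            and candidate_accuracy >= accuracy(current, examples))
```

None of these checks exist in a conventional test suite: they test the data and the model's behavior, not the code.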

👇
4. The Deployment Process

In a machine learning system, you are dealing with an entire pipeline, from data collection and model training all the way to automatic model monitoring.

This pipeline is much more complex than a regular CI/CD cycle in software development.
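
A toy sketch of such a pipeline as a chain of stages. Every stage below is a placeholder assumption: a real system would swap in actual data collection, training, and deployment code, and would add monitoring, which is left out here for brevity.

```python
def collect_data():
    # Placeholder: a real stage would pull from databases, logs, or APIs.
    return [(float(x), 2.0 * x + 1.0) for x in range(10)]


def validate_data(data):
    # Fail fast on empty or malformed input before spending time on training.
    if not data or any(len(example) != 2 for example in data):
        raise ValueError("invalid training data")
    return data


def train_model(data):
    # Fit y = slope * x + intercept with ordinary least squares.
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in data)
    variance = sum((x - mean_x) ** 2 for x, _ in data)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def evaluate_model(model, data):
    # Mean absolute error on held-out examples.
    return sum(abs(model(x) - y) for x, y in data) / len(data)


def run_pipeline(max_error=0.5):
    data = validate_data(collect_data())
    train, holdout = data[:8], data[8:]
    model = train_model(train)
    error = evaluate_model(model, holdout)
    if error > max_error:
        # Deployment gate: a model that fails validation never ships.
        raise RuntimeError(f"model rejected, holdout error {error:.3f}")
    return model, error
```

In a regular CI/CD cycle only the code changes between runs. Here the data changes too, so every stage has to run again whenever new data arrives.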

👇
5. Monitoring

Data is constantly changing, and it directly impacts the performance of machine learning systems in production.

A model's performance decays even when nobody touches its code, because production data drifts away from the data it was trained on.

This requires constant monitoring to detect and correct drift.
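
One simple way to watch for drift, sketched under the assumption that you keep a reference sample of a numeric feature from training time: measure how far the mean of a recent production window has moved, in units of the reference standard deviation. The function names and the threshold of 3 are illustrative choices. Production systems often rely on statistical tests or dedicated monitoring tools instead.

```python
import statistics


def drift_score(reference, recent):
    """How many reference standard deviations the recent mean has moved."""
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference)
    recent_mean = statistics.mean(recent)
    if ref_std == 0:
        # A constant reference feature: any change at all counts as drift.
        return 0.0 if recent_mean == ref_mean else float("inf")
    return abs(recent_mean - ref_mean) / ref_std


def has_drifted(reference, recent, threshold=3.0):
    """Flag the feature when the shift exceeds the (assumed) threshold."""
    return drift_score(reference, recent) > threshold
```

A check like this would run on a schedule against fresh production data and trigger an alert or a retraining job when it fires.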

• • •

More from @svpino

3 Jan
A machine learning workflow:

1. Define the problem
2. Assemble a dataset
3. Determine success metrics
4. Decide on evaluation method
5. Prepare the data
6. Establish a baseline
7. Develop a model that beats the baseline
8. Overfit model
9. Regularize model
10. Tune model
Where's model validation in this workflow?

Notice that steps 8, 9, and 10 presume the existence of a mechanism to evaluate the model. This means that model validation is implicitly part of this workflow.
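
Steps 6 and 7, together with the evaluation mechanism they rely on, can be sketched like this. The majority-class baseline and the function names are assumptions chosen for illustration: the right baseline and metric depend on the problem.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Step 6: a baseline that always predicts the most common training label."""
    most_common_label = Counter(train_labels).most_common(1)[0][0]
    return lambda features: most_common_label


def evaluate(model, examples):
    """The evaluation mechanism: accuracy on (features, label) pairs."""
    correct = sum(1 for features, label in examples if model(features) == label)
    return correct / len(examples)


def beats_baseline(model, baseline, validation_examples):
    """Step 7: the model only counts if it outperforms the baseline."""
    return evaluate(model, validation_examples) > evaluate(baseline, validation_examples)
```

The same `evaluate` function is what steps 8, 9, and 10 would call repeatedly while overfitting, regularizing, and tuning.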
"Assembling a dataset" focuses on identifying the data sources that we will need to solve the problem.

Before we can define success metrics, we need access to the data that we will be using.

Later, "Preparing the data" focuses on getting that data ready for the model.
31 Dec 20
For a long time, I didn't understand how to use Virtual Environments in Python 🐍.

If this is you, let's end it here and now: 🧵👇
[2] Virtual Environments let you manage the dependencies that your code has on external Python libraries.

They avoid conflicts when your projects depend on different versions of the same library.
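
A minimal sketch of creating an isolated environment with the standard library's `venv` module. The helper name `create_environment` and the `.venv` folder name are assumptions made for illustration. On the command line, `python -m venv .venv` does the same job and also installs pip into the environment.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(project_dir):
    """Create an isolated environment in <project_dir>/.venv and return its path."""
    env_dir = Path(project_dir) / ".venv"
    # with_pip=False keeps this sketch fast and offline.
    # Pass with_pip=True to bootstrap pip into the new environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


# Usage: build the environment inside a throwaway project folder.
with tempfile.TemporaryDirectory() as project_dir:
    env_dir = create_environment(project_dir)
    # pyvenv.cfg is the marker file that identifies a virtual environment.
    created = (env_dir / "pyvenv.cfg").is_file()
    print(env_dir.name, created)
```

Each project gets its own folder like this, so two projects can depend on different versions of the same library without stepping on each other.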

👇
[3] Let's imagine that you are building your first Python project and you install the "requests" library:

pip install requests

You get version 2.24.0 installed in your system.

👇
28 Dec 20
I told everyone that I didn't care.

"Screw math! I've never been great with it, so I'm not starting with machine learning to fail at the end."

That was many years ago.

Math is still hard, but I don't think you should be scared at all. Here is why: 🧵👇
[1] One thing changed my mind: school pushed me to the deep end of the pool, and I was forced to swim.

I had to face my fears. Once I started machine learning, I realized that the math involved is not as scary (or as extensive) as some people believe.

👇
[2] Probably one of the most frequent questions I get is around the math needed for machine learning.

Answer:

▫️ Probability and Statistics
▫️ Linear Algebra
▫️ Calculus

But it turns out that this answer is not helpful.

👇
27 Dec 20
I've worked with Dell, HP, IBM, Cisco, HSBC, Disney, G4S, among other large companies.

Don't think for a minute that they have things figured out.

They have amazing development teams. They also have mediocre and straight-horrible teams.

🧵👇
[2] In my experience, smaller companies tend to be more selective when hiring: they can't afford to make a mistake.

I've found that these smaller companies consistently build decent teams. (Although they have a harder time hiring talent.)

👇
[3] Larger companies, on the other hand, build teams across many different departments. Maintaining consistency is hard, if not impossible.

I met excellent teams: sharp, organized, building excellent products using state-of-the-art technology.

👇
27 Dec 20
An introduction to one of the most basic structures used in machine learning: a tensor.

🧵👇
Tensors are the data structure used by machine learning systems, and getting to know them is an essential skill you should build early on.

A tensor is a container for numerical data. It is the way we store the information that we'll use within our system.

(2 / 16)
Three primary attributes define a tensor:

▫️ Its rank
▫️ Its shape
▫️ Its data type
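
A small sketch of these three attributes using plain nested Python lists. The helpers `shape_of` and `describe` are hypothetical and assume a non-empty, rectangular tensor. In practice you would read the same information from a NumPy array through its `ndim`, `shape`, and `dtype` attributes.

```python
def shape_of(tensor):
    """Shape of a rectangular nested list: the length along each axis."""
    shape = []
    while isinstance(tensor, list):
        shape.append(len(tensor))
        tensor = tensor[0]  # Assumes every axis has at least one element.
    return tuple(shape)


def describe(tensor):
    """Return the rank, shape, and element type of a nested-list tensor."""
    shape = shape_of(tensor)
    # Walk down to the first scalar to read the element type.
    element = tensor
    while isinstance(element, list):
        element = element[0]
    return {"rank": len(shape), "shape": shape, "dtype": type(element).__name__}


scalar = describe(5.0)                                     # rank 0
vector = describe([1, 2, 3])                               # rank 1
matrix = describe([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])      # rank 2
```

The rank is just the number of axes, so it always equals the length of the shape.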

(3 / 16)
26 Dec 20
11 key concepts of Machine Learning.

- Supervised Learning Edition -

🧵👇
😜

Before starting, remember that, if you follow me, one of your enemies will be immediately destroyed (and you'll get to read more of these threads, of course.)

And if you don't follow me, well, you just hurt my feelings.

😜
1. Labels

(Also referred to as "y")

The label is the piece of information that we are predicting.

For example:

- the animal that's shown in a picture
- the price of a house
- whether a message is spam or not
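
A quick sketch of how labels are usually kept apart from the rest of the data, using the house price example. The helper name and the toy records are made up for illustration.

```python
def split_features_and_labels(examples, label_key):
    """Separate each example into its input features and the label to predict."""
    features = [
        {key: value for key, value in example.items() if key != label_key}
        for example in examples
    ]
    labels = [example[label_key] for example in examples]
    return features, labels


# Hypothetical records: the price is the label, everything else is a feature.
houses = [
    {"bedrooms": 3, "square_feet": 1500, "price": 300_000},
    {"bedrooms": 2, "square_feet": 900, "price": 180_000},
]
X, y = split_features_and_labels(houses, label_key="price")
```

Here `y` holds the prices we want to predict, and `X` holds what the model gets to look at.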

👇
