Pau Labarta Bajo
Jul 7 • 10 tweets • 4 min read
Wanna learn MLOps?

Stop reading blog posts.
Build a prediction service instead 🚀

Here is a project you can build (for free) 👩🏽‍💻👨‍💻↓↓↓
Let's build a Machine Learning service to predict the Air Quality Index (AQI) in your city in the next 3 days, using a 100% serverless stack.

You will learn a lot, AND you will build something useful for society.

Win-win 🏆🏆

These are the steps to build it ↓
Step 1: Feature generation script

1 → fetches raw weather and pollutant data from an external API like aqicn.org/city/barcelona

2 → computes features (aka model inputs) and targets (aka model outputs) from this raw data

3 → stores these features and targets in the *Feature Store*
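Here is a minimal sketch of sub-step 2, the part that turns raw data into (features, targets). The record keys (`aqi`, `pm25`, `temperature`), the 3-day lag window, and the feature names are assumptions made up for illustration; the thread does not say which features to compute. Fetching from the API and writing to the Feature Store are left out on purpose.

```python
from statistics import mean

HORIZON_DAYS = 3  # the service predicts AQI for each of the next 3 days
N_LAGS = 3        # assumption: use the last 3 days of history as features


def build_rows(readings):
    """Turn raw daily readings into (features, targets) rows.

    `readings` is a list of dicts sorted by date. The keys "date", "aqi",
    "pm25" and "temperature" are hypothetical, not a real API schema.
    """
    rows = []
    # A day is usable only if it has enough history behind it
    # and enough future ahead of it to fill in the targets.
    for i in range(N_LAGS - 1, len(readings) - HORIZON_DAYS):
        window = readings[i - N_LAGS + 1 : i + 1]
        row = {
            "date": readings[i]["date"],
            "temperature": readings[i]["temperature"],
            "pm25_mean_3d": mean(r["pm25"] for r in window),
        }
        # Lag features: aqi_lag_0 is today, aqi_lag_1 is yesterday, ...
        for lag, reading in enumerate(reversed(window)):
            row[f"aqi_lag_{lag}"] = reading["aqi"]
        # Targets: the AQI observed 1, 2 and 3 days later.
        for h in range(1, HORIZON_DAYS + 1):
            row[f"aqi_plus_{h}d"] = readings[i + h]["aqi"]
        rows.append(row)
    return rows
```

Note that the most recent 3 days never produce a training row, because their targets are not known yet. At prediction time you compute only the features for today.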
Step 2: Backfill historical (features, targets) ⏮️

To train a Machine Learning model later, you need enough historical data (features, targets) in your Feature Store.

Run the feature script for a range of past dates, to get enough training data.
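A backfill can be as simple as a loop over dates. The sketch below assumes the feature script can be called for one day at a time; `run_for_day` and the dict-based `store` are hypothetical stand-ins for that script and for the Feature Store.

```python
from datetime import date, timedelta


def date_range(start, end):
    """Yield every date from start to end, both inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def backfill(start, end, run_for_day, store):
    """Run the feature script once per past day and keep what it returns.

    `run_for_day` stands in for the feature script, `store` for the
    Feature Store (here just a dict keyed by date).
    """
    failed = []
    for day in date_range(start, end):
        if day in store:  # already backfilled, so re-running is cheap
            continue
        try:
            store[day] = run_for_day(day)
        except Exception as err:  # e.g. the API has no data for that day
            failed.append((day, str(err)))
    return failed
```

Collecting failures instead of stopping at the first one means a single bad day does not cost you the whole run, and calling `backfill` again retries only the days that are still missing.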
Step 4: Model training script 🏋️

1 → fetches historical (features, targets) from the Feature Store.

2 → trains and evaluates the best ML model possible for this data, e.g. XGBoost's XGBRegressor.

3 → stores the trained model in the Model Registry.
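One way to read "evaluate" in sub-step 2: split the data by time, and check the model against a dumb baseline. The sketch below assumes rows are dicts sorted by date and that the model exposes scikit-learn style `fit` and `predict`, which `XGBRegressor` does. `MeanModel`, the column names, and the 20% test split are placeholders, and with a real library you will probably want to pass arrays instead of plain lists.

```python
def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def time_split(rows, test_fraction=0.2):
    """Chronological split: oldest rows for training, newest for testing."""
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[:-n_test], rows[-n_test:]


def evaluate(model, rows, feature_names, target_name, baseline_feature):
    """Return (model MAE, baseline MAE) on the most recent rows."""
    train, test = time_split(rows)
    X_train = [[r[f] for f in feature_names] for r in train]
    y_train = [r[target_name] for r in train]
    X_test = [[r[f] for f in feature_names] for r in test]
    y_test = [r[target_name] for r in test]

    model.fit(X_train, y_train)
    model_mae = mean_absolute_error(y_test, model.predict(X_test))

    # Persistence baseline: predict that the future AQI equals today's AQI.
    baseline_mae = mean_absolute_error(
        y_test, [r[baseline_feature] for r in test]
    )
    return model_mae, baseline_mae


class MeanModel:
    """Placeholder model that always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]
```

A random shuffle would leak future data into training, so the split is by time. If your model cannot beat the persistence baseline, it is not ready for the Model Registry.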
Step 5: Automate execution of the feature script 🕰️

Create a GitHub Actions workflow to automatically run the feature script (from step 1) every hour.

GitHub Actions gives you serverless compute to run your code on a schedule. Free for public repositories.

Beautiful.
Step 6: Create a web app to show model predictions 👨🏽‍💻

Streamlit is a powerful Python library to develop and deploy web data apps.

Your app:

1 → loads the model from the Model Registry and the latest features from the *Feature Store*,

2 → computes model predictions and shows them on a beautiful UI.

BOOM!
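A minimal sketch of the display side of such an app. It assumes the 3 predictions are already computed, so loading the model and features is out of scope here. `build_forecast` and `render` are hypothetical helper names; `st` is the imported `streamlit` module, passed in as an argument so the formatting logic can be tested without a running app. The category labels follow the US EPA AQI scale.

```python
from datetime import date, timedelta

# US EPA AQI bands: (upper bound, label).
AQI_CATEGORIES = [
    (50, "Good"),
    (100, "Moderate"),
    (150, "Unhealthy for Sensitive Groups"),
    (200, "Unhealthy"),
    (300, "Very Unhealthy"),
]


def aqi_category(aqi):
    for upper, label in AQI_CATEGORIES:
        if aqi <= upper:
            return label
    return "Hazardous"


def build_forecast(today, predictions):
    """Attach a date and a category to each predicted AQI value."""
    forecast = []
    for i, predicted in enumerate(predictions):
        aqi = round(predicted)
        forecast.append({
            "date": (today + timedelta(days=i + 1)).isoformat(),
            "aqi": aqi,
            "category": aqi_category(aqi),
        })
    return forecast


def render(st, city, forecast):
    """Draw the forecast, one metric per day."""
    st.title(f"Air quality forecast for {city}")
    for row in forecast:
        st.metric(label=row["date"], value=row["aqi"])
        st.caption(row["category"])
```

In an `app.py` file you would `import streamlit as st`, call `render(st, "Barcelona", build_forecast(date.today(), predictions))`, and start it with `streamlit run app.py`.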
Bonus 🎁

You can create another GitHub action to automate the model training script.

Why re-train the model? 🤔

Because ML model performance decreases over time, as the data the model sees in production drifts away from the data it was trained on.
The best way to mitigate this is to re-train the model regularly, for example once a week.
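If you want to see that decay for yourself, log each prediction next to the AQI that was actually observed, and track the error week by week. This is a sketch under assumptions: the `(actual, predicted)` log format, the 7-day window, and the 1.5x tolerance are arbitrary choices, not something the thread prescribes.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def weekly_mae(log, week_size=7):
    """Compute one MAE per full week.

    `log` is a chronological list of (actual_aqi, predicted_aqi) pairs,
    one per day. A trailing partial week is ignored.
    """
    maes = []
    for start in range(0, len(log) - week_size + 1, week_size):
        week = log[start : start + week_size]
        actual = [a for a, _ in week]
        predicted = [p for _, p in week]
        maes.append(mean_absolute_error(actual, predicted))
    return maes


def has_degraded(maes, tolerance=1.5):
    """True when the latest weekly error exceeds the first week's by `tolerance`x."""
    return len(maes) >= 2 and maes[-1] > tolerance * maes[0]
```

A weekly scheduled re-train does not need this check to work, but the numbers tell you whether once a week is often enough.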
Wanna level up in ML/MLOps?

Join my e-mail list and get one article every Saturday morning ↓
datamachines.xyz/subscribe/
Every week I share real-world Data Science/Machine Learning content.

Follow me @paulabartabajo_ so you do not miss what's coming next.

Wanna help?
Like/Retweet the first tweet below to spread the wisdom ↓↓↓
