Pau Labarta Bajo
The Real-World ML guy | Learn to build real-world ML apps at https://t.co/xWr8Hm8zI5
Dec 16 • 4 tweets • 2 min read
ML Project Idea 💡

Let's predict air quality ↓

Here is a full example, with source code, to learn how to build a complete ML app that predicts air quality in different European cities.

Clone the code, modify it, and deploy it!
github.com/logicalclocks/…
Dec 15 • 18 tweets • 4 min read
Are you a data scientist using CSV files to store your data?

What if I told you there is a better way?

Can you imagine a

-> lighter 🦋
-> faster 🏎️
-> cheaper 💸

file format to save your datasets?

Read this thread so you don't need to imagine anymore 👇🏾

Do not get me wrong. I love CSVs.

You can open them with any text editor, inspect them and share them with others.

They have become the standard file format for datasets in the AI/ML community.

However, they have a little problem...
Dec 11 • 9 tweets • 3 min read
3 years ago I struggled to build ML products.

Then I discovered this ↓

Unless you are a researcher in academia, and your goal is to publish a paper, you cannot just focus on the ML model you wanna train.

You need to think further down the line and think of the business problem you are trying to solve.

This is the "product-first" mindset.
Oct 22 • 15 tweets • 4 min read
Let's design an ML system to predict crypto prices, step-by-step ↓🧵

The problem

We want to build a real-time API that serves short-term predictions of crypto prices.

For example, to predict the price of Ethereum (ETH) in the next 10 seconds.
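
To make the prediction target concrete, here is a minimal sketch of how training labels for a 10-second horizon could be derived from a list of timestamped prices. The `Tick` type, the `make_labels` function and the sample prices are illustrative assumptions and do not come from the thread.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class Tick:
    """A single observed price (hypothetical structure for illustration)."""
    timestamp: float  # seconds since some reference time
    price: float


def make_labels(ticks: list[Tick], horizon_s: float = 10.0) -> list[tuple[Tick, float]]:
    """Pair each tick with the first price observed at least `horizon_s` later.

    Ticks without an observation that far ahead are dropped, because they
    have no target yet. Assumes `ticks` is sorted by timestamp.
    """
    timestamps = [t.timestamp for t in ticks]
    labeled = []
    for tick in ticks:
        idx = bisect_left(timestamps, tick.timestamp + horizon_s)
        if idx < len(ticks):
            labeled.append((tick, ticks[idx].price))
    return labeled


# Made-up ETH prices sampled every 5 seconds.
ticks = [Tick(0, 3000.0), Tick(5, 3001.5), Tick(10, 2999.0), Tick(15, 3002.0)]
for tick, future_price in make_labels(ticks):
    print(f"t={tick.timestamp}s price={tick.price} -> target={future_price}")
```

Only the first two ticks get a label here, since the last two have no price 10 seconds ahead. A real-time system would face the same delay: the true target for a prediction only becomes known once the horizon has passed.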
Oct 20 • 7 tweets • 2 min read
One skill every ML engineer has to master ↓

ML System design

Yes. And do you know why?

Because good ML system design hasn't changed at all in the last 5 years.

And it won't.
Oct 13 • 15 tweets • 3 min read
Every aspiring data scientist I talk to is overwhelmed by the colossal amount of online courses to choose from 🤯

My solution to this problem ↓

Learning is about connecting the dots.

However, it feels like there are too many dots to connect when learning data science.

Too many courses...
Too many blog posts...
Too many technologies...

Solution: You need to change the way you learn.
Sep 22 • 8 tweets • 2 min read
Time-series are used everywhere

→ At Uber to optimize fleet efficiency
→ At Amazon to forecast inventory levels
→ At every hedge fund to project asset prices.

Still, there is a lack of ML engineers who can build real-world time-series products.

So here is your chance ↓

Let's build a *complete* ML service that forecasts taxi rides in NYC, similar to what Uber does to forecast demand.

The 3 ingredients we need are:

- a dataset
- a Python library to build a good predictive model
- a deployment strategy

For example ↓
Sep 18 • 10 tweets • 3 min read
3 years ago I struggled to land my first freelance ML engineering contract.

Then I discovered this ↓

Building one professional real-world ML project is the best way to stand out from the crowd, and land an ML job.

Here is what I did, step-by-step 👩‍💻👨🏽‍💻↓
Sep 18 • 12 tweets • 3 min read
Let's build an AI Coding assistant with Llama3 ↓🧵🦙

Step 1. Download llama3 with Ollama 🦙

Ollama is an open-source tool for running Large Language Models locally. You can download it for free here:

ollama.com/download
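
Once Ollama is installed and the model has been pulled (`ollama pull llama3`), the local server can be queried over HTTP. The sketch below uses only the Python standard library and assumes Ollama is listening on its default port, 11434. The helper names `build_payload` and `ask_llama` are made up for illustration and are not from the thread.

```python
import json
import urllib.request

# Ollama's default local endpoint for text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt: str, model: str = "llama3") -> bytes:
    """Encode a non-streaming generation request as JSON bytes."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")


def ask_llama(prompt: str, model: str = "llama3", timeout_s: float = 120.0) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(prompt, model),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read())["response"]


# Example call (needs a running Ollama server, so it is left commented out):
# print(ask_llama("Write a Python function that reverses a string."))
```

Setting `"stream": False` asks for the whole answer in one JSON object, which keeps the client simple. An interactive coding assistant would more likely stream tokens as they arrive.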
Aug 1 • 15 tweets • 4 min read
Let's build a real-time ML system to predict short-term prices
↓↓↓🧵

The problem

We want to build a real-time API that serves short-term predictions of crypto prices.

For example, to predict the price of Ethereum (ETH) in the next 10 seconds.
Jul 3 • 11 tweets • 3 min read
Most of my ML model prototypes never reached production 😵‍💫

Until I changed my mindset 🧠↓↓↓

🔬 Model-first mindset

A model-first mindset is what Kaggle competitions and most online courses are about.

Your ONLY focus is to build the best possible mapping between a set of input features and a target metric.

And in real-world ML this is often not the best approach...
Jul 2 • 13 tweets • 4 min read
How do you build
> real-time ML systems ⚡
> at scale 🎛️
> without burning cash 💸?

🧵↓

The problem 🤔

Let's say you work as an ML engineer at a fintech startup, whose flagship product is a mobile app for online payments.

A critical problem you need to tackle from day 0 is the automatic detection of fraudulent transactions.
Jul 1 • 10 tweets • 4 min read
ML Project Idea 💡

Let's predict taxi demand in NYC in the next 60 minutes 🚕↓

Business problem 💼

Let's create a predictive model to forecast the number of taxi rides that will happen in Manhattan (New York City)

- in the next hour
- for each taxi zone (e.g. Zone 113 "Lower Manhattan")
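
Framed this way, the target is a count of rides per hour and per zone. As a rough sketch of that aggregation, assuming the raw data is a list of (pickup time, zone id) pairs, the snippet below counts rides per (hour, zone). The sample rides and the `hourly_counts` name are invented for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw rides: (pickup time, pickup zone id). Values are made up.
raw_rides = [
    (datetime(2024, 7, 1, 17, 5), 113),
    (datetime(2024, 7, 1, 17, 40), 113),
    (datetime(2024, 7, 1, 17, 59), 114),
    (datetime(2024, 7, 1, 18, 10), 113),
]


def hourly_counts(rides):
    """Count rides per (hour, zone) by truncating pickup times to the hour."""
    counts = Counter()
    for pickup_time, zone_id in rides:
        hour = pickup_time.replace(minute=0, second=0, microsecond=0)
        counts[(hour, zone_id)] += 1
    return counts


counts = hourly_counts(raw_rides)
for (hour, zone_id), n_rides in sorted(counts.items()):
    print(f"{hour:%Y-%m-%d %H:00} zone={zone_id} rides={n_rides}")
```

Each (hour, zone) count is one observation of the quantity the model is asked to forecast.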

Let's do it in 6 steps ↓
Jun 25 • 6 tweets • 2 min read
ML Project Idea 💡

Let's predict flight delays 🛬 ↓

Here is a full example, with source code, to learn how to build a complete ML app that predicts flight delays for Stockholm Arlanda airport.

Clone the code, modify it, and deploy it!
github.com/SebastianoMene…
Jun 18 • 17 tweets • 5 min read
Let's build an LLM agent in Python, step-by-step ↓🧵

Why agents 🤖❓

Because Large Language Models alone are not enough to accurately answer complex tasks that require

-> External information that was not present in the training dataset used to fit the LLM parameters
or
-> Many reasoning steps
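
Both needs can be pictured as a loop in which the model either requests a tool call or gives a final answer. The sketch below is a toy version of that loop. `fake_llm` is a hard-coded stand-in for a real model call, and the `lookup_price` tool, its data and the `ACTION:` / `OBSERVATION:` / `FINAL:` text protocol are all assumptions made up for this example.

```python
from typing import Callable


def lookup_price(symbol: str) -> str:
    """Stand-in for an external information source; the data is made up."""
    return {"ETH": "3000.0"}.get(symbol, "unknown")


# Tools the agent is allowed to call, keyed by name.
TOOLS: dict[str, Callable[[str], str]] = {"lookup_price": lookup_price}


def fake_llm(history: list[str]) -> str:
    """Hypothetical stand-in for a real LLM call.

    It requests a tool on the first turn and answers once an observation
    is available, which is the behaviour an agent prompt tries to elicit.
    """
    observations = [line for line in history if line.startswith("OBSERVATION:")]
    if not observations:
        return "ACTION: lookup_price ETH"
    return "FINAL: " + observations[-1].removeprefix("OBSERVATION: ")


def run_agent(question: str, llm=fake_llm, max_steps: int = 5) -> str:
    """Alternate between model output and tool calls until a final answer."""
    history = [f"QUESTION: {question}"]
    for _ in range(max_steps):
        reply = llm(history)
        history.append(reply)
        if reply.startswith("FINAL:"):
            return reply.removeprefix("FINAL:").strip()
        _, tool_name, argument = reply.split(maxsplit=2)
        history.append(f"OBSERVATION: {TOOLS[tool_name](argument)}")
    return "no answer within step budget"


print(run_agent("What is the current price of ETH?"))
```

The `max_steps` cap is a deliberate design choice: without it, a model that never emits a final answer would keep the loop running forever.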
Jun 8 • 9 tweets • 3 min read
Wanna learn time-series forecasting? 📈

No more reading blog posts.
It is time to forecast for real 😎

Here is a project you can build 👩‍💻🧑🏽‍💻↓

Business problem 💼

Let's create a predictive model to forecast the number of taxi rides that will happen in Manhattan (New York City)

- per hour (e.g. tomorrow between 5 PM and 6 PM), and
- per zone (e.g. Zone 113 "Lower Manhattan")

in the following 3 days.
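
One common way to turn such an hourly series into a supervised learning problem is a sliding window: past hours become features and future hours become targets. The sketch below shows the idea with tiny numbers. The function name, the window sizes and the ride counts are assumptions for illustration, not the approach the thread necessarily takes.

```python
def make_supervised(series: list[int], n_lags: int, horizon: int):
    """Slide a window over an hourly series to build (features, targets) pairs.

    Each row uses the previous `n_lags` hourly counts as features and the
    next `horizon` hourly counts as targets.
    """
    rows = []
    for end in range(n_lags, len(series) - horizon + 1):
        features = series[end - n_lags:end]
        targets = series[end:end + horizon]
        rows.append((features, targets))
    return rows


# Made-up hourly ride counts for one zone.
hourly_rides = [12, 15, 9, 20, 22, 18, 25, 30]

# Tiny numbers to keep the output readable; a 3-day forecast at hourly
# resolution would use horizon=72.
for features, targets in make_supervised(hourly_rides, n_lags=3, horizon=2):
    print(features, "->", targets)
```

Repeating this per zone gives one training table per zone, or a single table if the zone id is added as an extra feature.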
Jun 7 • 10 tweets • 3 min read
Wanna become an NLP engineer?

Stop taking online courses.
Build something instead 🏗️

Here is a project you can build 👩🏽‍💻👨‍💻↓↓↓

Reading blog posts about multi-billion-parameter Language Models is very cool.

However, building real-world NLP products from these models is where the real business value is. And this is what companies look for in the job market.

So, here is a PRO project you can build โ†“
Jun 7 • 11 tweets • 3 min read
Tired of training lots of Machine Learning models, and not getting better results? 😵‍💫

This is how you solve this 🧠↓

A Machine Learning model is the output of a 3-step workflow where you:

1 → Fetch raw data, for example from an external database.

2 → Process the data into a tabular format, so you have N features and 1 target.

3 → Train ML models (e.g. XGBoost) and tune hyper-parameters.
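
The three steps can be sketched end to end with the standard library alone. To stay dependency-free, this example swaps in a one-feature least-squares fit where the thread mentions XGBoost, and uses an in-memory SQLite table as the "external database". The table, column names and numbers are all made up for illustration.

```python
import sqlite3


def fetch_raw_data(conn: sqlite3.Connection) -> list[tuple[float, float]]:
    """Step 1: pull raw rows from a database (here an in-memory SQLite table)."""
    return conn.execute("SELECT distance_km, fare FROM rides").fetchall()


def process(rows):
    """Step 2: split rows into a feature column and a target column."""
    features = [distance for distance, _ in rows]
    target = [fare for _, fare in rows]
    return features, target


def train(features, target):
    """Step 3: fit fare = slope * distance + intercept by ordinary least squares."""
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(target) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(features, target))
    var = sum((x - mean_x) ** 2 for x in features)
    slope = cov / var
    return slope, mean_y - slope * mean_x


# Made-up data lying exactly on fare = 2 * distance_km + 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rides (distance_km REAL, fare REAL)")
conn.executemany("INSERT INTO rides VALUES (?, ?)", [(1, 5), (2, 7), (3, 9), (4, 11)])

slope, intercept = train(*process(fetch_raw_data(conn)))
print(f"fare ~ {slope:.2f} * distance_km + {intercept:.2f}")
```

Keeping each step in its own function makes it easy to change the data source or the features without touching the training code.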