The Real-World ML guy | Learn to build real-world ML apps at https://t.co/xWr8Hm8zI5
Dec 16 • 4 tweets • 2 min read
ML Project Idea 💡
Let's predict air quality ↓
Here is a full example, with source code, to learn how to build a complete ML app that predicts air quality in different European cities.
Are you a data scientist using CSV files to store your data?
What if I told you there is a better way?
Can you imagine a
-> lighter 📦
-> faster 🏎️
-> cheaper 💸
file format to save your datasets?
Read this thread so you don't need to imagine anymore 👇🏾
Do not get me wrong. I love CSVs.
You can open them with any text editor, inspect them and share them with others.
They have become the standard file format for datasets in the AI/ML community.
However, they have a little problem...
Dec 11 • 9 tweets • 3 min read
3 years ago I struggled to build ML products.
Then I discovered this ↓
Unless you are a researcher in academia, and your goal is to publish a paper, you cannot just focus on the ML model you wanna train.
You need to think further down the line and think of the business problem you are trying to solve.
This is the "product-first" mindset.
Oct 22 • 15 tweets • 4 min read
Let's design an ML system to predict crypto prices, step-by-step ↓🧵
The problem
We want to build an API that serves short-term crypto price predictions in real time.
For example, to predict the price of Ethereum (ETH) in the next 10 seconds.
Oct 20 • 7 tweets • 2 min read
One skill every ML engineer has to master ↓
ML System design
Yes. And do you know why?
Because good ML system design hasn't changed at all in the last 5 years.
And it won't.
Oct 13 • 15 tweets • 3 min read
Every aspiring data scientist I talk to is overwhelmed by the colossal amount of online courses to choose from 🤯
My solution to this problem ↓
Learning is about connecting the dots.
However, it feels like there are too many dots to connect when learning data science.
Too many courses...
Too many blog posts...
Too many technologies...
Solution: You need to change the way you learn.
Sep 22 • 8 tweets • 2 min read
Time series are used everywhere
-> At Uber to optimize fleet efficiency
-> At Amazon to forecast inventory levels
-> At hedge funds to project asset prices.
Still, there is a lack of ML engineers who can build real-world time-series products.
So here is your chance ↓
Let's build a *complete* ML service that forecasts taxi rides in NYC, similar to what Uber does to forecast demand.
The 3 ingredients we need are:
- a dataset
- a Python library to build a good predictive model
- a deployment strategy
For example ↓
Sep 18 • 10 tweets • 3 min read
3 years ago I struggled to land my first freelance ML engineering contract.
Then I discovered this ↓
Building one professional real-world ML project is the best way to stand out from the crowd, and land an ML job.
Here is what I did, step-by-step 👩‍💻👨🏽‍💻↓
Sep 18 • 12 tweets • 3 min read
Let's build an AI Coding assistant with Llama3 ↓🧵🦙
Step 1. Download llama3 with Ollama 🦙
Ollama is an open-source tool for running Large Language Models locally, and it is free to download.
Aug 1 • 15 tweets • 4 min read
Let's build a real-time ML system to predict short-term prices
↓↓↓🧵
The problem
We want to build an API that serves short-term crypto price predictions in real time.
For example, to predict the price of Ethereum (ETH) in the next 10 seconds.
Jul 3 • 11 tweets • 3 min read
Most of my ML model prototypes never reached production 😵‍💫
Until I changed my mindset 🧠↓↓↓
Model-first mindset
A model-first mindset is what Kaggle competitions and most online courses are about.
Your ONLY focus is to build the best possible mapping between a set of input features and a target metric.
And in real-world ML this is often not the best approach...
Jul 2 • 13 tweets • 4 min read
How do you build
> real-time ML systems ⚡
> at scale
> without burning cash 💸?
Let's say you work as an ML engineer at a fintech startup, whose flagship product is a mobile app for online payments.
A critical problem you need to tackle from day 0 is the automatic detection of fraudulent transactions.
Jul 1 • 10 tweets • 4 min read
ML Project Idea 💡
Let's predict taxi demand in NYC in the next 60 minutes 🚕↓
Business problem 💼
Let's create a predictive model to forecast the number of taxi rides that will happen in Manhattan (New York City)
- in the next hour
- for each taxi zone (e.g. Zone 113 "Lower Manhattan")
Let's do it in 6 steps ↓
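To make the target concrete, here is a minimal sketch of how raw rides could be turned into "rides per hour per zone". The ride records, the zone ids, and the `hourly_counts` helper are all made up for illustration; they are not the thread's actual code or data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw rides: (pickup timestamp, pickup zone id). Not real data.
RAW_RIDES = [
    ("2024-01-01 08:05:00", 113),
    ("2024-01-01 08:40:00", 113),
    ("2024-01-01 08:59:59", 79),
    ("2024-01-01 09:10:00", 113),
]


def hourly_counts(rides):
    """Count rides per (hour, zone) by truncating each pickup time to the hour."""
    counts = Counter()
    for timestamp, zone in rides:
        pickup = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        hour = pickup.replace(minute=0, second=0)
        counts[(hour, zone)] += 1
    return counts


counts = hourly_counts(RAW_RIDES)
for (hour, zone), n_rides in sorted(counts.items()):
    print(hour.isoformat(), zone, n_rides)
```

Each (hour, zone) count is one value of the quantity the model would learn to forecast one hour ahead.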
Jun 25 • 6 tweets • 2 min read
ML Project Idea 💡
Let's predict flight delays 🛬 ↓
Here is a full example, with source code, to learn how to build a complete ML app that predicts flight delays for Stockholm Arlanda airport.
Let's build an LLM agent in Python, step-by-step ↓🧵
Why agents 🤔↓
Because Large Language Models alone are not enough to accurately solve complex tasks that require
-> External information that was not present in the training dataset used to fit the LLM parameters
or
-> Many reasoning steps
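The two needs above can be sketched as a loop: the model either asks for a tool call (external information) or returns a final answer, and the loop repeats (multiple steps). Everything here is hypothetical: `fake_llm` is a scripted stand-in for a real model call, and `lookup_price` returns a made-up value.

```python
def fake_llm(question, observations):
    """Scripted stand-in for a real LLM call (an assumption, not a model)."""
    if not observations:
        # No external information yet, so request a tool call.
        return {"action": "lookup_price", "input": "ETH"}
    return {"action": "final", "input": f"ETH costs {observations[-1]} USD"}


# Hypothetical tool registry; the price is a made-up number.
TOOLS = {"lookup_price": lambda symbol: {"ETH": 2000.0}[symbol]}


def run_agent(question, max_steps=5):
    """Alternate between model decisions and tool calls until a final answer."""
    observations = []
    for _ in range(max_steps):
        decision = fake_llm(question, observations)
        if decision["action"] == "final":
            return decision["input"]
        tool = TOOLS[decision["action"]]
        observations.append(tool(decision["input"]))
    return None  # no final answer within max_steps


answer = run_agent("What is the price of ETH?")
print(answer)
```

In a real agent the scripted function would be replaced by an actual model call, and `max_steps` guards against loops that never reach a final answer.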
Jun 12 • 6 tweets • 2 min read
ML Project Idea 💡
Let's predict flight delays 🛬 ↓
Here is a full example, with source code, to learn how to build a complete ML app that predicts flight delays for Stockholm Arlanda airport.
Here is a project you can build 👩🏽‍💻👨‍💻↓↓↓
Reading blog posts about multi-billion-parameter Language Models is very cool.
However, building real-world NLP products from these models is where the real business value is. And this is what companies look for in the job market.
So, here is a PRO project you can build ↓
Jun 7 • 11 tweets • 3 min read
Tired of training lots of Machine Learning models, and not getting better results? 😵‍💫
This is how you solve this 🧠↓
A Machine Learning model is the output of a 3-step workflow where you:
1 -> Fetch raw data, for example from an external database.
2 -> Process the data into a tabular format, so you have N features and 1 target.
3 -> Train ML models (e.g. XGBoost) and tune hyper-parameters.
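The 3 steps above can be sketched in plain Python. This is a toy illustration under stated assumptions: `fetch_raw_data` returns a made-up series instead of querying a database, and a simple "average of the lags plus a learned bias" baseline stands in for XGBoost.

```python
def fetch_raw_data():
    """Step 1: stand-in for a database query; returns a made-up hourly series."""
    return [10, 12, 13, 15, 14, 16, 18, 17]


def to_tabular(series, n_features=3):
    """Step 2: slide a window so each row has n_features lags and 1 target."""
    return [
        (series[i - n_features:i], series[i])
        for i in range(n_features, len(series))
    ]


def train_baseline(rows):
    """Step 3 (placeholder for XGBoost): learn a bias on top of the lag average."""
    residuals = [target - sum(lags) / len(lags) for lags, target in rows]
    bias = sum(residuals) / len(residuals)

    def predict(lags):
        return sum(lags) / len(lags) + bias

    return predict, bias


rows = to_tabular(fetch_raw_data())
model, bias = train_baseline(rows)

# In-sample error only; a real project would evaluate on held-out data.
mae = sum(abs(model(lags) - target) for lags, target in rows) / len(rows)
print(f"rows={len(rows)} bias={bias:.2f} mae={mae:.2f}")
```

Swapping the baseline for a stronger model changes only step 3; the quality of steps 1 and 2 still determines which features the model gets to see.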