We are going back to the basics to simplify ML algorithms.
... today's turn is Multiple Linear Regression!
In MLR, imagine you're baking.
You've got different ingredients or variables.
You need the perfect recipe (model) for your cake (prediction).
Each ingredient's quantity (coefficient) affects the taste (outcome).
1️⃣ DATA GATHERING PHASE
We're using height and weight - a classic duo often assumed to have a linear relationship.
But assumptions in data science? No way!
Let's find out:
- Do height and weight really share a linear bond?
2️⃣ DATA EXPLORATION TIME!
Before we get our hands dirty with modeling, let's take a closer look at our data.
Remember, the essence of a great model lies in truly understanding your data first.
However... what about Gender?
GENDER'S ROLE
Let's start with the basics: when we plot height against weight, we see a linear pattern emerge.
However... when we consider gender...
It turns out that gender significantly affects the weight for a given height.
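To make this concrete, here's a minimal sketch of that exploration step. The real dataset behind this thread isn't shown here, so the snippet builds a small synthetic height/weight/gender table purely for illustration:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Small synthetic dataset: Height in cm, Weight in kg, Gender as a label.
rng = np.random.default_rng(0)
n = 200
gender = rng.choice(["Male", "Female"], size=n)
is_female = (gender == "Female").astype(float)
height = rng.normal(178, 7, n) - 13 * is_female               # women shorter on average
weight = 0.9 * height - 90 - 8 * is_female + rng.normal(0, 5, n)
df = pd.DataFrame({"Height": height, "Weight": weight, "Gender": gender})

# Scatter Height vs Weight, coloured by gender.
for g, group in df.groupby("Gender"):
    plt.scatter(group["Height"], group["Weight"], label=g, alpha=0.5)
plt.xlabel("Height (cm)")
plt.ylabel("Weight (kg)")
plt.legend()
plt.show()
```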
3️⃣ BEYOND HEIGHT
Splitting our data by gender, we can fit two SINGLE linear regressions.
The slopes of these lines are almost identical, which indicates a similar behavior.
But what about the intercepts?
They tell us that the two groups start from different baselines.
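A quick sketch of that per-gender split, reusing the synthetic df from the exploration snippet above (so the exact numbers are illustrative, not the thread's real results):

```python
from sklearn.linear_model import LinearRegression

# Fit one SINGLE linear regression per gender and compare the fitted lines.
for g, group in df.groupby("Gender"):
    model = LinearRegression().fit(group[["Height"]], group["Weight"])
    print(f"{g}: slope = {model.coef_[0]:.2f}, intercept = {model.intercept_:.2f}")

# Typically the slopes come out close while the intercepts differ:
# the two groups start from different baselines.
```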
4️⃣ MULTI-VARIABLE
We can add multiple variables to perform a MULTIPLE Linear Regression.
The core theory is the same: We still use a linear function to predict our target.
But now we can include N independent variables.
So we can consider both Height and Gender → N=2.
5️⃣ TYPES OF VARIABLES
MLR accepts both numerical and categorical variables.
HEIGHT is a numerical variable - a quantity we can measure.
GENDER is a categorical variable - it splits our data into groups.
To include categories in our model, they have to be encoded as binary variables.
So say hello to dummy variables!
We can easily convert our gender variable into a boolean one with 1 and 0.
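As a sketch (column names come from the synthetic example above, not from the thread's original code), the dummy encoding is a one-liner:

```python
# Encode Gender as a 0/1 dummy; which category gets the 1 is an arbitrary choice.
df["Gender_male"] = (df["Gender"] == "Male").astype(int)

# pandas can also do it for you:
# df = pd.get_dummies(df, columns=["Gender"], drop_first=True)
```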
6️⃣ THE EQUATION
Our regression equation is like a secret recipe.
It tells us how much of each ingredient (variables) we need.
Each unit increase in height increases the predicted weight by a fixed amount.
But gender affects this relationship too.
So we need to compute the weights!
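Written out (the coefficient names b0, b1, b2 are just placeholders), the model has the form:

Weight = b0 + b1 * Height + b2 * Gender

b1 is the extra weight per unit of height, b2 is the shift between the two gender baselines, and b0 is the intercept.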
7️⃣ FINAL RESULTS
We can use scikit-learn to implement this MLR.
The code is quite straightforward, and we can easily obtain all three weights: the intercept plus the two coefficients.
We get a single equation for both cases.
When considering that gender is either 0 or 1, we obtain two equations.
And they are quite similar to the ones we obtained in the beginning.
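Here's a minimal scikit-learn sketch, continuing the synthetic example and the Gender_male dummy from earlier (it illustrates the workflow, not the thread's exact numbers):

```python
from sklearn.linear_model import LinearRegression

X = df[["Height", "Gender_male"]]   # two independent variables
y = df["Weight"]                    # target

mlr = LinearRegression().fit(X, y)

b0 = mlr.intercept_        # baseline
b1, b2 = mlr.coef_         # height and gender weights
print(f"Weight = {b0:.2f} + {b1:.2f}*Height + {b2:.2f}*Gender_male")

# Plugging in Gender_male = 0 and 1 recovers the two per-group equations:
print(f"Female: Weight = {b0:.2f} + {b1:.2f}*Height")
print(f"Male:   Weight = {b0 + b2:.2f} + {b1:.2f}*Height")
```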
So this is all for now on Linear Regression.
Next week I'll write about Logistic Regression!
So you'd better stay tuned!
Did you like this thread?
Then join my freshly started DataBites newsletter to get all my content straight to your inbox every Sunday!