David Andrés 🤖📈🐍
Jan 7, 9 tweets, 2 min read
Do you want to forecast seasonal time series data?

Remove the seasonality and add it back at the end! That's basically what the STL method does.
STL stands for “Seasonal and Trend decomposition using LOESS”. It is a versatile and robust method for decomposing time series.

It uses LOESS (Locally Estimated Scatterplot Smoothing) rather than the moving averages of classical decomposition to estimate the trend and seasonal components.
1️⃣ Decompose the Time Series:

Use STL to split the time series into three components:
• trend
• seasonal component
• residual component
2️⃣ Deseasonalize:

Subtract the seasonal component from the original series to get a deseasonalized version.

This leaves the trend plus the residuals. The residuals are not just noise: they are the random or irregular fluctuations that neither the trend nor the seasonality captures.
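In code, deseasonalizing is a single element-wise subtraction. A tiny sketch in plain Python, where both lists are hypothetical numbers standing in for an observed series and the seasonal component returned by STL:

```python
# Hypothetical observed values and their estimated seasonal component (period of 4).
observed = [112, 118, 132, 129, 121, 135, 148, 148]
seasonal = [-6, -2, 6, 2, -6, -2, 6, 2]

# Deseasonalized series = observed - seasonal = trend + residual.
deseasonalized = [y - s for y, s in zip(observed, seasonal)]
print(deseasonalized)  # [118, 120, 126, 127, 127, 137, 142, 146]
```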
3️⃣ Forecast Deseasonalized data:

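Use a non-seasonal method such as ARIMA or exponential smoothing to forecast the deseasonalized data, that is, the trend plus the residuals. Note that Simple Exponential Smoothing produces a flat forecast, so a series with a clear trend is better served by ARIMA with differencing or Holt's linear trend method.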
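To make the non-seasonal step concrete, here is a hand-written sketch of Simple Exponential Smoothing in plain Python. The function name, the choice to initialize the level with the first observation, and the input numbers are assumptions for illustration, not a library API:

```python
def simple_exponential_smoothing(series, alpha, horizon):
    """Return a flat `horizon`-step forecast from simple exponential smoothing."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    # Initialize the level with the first observation (one common choice).
    level = series[0]
    for y in series[1:]:
        # New level = weighted average of the latest value and the old level.
        level = alpha * y + (1 - alpha) * level
    # SES has no trend term, so every future step gets the final level.
    return [level] * horizon


# Hypothetical deseasonalized values.
deseasonalized = [118, 120, 126, 127, 127, 137, 142, 146]
forecast = simple_exponential_smoothing(deseasonalized, alpha=0.5, horizon=4)
```

A larger alpha makes the forecast track recent observations more closely. With alpha=1 it reduces to repeating the last value.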
4️⃣ Forecast Seasonality:

The seasonal component is forecast with a seasonal naive approach: repeat the last observed seasonal cycle. For monthly data, that means reusing last year's seasonal pattern.
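Repeating the last season takes only a few lines. A sketch in plain Python, where `repeat_last_season` is a hypothetical helper and the seasonal values are made up:

```python
def repeat_last_season(seasonal, period, horizon):
    """Forecast the seasonal component by cycling through its last full period."""
    if len(seasonal) < period:
        raise ValueError("need at least one full seasonal cycle")
    last_cycle = seasonal[-period:]
    return [last_cycle[h % period] for h in range(horizon)]


# Two hypothetical cycles of a period-4 seasonal component.
seasonal = [1, 2, 3, 4, 5, 6, 7, 8]
print(repeat_last_season(seasonal, period=4, horizon=6))  # [5, 6, 7, 8, 5, 6]
```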
5️⃣ Reapply Seasonal Component:

Finally, add the forecast seasonal component back to the deseasonalized forecast, restoring it to the original scale.

That's how you can forecast your time series data using STL.
You can learn more about this in the latest issue of MLPills💊

Read it here 👇
open.techwriters.info/mlpills/issue-…
