David Andrés 🤖📈🐍
Jan 5, 2023 · 10 tweets · 3 min read
Be careful how you forecast the future in Time Series! 🤔

There are two main ways:

1️⃣ The traditional way or multi-step forecast

2️⃣ Rolling forecast

Let's see how they differ 👇

🧵 Thread 🧵
🔎 Let's consider the Apple stock price for this example.

⚠️ However, this is applicable to any Time Series data!
1️⃣ The traditional way or multi-step forecast

This consists of training the model once, with all the available data.

Then we forecast several days into the future.
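A minimal Python sketch of the idea: fit once, then predict several steps ahead without retraining. The "drift" model (last price plus the average daily change) is only a stand-in chosen for illustration, not the model used in this thread, and the prices are made up.

```python
from statistics import mean


def fit_drift(prices):
    """Fit a naive drift model: last value plus the average daily change."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    return {"last": prices[-1], "drift": mean(changes)}


def forecast(model, horizon):
    """Forecast `horizon` steps ahead from a model that was fitted once."""
    return [model["last"] + model["drift"] * step for step in range(1, horizon + 1)]


# Made-up prices, for illustration only (not real Apple data).
train = [100.0, 102.0, 101.0, 104.0, 106.0]

model = fit_drift(train)          # trained once, with all the available data
predictions = forecast(model, 3)  # several days ahead, no retraining
print(predictions)
```

Every predicted day reuses the same fitted model, so nothing that happens after the last training day can influence the forecast.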
Imagine today is 30/11/2021.

We split the data in two:
- Training: prices up to and including "today"
- Testing: prices after "today"

The Training set is used to train the model.

The Testing set is used to evaluate the results: it holds the actual prices that we ideally want to predict.
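The split can be sketched in plain Python like this. The (date, price) pairs are made-up placeholders, and the sketch assumes that "today" itself belongs to the training set.

```python
from datetime import date

# Made-up (date, price) pairs, for illustration only (not real Apple data).
prices = [
    (date(2021, 11, 26), 100.0),
    (date(2021, 11, 29), 101.0),
    (date(2021, 11, 30), 102.0),
    (date(2021, 12, 1), 103.0),
    (date(2021, 12, 2), 104.0),
    (date(2021, 12, 3), 105.0),
]

today = date(2021, 11, 30)

# Assumption: "today" is the last day the model is allowed to see.
train = [(day, price) for day, price in prices if day <= today]
test = [(day, price) for day, price in prices if day > today]

print(len(train), "training rows,", len(test), "testing rows")
```

The split is by date, never random: shuffling would leak future prices into the training set.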
This traditional forecasting method will usually predict the first period reasonably well, and maybe the following one too.

However, the further ahead you forecast, the more the error tends to grow, and the results can become really poor! 😔
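One way to check this on your own data is to look at the absolute error for each step of the forecast horizon. The numbers below are hypothetical and were picked only to show what a growing error looks like; they are not results from the Apple example.

```python
def absolute_errors(predictions, actuals):
    """Absolute error for each step of the forecast horizon."""
    return [abs(predicted - actual) for predicted, actual in zip(predictions, actuals)]


# Hypothetical values, for illustration only.
predictions = [107.5, 109.0, 110.5, 112.0]  # multi-step forecast, made once
actuals = [107.0, 108.0, 105.0, 101.0]      # what actually happened

errors = absolute_errors(predictions, actuals)
print(errors)
```

If the errors climb steadily with the step number, the model is being asked to look further ahead than it can reliably see.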
⚠️ PROBLEM: If you do that, you are missing important data → the evolution of the price from the day the model was trained until the current day.
🤔 SOLUTION: Retrain the model every day with all the available data → This is called "Rolling forecast".
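A minimal sketch of that loop in Python, assuming we forecast only one day ahead each time. The drift model (last price plus the average daily change) is again just a stand-in for whatever model you use, and the prices are made up.

```python
from statistics import mean


def fit_drift(prices):
    """Stand-in model: returns the last price and the average daily change."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    return prices[-1], mean(changes)


def rolling_forecast(train, test):
    """Retrain every day with all the data available so far, then predict one day ahead."""
    history = list(train)
    predictions = []
    for actual in test:
        last, drift = fit_drift(history)  # retrain with everything known so far
        predictions.append(last + drift)  # forecast only the next day
        history.append(actual)            # the day passes and its price becomes known
    return predictions


# Made-up prices, for illustration only (not real Apple data).
train = [100.0, 102.0, 101.0, 104.0, 106.0]
test = [105.0, 103.0, 104.0]

predictions = rolling_forecast(train, test)
print(predictions)
```

Each prediction is made before the actual price for that day is added to the history, so the model never sees the value it is trying to predict.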
Tomorrow we'll discuss the Rolling forecast!

In the meantime, you can check my previous thread about it 👇
Please 🔁Retweet the FIRST tweet of the thread if you found it useful!

🔔 Follow me @daansan_ml if you are interested in:

🐍 #Python
📊 #DataScience
📈 #TimeSeries
🤖 #MachineLearning

Thanks! 😉
