Yesterday we discussed the first way of forecasting with your Time Series model:
1️⃣ The traditional way or multi-step forecast
Today it's time for the second (and better) way:
2️⃣ Rolling forecast
As mentioned yesterday, this consists of retraining the model every day on all the data available up to that day.
Then we forecast for tomorrow.
Let's continue with yesterday's example.
The data we used was the Apple stock price.
We assumed that today was 30/11/2021.
We split the data in two:
- Training: prices until "today"
- Testing: prices from "today" onwards
- Training set to train the model.
- Testing set to evaluate the results, as this is the Actual price to match.
The rolling forecast considers all the available data at every step, which will significantly improve the predictions! 🤯
NOTE: neither this data nor this model is the best choice, so the model seems to more or less replicate the previous day's price. Fixing that is not the purpose of this thread, so we will not focus on it.
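Here is a minimal Python sketch of the rolling procedure described above. Assumptions, stated loudly: the model isn't named in this thread, so a tiny AR(1) least-squares fit (`fit_ar1`) stands in for it, and the prices are a made-up random walk, not the real Apple data. All function names are hypothetical.

```python
import random


def fit_ar1(series):
    """Least-squares fit of y[t] = a + b * y[t-1] (stand-in for the real model)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = cov_xy / var_x if var_x else 0.0
    a = mean_y - b * mean_x
    return a, b


def rolling_forecast(train, test):
    """Retrain every day on all data known so far, then forecast one day ahead."""
    history = list(train)
    predictions = []
    for actual in test:
        a, b = fit_ar1(history)                  # retrain on everything up to "today"
        predictions.append(a + b * history[-1])  # forecast for "tomorrow"
        history.append(actual)                   # tomorrow's real price becomes known
    return predictions


def multi_step_forecast(train, horizon):
    """Traditional way: train once, then feed each forecast back into the model."""
    a, b = fit_ar1(train)
    predictions, last = [], train[-1]
    for _ in range(horizon):
        last = a + b * last
        predictions.append(last)
    return predictions


def mean_abs_error(actual, predicted):
    return sum(abs(p - t) for p, t in zip(predicted, actual)) / len(actual)


# Synthetic random-walk "prices" (made up for illustration, NOT Apple data).
random.seed(0)
prices = [150.0]
for _ in range(119):
    prices.append(prices[-1] + random.gauss(0, 1.5))

split = 90  # index playing the role of "today"
train, test = prices[:split], prices[split:]

rolling = rolling_forecast(train, test)
multi = multi_step_forecast(train, len(test))
print("rolling MAE:   ", round(mean_abs_error(test, rolling), 3))
print("multi-step MAE:", round(mean_abs_error(test, multi), 3))
```

The only difference between the two functions is what gets fed back in: the rolling version appends the actual price each day, while the multi-step version only ever sees its own forecasts.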
▶️ TL;DR
The rolling forecasting method is a much better way of evaluating your Time Series model.
The traditional method performs poorly as it does not consider all the available data.
Check yesterday's thread about the Traditional forecasting method 👇
Multi Query, an Advanced Retrieval Strategy for RAG, clearly explained 👇
Multi Query is a powerful Query Translation technique to enhance information retrieval in AI systems.
It involves generating multiple variations of an original query to improve the chances of finding relevant information.
How it works:
Instead of relying on a single query, Multi Query uses language models to create several rephrased versions of the original question. Each version captures different aspects or interpretations of the user's intent.
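A rough Python sketch of the idea, under stated assumptions: `toy_rewriter` is a hard-coded stand-in for the language model that would rephrase the question, `keyword_retrieve` is a toy word-overlap retriever, and the three documents are invented. None of these names come from a real library.

```python
def toy_rewriter(question):
    """Stand-in for the LLM: returns the question plus simple synonym variants."""
    synonyms = {"car": "automobile", "fix": "repair"}  # hypothetical, hard-coded
    words = question.split()
    variants = [question]
    for word, alt in synonyms.items():
        if word in words:
            variants.append(" ".join(alt if w == word else w for w in words))
    return variants


def keyword_retrieve(query, docs, k=2):
    """Toy retriever: rank documents by number of words shared with the query."""
    q_words = set(query.lower().split())
    scored = []
    for doc_id, text in docs.items():
        overlap = len(q_words & set(text.lower().split()))
        if overlap > 0:
            scored.append((overlap, doc_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


def multi_query_retrieve(question, docs, rewrite, retrieve):
    """Run retrieval once per query variant and merge results without duplicates."""
    seen, merged = set(), []
    for variant in rewrite(question):
        for doc_id in retrieve(variant, docs):
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


# Invented mini corpus for illustration.
DOCS = {
    "d1": "automobile repair manual",
    "d2": "fix a flat bicycle tire",
    "d3": "baking bread at home",
}

single = keyword_retrieve("fix car", DOCS)
multi = multi_query_retrieve("fix car", DOCS, toy_rewriter, keyword_retrieve)
print("single query:", single)
print("multi query: ", multi)
```

In this toy corpus the original wording shares no words with the automobile manual, so only a rephrased variant can surface it. That is the gap Multi Query is meant to close.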
DBSCAN, or Density-Based Spatial Clustering of Applications with Noise, is a powerful clustering algorithm.
It finds clusters of varying shapes and sizes while handling noise and outliers.
What is it?
DBSCAN is an unsupervised learning algorithm that groups together closely packed points and marks points in low-density regions as outliers.
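To make "closely packed" concrete, here is a small pure-Python sketch of the algorithm. It assumes the two standard DBSCAN parameters, which this thread does not name: `eps` (neighborhood radius) and `min_samples` (neighbors needed for a point to count as dense, including itself). The sample points are invented, and labeling noise as -1 is just a common convention.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE          # low-density region (may become a border point)
            continue
        cluster += 1                   # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also core: keep growing
    return labels


# Two tight squares of points plus one far-away outlier (made-up data).
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (50, 50),
]
labels = dbscan(points, eps=1.5, min_samples=3)
print(labels)
```

This brute-force version compares every pair of points, which is fine for a demo. Real implementations typically use a spatial index for the neighbor search.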
Linear Regression is a statistical method for predicting the value of a continuous dependent variable based on one or more independent variables. It estimates the relationship using a linear equation.
How it works:
• Take input features
• Calculate a weighted sum plus a bias term
• Use the equation y = β₀ + β₁x₁ + β₂x₂ + ... + βₙxₙ
• Minimize the error (usually Mean Squared Error)
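The steps above can be sketched in a few lines of Python. One assumption to flag: the thread only says "minimize the error", so plain gradient descent on the Mean Squared Error is used here as one possible way to do it. The learning rate, epoch count, and toy data are all made up for illustration.

```python
def predict(x, weights, bias):
    """Weighted sum of the features plus the bias term (the linear equation)."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def fit(X, y, lr=0.05, epochs=2000):
    """Gradient descent on Mean Squared Error (lr and epochs are assumed values)."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        errors = [predict(x, weights, bias) - target for x, target in zip(X, y)]
        grad_b = 2 * sum(errors) / n
        grad_w = [2 * sum(e * x[j] for e, x in zip(errors, X)) / n for j in range(d)]
        bias -= lr * grad_b
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
    return weights, bias


# Toy data generated from y = 1 + 2*x1 + 3*x2, with no noise.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 2]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]

weights, bias = fit(X, y)
print("weights:", [round(w, 3) for w in weights])
print("bias:   ", round(bias, 3))
```

Because the toy targets are noise-free, the fitted weights and bias should land very close to the 2, 3 and 1 used to generate them.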
Retrieval Augmented Generation (RAG) for LLM systems clearly explained 👇
RAG helps bridge the gap between large language models and external data sources, allowing AI systems to generate relevant and informed responses by leveraging knowledge from existing documents and databases.
It involves a five-step process 👇
1️⃣ Data Collection
The first step is gathering all the data needed for the application - user manuals, databases, FAQs, etc. For a customer support chatbot, this could include product documentation, troubleshooting guides, and common inquiries.
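As a tiny illustration of this first step, the sketch below gathers text files from a folder into a list of records. The folder layout, the file extensions, and the `collect_documents` helper are all assumptions for the example. Real sources such as databases or ticketing systems would each need their own loader.

```python
from pathlib import Path


def collect_documents(root, extensions=(".txt", ".md")):
    """Read every non-empty text file under root into a {source, text} record."""
    docs = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in extensions:
            text = path.read_text(encoding="utf-8").strip()
            if text:  # skip empty files
                docs.append({"source": str(path.relative_to(root)), "text": text})
    return docs
```

Keeping the `source` next to each text makes it possible to trace a generated answer back to the document it came from.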
Support Vector Machine is a useful Machine Learning algorithm frequently used for both classification and regression problems.
⭐ This is a supervised learning algorithm.
Basically, it needs labels or targets to learn!
Its goal is to find a boundary that maximally separates the data into different classes (classification) or fits the data with a line/plane (regression).
SVMs excel at handling intricate datasets where finding the right boundary seems challenging.
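For the classification case, here is a minimal sketch in pure Python. It is a simplified stand-in, not a full SVM: a linear soft-margin classifier trained with subgradient descent on the hinge loss, with no kernels. The hyperparameters (`lam`, `lr`, `epochs`) and the six labeled points are assumptions chosen for the demo.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on lam/2 * ||w||^2 + mean hinge loss. Labels are +1/-1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # regularization keeps the margin wide
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:               # point is inside the margin or misclassified
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(x, w, b):
    """Which side of the boundary w.x + b = 0 the point falls on."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Made-up, linearly separable data: the labels are what make this supervised.
X = [(2, 2), (3, 3), (3, 2), (-2, -2), (-3, -3), (-2, -3)]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
print("predictions:", [predict(x, w, b) for x in X])
```

Only the points with a margin below 1 contribute to the update, which mirrors the idea that the boundary is determined by the points closest to it.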