Avi Chawla

@_avichawla

Jan 17 • 7 tweets • 2 min read

Traditional RAG vs. Agentic RAG, clearly explained (with visuals):

Traditional RAG has many issues:

- It retrieves once and generates once. If the context isn't enough, it cannot dynamically search for more info.

- It cannot reason through complex queries.

- The system can't modify its strategy based on the problem.

Agentic RAG attempts to solve this.

The following visual depicts how it differs from traditional RAG.

The core idea is to introduce agentic behaviors at each stage of RAG.

Steps 1-2) An agent rewrites the query (removing spelling mistakes, etc.)

Steps 3-8) An agent decides if it needs more context.

↳ If not, the rewritten query is sent to the LLM.
↳ If yes, an agent picks the best external source, fetches context from it, and passes that context to the LLM.

Step 9) We get a response.

Steps 10-12) An agent checks if the answer is relevant.

↳ If yes, return the response.
↳ If not, go back to Step 1.

This continues for a few iterations until we get a relevant response or the system admits it cannot answer the query.
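The loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation behind the diagram: every function name here (`agentic_rag`, `rewrite`, `needs_context`, `retrieve`, `generate`, `is_relevant`) and the `max_iters` cap are assumptions made for the example, and the "agents" are passed in as plain callables. In a real system each one would wrap an LLM call or a retriever, and `retrieve` would also pick the external source.

```python
from typing import Callable, Optional


def agentic_rag(
    query: str,
    rewrite: Callable[[str], str],
    needs_context: Callable[[str], bool],
    retrieve: Callable[[str], str],
    generate: Callable[[str, str], str],
    is_relevant: Callable[[str, str], bool],
    max_iters: int = 3,
) -> Optional[str]:
    """Run the rewrite -> (retrieve) -> generate -> check loop.

    Returns a relevant answer, or None if none is found within max_iters.
    """
    current = query
    for _ in range(max_iters):
        # Steps 1-2: an agent rewrites the query.
        rewritten = rewrite(current)
        # Steps 3-8: an agent decides whether more context is needed.
        context = retrieve(rewritten) if needs_context(rewritten) else ""
        # Step 9: the LLM produces a response.
        answer = generate(rewritten, context)
        # Steps 10-12: an agent checks the answer against the original query.
        if is_relevant(query, answer):
            return answer
        # Not relevant: go back to Step 1 with the rewritten query.
        current = rewritten
    # The system admits it cannot answer the query.
    return None


# Toy stand-ins for the agents (hypothetical, rule-based, no LLM involved).
DOCS = {"rag": "RAG retrieves documents and feeds them to an LLM."}


def toy_rewrite(q: str) -> str:
    return q.strip().lower().replace("wat", "what")


def toy_needs_context(q: str) -> bool:
    return "rag" in q


def toy_retrieve(q: str) -> str:
    return DOCS["rag"] if "rag" in q else ""


def toy_generate(q: str, context: str) -> str:
    return context or "I don't know."


def toy_is_relevant(q: str, answer: str) -> bool:
    return answer != "I don't know."


if __name__ == "__main__":
    print(agentic_rag("  Wat is RAG? ", toy_rewrite, toy_needs_context,
                      toy_retrieve, toy_generate, toy_is_relevant))
```

The iteration cap is what lets the system stop and report that it cannot answer, instead of looping forever on a query it has no context for.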

This makes RAG more robust since agents check that the outcome of each stage is aligned with the goal.

That said, the diagram shows just one of many possible blueprints for an agentic RAG system.

You can adapt it according to your specific use case.

That's a wrap!

If you enjoyed this tutorial:

Find me → @_avichawla

Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.

More from @_avichawla

Avi Chawla

@_avichawla

Jan 11

Bayes' Theorem, clearly explained:

Bayes' Theorem is a cornerstone of probability theory!

It calculates the probability of an event, given that another event has occurred.

It's like updating your guess with fresh information!

Before we delve into the details, let's take a quick look at its formula:
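For reference, the standard statement of Bayes' Theorem for two events A and B, with P(B) > 0, is given below. The symbols A and B are generic placeholders, not notation taken from the thread.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```

Here P(A) is the prior, P(B | A) is the likelihood of the new evidence, and P(A | B) is the updated (posterior) probability.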

Imagine you're trying to guess if it will rain today.

You start with a general belief based on the weather forecast (say, a 40% chance of rain).

This is your 'prior' probability:


Avi Chawla

@_avichawla

Jan 7

15 ways to optimize neural network training, clearly explained:

Before we dive in, this visual explains what we are discussing today.

Let's understand them now.

Basic ones:

1) Use efficient optimizers—AdamW, Adam, etc.

2) Utilize hardware accelerators (GPUs/TPUs).

3) Max out the batch size.

4) Use multi-GPU training through Model/Data/Pipeline/Tensor parallelism (check the visual).


Avi Chawla

@_avichawla

Jan 3

All-reduce and ring-reduce for multi-GPU training, clearly explained (with visuals):

Data parallelism:

• Replicates the model across all GPUs.

• Divides the data into smaller batches for every GPU.

• Computes the gradients on each GPU.

Since each GPU processes a different data chunk, the GPUs must be synchronized before the next iteration.
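To see why the synchronization step recovers the full-batch gradient, here is a small CPU-only sketch that simulates the GPUs with list slices. The one-parameter model y = w * x, the mean-squared-error loss, and the function names are assumptions chosen for illustration. It also assumes the data splits evenly across devices, which is what makes the average of the per-shard gradients match the full-batch gradient.

```python
from typing import List


def grad_mse(w: float, xs: List[float], ys: List[float]) -> float:
    """Gradient of mean((w*x - y)^2) with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def data_parallel_grad(w: float, xs: List[float], ys: List[float],
                       num_gpus: int) -> float:
    """Simulate data parallelism: one shard per 'GPU', then average."""
    assert len(xs) % num_gpus == 0, "sketch assumes equal-sized shards"
    shard = len(xs) // num_gpus
    # Each simulated GPU holds the same w and computes a local gradient.
    local_grads = [
        grad_mse(w, xs[i * shard:(i + 1) * shard], ys[i * shard:(i + 1) * shard])
        for i in range(num_gpus)
    ]
    # Synchronization: average the local gradients across devices.
    return sum(local_grads) / num_gpus


if __name__ == "__main__":
    xs, ys = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
    print(grad_mse(0.5, xs, ys), data_parallel_grad(0.5, xs, ys, num_gpus=2))
```

Both calls give the same value, so every replica can apply the same update and stay identical for the next iteration.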

Algorithm 1) All-reduce

The most obvious solution is to have every GPU send its gradients to all other GPUs, and then compute the averages on each device.

But this utilizes too much bandwidth.

Total elements transferred = G*(G-1)*N
↳ G = total GPUs
↳ N = gradient matrix size
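To get a feel for how quickly this grows, the formula can be evaluated directly. The function name and the example sizes below are made up for illustration.

```python
def naive_all_reduce_elements(num_gpus: int, grad_size: int) -> int:
    """Total elements transferred when every GPU sends its full
    gradient to every other GPU: G * (G - 1) * N."""
    return num_gpus * (num_gpus - 1) * grad_size


if __name__ == "__main__":
    n = 1_000_000  # hypothetical number of gradient elements
    for g in (2, 4, 8):
        print(g, naive_all_reduce_elements(g, n))
```

With 1,000,000 gradient elements, going from 2 to 8 GPUs raises the total from 2,000,000 to 56,000,000 elements per iteration, so the cost grows roughly with the square of the number of GPUs.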


Avi Chawla

@_avichawla

Dec 30, 2024

Active learning in ML, clearly explained (with visuals):

As the name suggests, the idea is to build the model with active human feedback on examples it is struggling with.

The visual below summarizes this:

Let’s get into the details.

Begin by manually labeling a tiny percentage of the dataset.

While there’s no rule on how much data should be labeled, I have used active learning (successfully) while labeling as little as ~1% of the dataset.
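One common way to find the "examples the model is struggling with" is least-confidence sampling: rank the unlabeled examples by the model's top predicted probability and send the lowest-ranked ones to a human. The thread does not say which selection rule it uses, so treat this as an assumed illustration. The function name and the probabilities are invented for the example.

```python
from typing import List, Sequence


def least_confident(probs: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return indices of the k examples whose highest class
    probability is lowest, i.e. where the model is least sure."""
    confidence = [max(p) for p in probs]
    # sorted() is stable, so ties keep their original order.
    order = sorted(range(len(probs)), key=lambda i: confidence[i])
    return order[:k]


if __name__ == "__main__":
    # Hypothetical predicted class probabilities for four unlabeled examples.
    predictions = [[0.9, 0.1], [0.55, 0.45], [0.6, 0.4], [0.99, 0.01]]
    print(least_confident(predictions, k=2))
```

The selected examples would then be labeled by a human, added to the training set, and the model retrained before the next round.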
