Avi Chawla
Jan 19 · 9 tweets · 3 min read
Let's build a multi-agent internet research assistant with OpenAI Swarm & Llama 3.2 (100% local):
Before we begin, here's what we're building!

The app takes a user query, searches the web for it, and turns it into a well-crafted article.

Tool stack:
- @ollama for running LLMs locally.
- @OpenAI Swarm for multi-agent orchestration.
- @Streamlit for the UI.
The architecture diagram below illustrates the key components (agents/tools) & how they interact with each other!

Let's implement it now!
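First, the plumbing. Swarm talks to any OpenAI-compatible endpoint, so we can point it at a local Ollama server. The thread shared its code as screenshots, so this is a minimal sketch of my own; the base URL is Ollama's default, and the `llama3.2` model tag assumes you've pulled it with `ollama pull llama3.2`.

```python
# Prerequisites (assumed):
#   pip install git+https://github.com/openai/swarm.git
#   pip install openai duckduckgo-search streamlit
#   ollama pull llama3.2
from openai import OpenAI
from swarm import Swarm

# Ollama exposes an OpenAI-compatible API at this default address;
# the api_key is required by the client but ignored by Ollama.
ollama_client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Swarm accepts any OpenAI-compatible client, so every agent below
# runs against Llama 3.2 served locally.
client = Swarm(client=ollama_client)
MODEL = "llama3.2"
```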
Agent 1: Web search and tool use

The web-search agent takes a user query and uses the DuckDuckGo search tool to fetch results from the internet.
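Here's a minimal sketch of what such an agent can look like with Swarm and the duckduckgo_search package. The agent name, instructions string, and result formatting are my assumptions, not the original screenshot:

```python
# Continuing from the setup sketch above (client, MODEL).
from duckduckgo_search import DDGS
from swarm import Agent

def search_web(query: str) -> str:
    """Search DuckDuckGo and return the top hits as plain text."""
    results = DDGS().text(query, max_results=5)
    # Flatten each hit into a "title (url): snippet" line for the LLM.
    return "\n\n".join(
        f"{r['title']} ({r['href']}): {r['body']}" for r in results
    )

web_search_agent = Agent(
    name="Web Search Agent",
    model=MODEL,
    instructions="Search the web for the user's query and return the raw results.",
    functions=[search_web],  # Swarm exposes this function to the model as a tool
)
```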
Agent 2: Research Analyst

The role of this agent is to analyze and curate the raw search results, making them ready for the content-writer agent.
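A sketch of the analyst agent (the instructions are hypothetical). It needs no tools; it only transforms the text it's handed:

```python
research_analyst_agent = Agent(
    name="Research Analyst Agent",
    model=MODEL,  # from the setup sketch
    instructions=(
        "You are a research analyst. Deduplicate the raw search results, "
        "extract the key facts and sources, and organize them into clean "
        "bullet points that a writer can work from."
    ),
)
```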
Agent 3: Technical Writer

The role of this agent is to turn the curated results into a polished, publication-ready article.
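And a sketch of the writer agent, again with instructions of my own invention:

```python
technical_writer_agent = Agent(
    name="Technical Writer Agent",
    model=MODEL,  # from the setup sketch
    instructions=(
        "You are a technical writer. Turn the curated research notes into "
        "a polished, publication-ready article with a title, introduction, "
        "well-structured sections, and a conclusion."
    ),
)
```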
Create a workflow

Now that we have all our agents and tools ready, it's time to put them together and create a workflow.

Here's how we do it:
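One straightforward way to wire this up is to run the agents sequentially, piping each agent's final message into the next with client.run(). (Swarm also supports handoffs where a tool returns another Agent; this simple pipeline is my assumption about what the screenshot showed.)

```python
def run_workflow(query: str) -> str:
    """Search the web, analyze the results, and write an article."""
    # 1) Web search: the agent decides to call the search_web tool.
    search = client.run(
        agent=web_search_agent,
        messages=[{"role": "user", "content": query}],
    )
    raw_results = search.messages[-1]["content"]

    # 2) Analysis: curate the raw results into clean research notes.
    analysis = client.run(
        agent=research_analyst_agent,
        messages=[{"role": "user", "content": raw_results}],
    )
    notes = analysis.messages[-1]["content"]

    # 3) Writing: turn the notes into the final article.
    article = client.run(
        agent=technical_writer_agent,
        messages=[{"role": "user", "content": notes}],
    )
    return article.messages[-1]["content"]
```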
The Chat interface

Finally, we create a Streamlit UI to provide a chat interface for our application.

Done!
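The UI code was also a screenshot; a minimal front end could look like the sketch below (widget labels are mine, and it assumes the setup, agents, and run_workflow from the previous sketches live in the same script):

```python
# app.py -- run with: streamlit run app.py
import streamlit as st

st.title("🔎 Multi-Agent Internet Research Assistant")

query = st.text_input("What should I research and write about?")

if st.button("Generate article") and query:
    with st.spinner("Searching, analyzing, and writing..."):
        article = run_workflow(query)
    st.markdown(article)
```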
That's a wrap!

If you enjoyed this tutorial:

Find me → @_avichawla

Every day, I share tutorials and insights on DS, ML, LLMs, and RAG.


