Avi Chawla
Jan 19
Let's build a multi-agent internet research assistant with OpenAI Swarm & Llama 3.2 (100% local):
Before we begin, here's what we're building!

The app takes a user query, searches the web for it, and turns the results into a well-crafted article.

Tool stack:
- @ollama for running LLMs locally.
- @OpenAI Swarm for multi-agent orchestration.
- @Streamlit for the UI.
The architecture diagram below illustrates the key components (agents/tools) & how they interact with each other!

Let's implement it now!
Agent 1: Web search and tool use

The web-search agent takes the user query and uses the DuckDuckGo search tool to fetch results from the internet.
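Here's a minimal sketch of what this agent could look like. It assumes the open-source `swarm` package, the `duckduckgo-search` library, and a local Ollama server exposing an OpenAI-compatible endpoint at port 11434; the function name `search_web` and the instructions string are illustrative, not the exact code from the thread.

```python
# Sketch: a Swarm agent with a DuckDuckGo search tool,
# backed by a local Llama 3.2 model served by Ollama.
from duckduckgo_search import DDGS
from openai import OpenAI
from swarm import Swarm, Agent

# Point Swarm's OpenAI-compatible client at the local Ollama server.
ollama_client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
client = Swarm(client=ollama_client)

def search_web(query: str) -> str:
    """Search DuckDuckGo and return the raw results as plain text."""
    results = DDGS().text(query, max_results=5)
    return "\n\n".join(f"{r['title']}\n{r['href']}\n{r['body']}" for r in results)

web_search_agent = Agent(
    name="Web Search Agent",
    model="llama3.2",
    instructions="Search the web for the user's query and return the raw results.",
    functions=[search_web],
)
```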
Agent 2: Research Analyst

The role of this agent is to analyze and curate the raw search results so they are ready for the technical-writer agent.
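A sketch of this agent, reusing the same Swarm/Ollama setup as above; the instructions string is illustrative rather than the exact prompt from the thread.

```python
# Sketch: the research-analyst agent (same Swarm/Ollama setup as above).
research_analyst_agent = Agent(
    name="Research Analyst Agent",
    model="llama3.2",
    instructions=(
        "You are a research analyst. Deduplicate, fact-check, and organize "
        "the raw search results into clean, structured notes for a writer."
    ),
)
```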
Agent 3: Technical Writer

The role of this agent is to turn the curated results into a polished, publication-ready article.
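And a matching sketch for the writer agent, again with an illustrative prompt:

```python
# Sketch: the technical-writer agent (instructions are illustrative).
technical_writer_agent = Agent(
    name="Technical Writer Agent",
    model="llama3.2",
    instructions=(
        "You are a technical writer. Turn the curated research notes into a "
        "polished, publication-ready article with a clear title and sections."
    ),
)
```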
Create a workflow

Now that we have all our agents and tools ready, it's time to put them together and create a workflow.

Here's how we do it:
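One way to wire the three agents together is a simple sequential hand-off, where each agent's last message becomes the next agent's input. The sketch below assumes the agents and the Swarm `client` defined above; `run_workflow` is a name introduced here for illustration.

```python
# Sketch: sequential workflow - search -> analyze -> write.
def run_workflow(query: str) -> str:
    """Run the three agents in sequence and return the final article."""
    raw_results = client.run(
        agent=web_search_agent,
        messages=[{"role": "user", "content": f"Search the web for: {query}"}],
    ).messages[-1]["content"]

    curated_notes = client.run(
        agent=research_analyst_agent,
        messages=[{"role": "user", "content": raw_results}],
    ).messages[-1]["content"]

    article = client.run(
        agent=technical_writer_agent,
        messages=[{"role": "user", "content": curated_notes}],
    ).messages[-1]["content"]

    return article
```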
The Chat interface

Finally, we create a Streamlit UI to provide a chat interface for our application.

Done!
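A minimal version of that UI could look like the sketch below. It assumes `run_workflow` from the previous step; save the script as app.py and launch it with `streamlit run app.py`.

```python
# Sketch: Streamlit chat interface around the agent workflow.
import streamlit as st

st.title("Multi-Agent Internet Research Assistant")

if query := st.chat_input("What should I research?"):
    st.chat_message("user").write(query)
    with st.spinner("Searching, analyzing, and writing..."):
        article = run_workflow(query)
    st.chat_message("assistant").write(article)
```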
That's a wrap!

If you enjoyed this tutorial:

Find me → @_avichawla

Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.
