Akshay 🚀
Apr 19, 2024 · 11 tweets · 4 min read
Let's build a RAG app using MetaAI's Llama-3 (100% local):
Before we begin, take a look at what we're about to create!

Here's what you'll learn:

- @Ollama for locally serving an LLM (Llama-3)
- @Llama_Index for orchestration
- @Streamlit for building the UI
- @LightningAI for development & hosting

Let's go! 🚀
The architecture diagram presented below illustrates some of the key components & how they interact with each other!

It will be followed by detailed descriptions & code for each component:

[Image: architecture diagram of the RAG pipeline]
1️⃣ & 2️⃣ : Loading the knowledge base

A knowledge base is a collection of relevant and up-to-date information that serves as a foundation for RAG. In our case, it's the docs stored in a directory.

Here's how you can load it as document objects in LlamaIndex:
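The code itself was shared as an image in the original tweet; below is a minimal sketch of what this typically looks like with LlamaIndex's SimpleDirectoryReader (the ./docs path is an assumption, point it at your own folder):

```python
from llama_index.core import SimpleDirectoryReader

# Load every file under ./docs as LlamaIndex Document objects
loader = SimpleDirectoryReader(
    input_dir="./docs",   # assumed location of your knowledge base
    recursive=True,       # also pick up files in sub-folders
)
docs = loader.load_data()
print(f"Loaded {len(docs)} documents")
```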
3️⃣ The embedding model

An embedding is a meaningful representation of text in the form of numbers.

The embedding model is responsible for creating embeddings for the document chunks & user queries.

We are using @SnowflakeDB's `arctic-embed-m`, one of the best models in its class.
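The original code is in the attached image; a sketch of wiring this model into LlamaIndex via the llama-index-embeddings-huggingface integration would look roughly like this (the exact setup in the thread may differ):

```python
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Embed document chunks & user queries with Snowflake's arctic-embed-m,
# pulled from the Hugging Face Hub
Settings.embed_model = HuggingFaceEmbedding(
    model_name="Snowflake/snowflake-arctic-embed-m",
    trust_remote_code=True,
)
```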
4️⃣ Indexing & storing

Embeddings created by the embedding model are stored in a vector store that offers fast retrieval and similarity search by creating an index over our data.

By default, LlamaIndex provides an in-memory vector store that’s great for quick experimentation.
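As a rough sketch (assuming the `docs` loaded earlier and the embedding model configured above), building that in-memory index looks like this:

```python
from llama_index.core import VectorStoreIndex

# Chunk the documents, embed each chunk with the configured embedding model,
# and store the vectors in LlamaIndex's default in-memory vector store
index = VectorStoreIndex.from_documents(docs, show_progress=True)
```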
5️⃣ Creating a prompt template

A custom prompt template is used to refine the response from the LLM & include the retrieved context as well:
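The template in the image isn't reproduced here; the sketch below shows the general shape of a LlamaIndex PromptTemplate for this job (the wording of the instructions is an assumption):

```python
from llama_index.core import PromptTemplate

# {context_str} is filled with the retrieved chunks and {query_str}
# with the user's question at query time
qa_prompt_tmpl = PromptTemplate(
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information above, answer the query crisply. "
    "If the answer is not in the context, say that you don't know.\n"
    "Query: {query_str}\n"
    "Answer: "
)
```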
6️⃣ & 7️⃣ Setting up a query engine

The query engine takes a query string & uses it to fetch relevant context, then sends both as a prompt to the LLM to generate a final natural-language response.

Here's how you set it up:
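Again, the exact code lives in the image; here is a hedged sketch, assuming Llama-3 is served locally through Ollama and the `index` and `qa_prompt_tmpl` objects from the previous steps are in scope:

```python
from llama_index.core import Settings
from llama_index.llms.ollama import Ollama

# Llama-3 served locally by Ollama (assumes `ollama pull llama3` has been run
# and the Ollama server is listening on its default port)
Settings.llm = Ollama(model="llama3", request_timeout=120.0)

# Retrieve the top-k most similar chunks and send them, together with the
# question, to the LLM using the custom prompt
query_engine = index.as_query_engine(similarity_top_k=3)
query_engine.update_prompts(
    {"response_synthesizer:text_qa_template": qa_prompt_tmpl}
)

response = query_engine.query("What do the docs say about X?")
print(response)
```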
8️⃣ The Chat interface

We create a UI using Streamlit to provide a chat interface for our RAG application.

The code for this & everything we've discussed so far is shared in the next tweet!

Check this out 👇
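The full Streamlit app is linked in the next tweet; a minimal sketch of the chat loop, assuming the `query_engine` from the previous step is available, could look like this:

```python
import streamlit as st

st.title("Chat with your docs (Llama-3, 100% local)")

# Keep the chat history across Streamlit reruns
if "messages" not in st.session_state:
    st.session_state.messages = []

# Replay earlier messages
for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])

# Handle a new question
if prompt := st.chat_input("Ask something about your documents"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    # query_engine is the LlamaIndex query engine built in the previous step
    answer = str(query_engine.query(prompt))
    st.session_state.messages.append({"role": "assistant", "content": answer})
    with st.chat_message("assistant"):
        st.markdown(answer)
```

Run it with `streamlit run app.py` (assuming the script is saved as app.py).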
I used @LightningAI⚡️ Studio for developing this application!

You will find all the code & everything you need to run it! ✨

Clone a FREE studio now & take it for a spin...👇
lightning.ai/lightning-ai/s…
If you're interested in:

- Python 🐍
- Machine Learning 🤖
- AI Engineering ⚙️

Find me → @akshay_pachaar ✔️
Every day, I share tutorials on the above topics!

Cheers! 🥂


More from @akshay_pachaar

Dec 18, 2025
Turn any Autoregressive LLM into a Diffusion LM.

dLLM is a Python library that unifies the training & evaluation of diffusion language models.

You can also use it to turn ANY autoregressive LM into a diffusion LM with minimal compute.

100% open-source.
Here's why this matters:

Traditional autoregressive models generate text left-to-right, one token at a time. Diffusion models work differently: they refine the entire sequence iteratively, giving you better control over generation quality and more flexible editing capabilities.
dLLM GitHub:

(don't forget to star 🌟)
github.com/ZHZisZZ/dllm
Dec 6, 2025
You're in a Research Scientist interview at Google.

Interviewer: We have a base LLM that's terrible at maths. How would you turn it into a maths & reasoning powerhouse?

You: I'll get some problems labeled and fine-tune the model.

Interview over.

Here's what you missed:
When outputs are verifiable, labels become optional.

Maths, code, and logic can be automatically checked and validated.

Let's use this fact to build a reasoning model without manual labelling.

We'll use:

- @UnslothAI for parameter-efficient finetuning.
- @HuggingFace TRL to apply GRPO.

Let's go! 🚀
What is GRPO?

Group Relative Policy Optimization is a reinforcement learning method that fine-tunes LLMs for math and reasoning tasks using deterministic reward functions, eliminating the need for labeled data.

Here's a brief overview of GRPO before we jump into code:
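The visual overview and the full code are in the later tweets of the thread; as a hedged sketch of the core idea (the dataset, base model, and reward logic below are illustrative assumptions, and Unsloth's parameter-efficient setup is omitted), a deterministic reward plugged into TRL's GRPOTrainer might look like this:

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Deterministic reward: GSM8K answers end with "#### <number>", so correctness
# can be checked automatically instead of relying on human labels.
def correctness_reward(completions, answer, **kwargs):
    final_answers = [a.split("####")[-1].strip() for a in answer]
    return [1.0 if fa in c else 0.0 for c, fa in zip(completions, final_answers)]

dataset = load_dataset("openai/gsm8k", "main", split="train")
dataset = dataset.map(lambda x: {"prompt": x["question"]})  # GRPOTrainer expects a "prompt" column

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",   # placeholder base model, not the one from the thread
    reward_funcs=correctness_reward,
    args=GRPOConfig(output_dir="grpo-maths"),
    train_dataset=dataset,
)
trainer.train()
```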
Dec 5, 2025
I have been training neural networks for 10 years now.

Here are 16 techniques I actively use to optimize model training:

(detailed explanation ...🧵)
First, let's look at some basic techniques:

1) Use efficient optimizers—AdamW, Adam, etc.

2) Utilize hardware accelerators (GPUs/TPUs).

3) Max out the batch size.

4) Use multi-GPU training through Model/Data/Pipeline/Tensor parallelism.

Check the visual👇
5) Bayesian optimization for hyperparameter optimization:

This technique takes informed steps based on the results of the previous hyperparameter configs.

This way, the model converges to an optimal set of hyperparameters much faster.

Check these results 👇
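The thread doesn't name a specific library; purely as an illustration of the idea, here is a toy Bayesian-style search with Optuna, whose default TPE sampler picks each new trial based on the results of previous ones:

```python
import optuna

# Toy objective standing in for "train the model, return validation loss"
def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    dropout = trial.suggest_float("dropout", 0.0, 0.5)
    # Placeholder for a real train/validate loop:
    return (lr - 1e-3) ** 2 + (dropout - 0.2) ** 2

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```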
Nov 23, 2025
You’re in an ML Engineer interview at Google.

Interviewer: We need to train an LLM across 1,000 GPUs. How would you make sure all GPUs share what they learn?

You: Use a central parameter server to aggregate and redistribute the weights.

Interview over.

Here’s what you missed:
One major run-time bottleneck in multi-GPU training happens during GPU synchronization.

For instance, in multi-GPU training via data parallelism:

- The same model is distributed to different GPUs.
- Each GPU processes a different subset of the whole dataset.

Check this 👇
This leads to different gradients across different devices.

So, before updating the model parameters on each GPU device, we must communicate the gradients to all other devices to sync them.

Let’s understand 2 common strategies next!
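The two strategies themselves are covered in the rest of the thread; purely as an illustration of the gradient-sync step, here is what a manual all-reduce (the collective that PyTorch's DistributedDataParallel uses under the hood) looks like:

```python
import torch
import torch.distributed as dist

# Illustrative sketch only. Assumes the process group was already initialized,
# e.g. by launching the script with `torchrun --nproc_per_node=<num_gpus> train.py`.
def sync_gradients(model: torch.nn.Module) -> None:
    world_size = dist.get_world_size()
    for param in model.parameters():
        if param.grad is not None:
            # Sum this parameter's gradient across every GPU...
            dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)
            # ...then average, so every replica applies the same update
            param.grad /= world_size
```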
Nov 21, 2025
NOBODY wants to send their data to Google or OpenAI.

Yet here we are, shipping proprietary code, customer information, and sensitive business logic to closed-source APIs we don't control.

While everyone's chasing the latest closed-source releases, open-source models are quietly becoming the practical choice for many production systems.

Here's what everyone is missing:

Open-source models are catching up fast, and they bring something the big labs can't: privacy, speed, and control.

I built a playground to test this myself. I used CometML's Opik to evaluate models on real code-generation tasks, testing correctness, readability, and best practices against actual GitHub repos.

Here's what surprised me:

OSS models like MiniMax-M2 and Kimi K2 performed on par with the likes of Gemini 3 and Claude Sonnet 4.5 on most tasks.

But practically, MiniMax-M2 turns out to be the winner: it's twice as fast and 12x cheaper than models like Sonnet 4.5.

Well, this isn't just about saving money.

When your model is smaller and faster, you can deploy it in places closed-source APIs can't reach:

↳ Real-time applications that need sub-second responses
↳ Edge devices where latency kills user experience
↳ On-premise systems where data never leaves your infrastructure

MiniMax-M2 runs with only 10B activated parameters. That efficiency means lower latency, higher throughput, and the ability to handle interactive agents without breaking the bank.

The intelligence-to-cost ratio here changes what's possible.

You're not choosing between quality and affordability anymore. You're not sacrificing privacy for performance. The gap is closing, and in many cases, it's already closed.

If you're building anything that needs to be fast, private, or deployed at scale, it's worth taking a look at what's now available.

MiniMax-M2 is 100% open-source, free for developers right now. I have shared the link to their GitHub repo in the next tweet.

You will also find the code for the playground and evaluations I've done.
@MiniMax__AI GitHub repo for M2:

(don't forget to star 🌟)
github.com/MiniMax-AI/Min…
@MiniMax__AI Find the code for the playground and the evaluation done using @Cometml Opik: github.com/patchy631/ai-e…
Oct 27, 2025
Claude Skills might be the biggest upgrade to AI agents so far!

Some say it's even bigger than MCP.

I've been testing skills for the past 3-4 days, and they're solving a problem most people don't talk about: agents just keep forgetting everything.

In this video, I'll share everything I've learned so far.

It covers:

> The core idea (skills as SOPs for agents)
> Anatomy of a skill
> Skills vs. MCP vs. Projects vs. Subagents
> Building your own skill
> Hands-on example

Skills are the early signs of continual learning, and they can change how we work with agents forever!

Here's everything you need to know:
Skills vs. Projects vs. Subagents: [comparison image]
If you found it insightful, reshare with your network.

Find me → @akshay_pachaar ✔️
For more insights and tutorials on LLMs, AI Agents, and Machine Learning!