MCP is like a USB-C port for your AI applications.
Just as USB-C offers a standardized way to connect devices to various accessories, MCP standardizes how your AI apps connect to different data sources and tools.
Let's dive in! 🚀
At its core, MCP follows a client-server architecture where a host application can connect to multiple servers.
Key components include:
- Host
- Client
- Server
Here's an overview before we dig deep 👇
The Host and Client:
Host: An AI app (Claude desktop, Cursor) that provides an environment for AI interactions, accesses tools and data, and runs the MCP Client.
MCP Client: Operates within the host to enable communication with MCP servers.
Next up, MCP server...👇
The Server
A server exposes specific capabilities and provides access to data.
3 key capabilities:
- Tools: Enable LLMs to perform actions through your server
- Resources: Expose data and content from your servers to LLMs
- Prompts: Create reusable prompt templates and workflows
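To make the three capability types concrete, here's a toy Python sketch (not the official MCP SDK; `ToyServer` and the weather functions are illustrative) of a server registering a tool, a resource, and a prompt:

```python
class ToyServer:
    """Illustrative stand-in for an MCP server's capability registry."""
    def __init__(self, name):
        self.name = name
        self.tools, self.resources, self.prompts = {}, {}, {}

    def tool(self, fn):
        # Tools: actions the LLM can perform through the server
        self.tools[fn.__name__] = fn
        return fn

    def resource(self, uri):
        # Resources: data and content the server exposes to the LLM
        def register(fn):
            self.resources[uri] = fn
            return fn
        return register

    def prompt(self, fn):
        # Prompts: reusable prompt templates and workflows
        self.prompts[fn.__name__] = fn
        return fn

server = ToyServer("weather")

@server.tool
def get_forecast(city: str) -> str:
    return f"Forecast for {city}: sunny"

@server.resource("docs://weather/api")
def api_docs() -> str:
    return "GET /forecast?city=<name>"

@server.prompt
def summarize_weather(city: str) -> str:
    return f"Summarize today's weather in {city} in one sentence."
```

The real SDK does the same job with decorators over a running server process; this sketch only shows how the three capability types map onto plain functions.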
The Client-Server Communication
Understanding client-server communication is essential for building your own MCP client-server.
Let's begin with this illustration and then break it down step by step... 👇
1️⃣ & 2️⃣: Capability exchange
The client sends an initialize request to learn the server's capabilities.
The server responds with its capability details.
e.g., a Weather API server advertises available `tools` to call API endpoints, `prompts`, and API documentation as a `resource`.
3️⃣ Notification
The client then acknowledges the successful connection, and further message exchange continues.
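On the wire, this handshake is JSON-RPC 2.0. A sketch of the three messages (field values are illustrative; exact capability payloads depend on the server):

```python
# 1) Client -> server: initialize request
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",  # illustrative version string
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    },
}

# 2) Server -> client: capability details
initialize_response = {
    "jsonrpc": "2.0",
    "id": 1,  # matches the request id
    "result": {
        "protocolVersion": "2024-11-05",
        "capabilities": {"tools": {}, "resources": {}, "prompts": {}},
        "serverInfo": {"name": "weather-server", "version": "0.1.0"},
    },
}

# 3) Client -> server: acknowledgement; a notification carries no "id"
initialized_notification = {
    "jsonrpc": "2.0",
    "method": "notifications/initialized",
}
```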
Before we wrap, one more key detail...👇
Unlike traditional APIs, the MCP client-server communication is two-way.
Sampling, if needed, allows servers to leverage the client's AI capabilities (LLM completions or generations) without requiring API keys, while the client maintains control over model access and permissions.
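Sampling reverses the usual direction: the server asks the client to run an LLM completion on its behalf, and the client can approve, edit, or reject it. A sketch of what such a request might look like (field values are illustrative):

```python
# Server -> client: ask the client's LLM to generate a completion
sampling_request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "sampling/createMessage",
    "params": {
        "messages": [
            {"role": "user",
             "content": {"type": "text", "text": "Summarize this API response."}}
        ],
        "maxTokens": 200,
        # the client, not the server, ultimately picks the model
        "modelPreferences": {"hints": [{"name": "claude-3"}]},
    },
}

def client_gatekeeper(request, allowed_methods):
    """The client stays in control: it decides whether to honor the request."""
    return request["method"] in allowed_methods
```

The gatekeeper function is only a stand-in for the human-in-the-loop or policy check a real client applies before forwarding anything to its model.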
I hope this clarifies what MCP does.
In the future, I'll explore creating custom MCP servers and building hands-on demos around them.
Over to you! What is your take on MCP and its future?
If you found it insightful, reshare with your network.
Find me → @akshay_pachaar ✔️ for more insights and tutorials on LLMs, AI Agents, and Machine Learning!
Let's build an MCP-powered Agentic RAG (100% local):
Below, we have an MCP-powered Agentic RAG that searches a vector database and falls back to web search if needed.
To build this, we'll use:
- @firecrawl_dev search endpoint for web search.
- @qdrant_engine as the vector DB.
- @cursor_ai as the MCP client.
Let's build it!
Here's how it works:
1) The user inputs a query through the MCP client (Cursor).
2-3) The client contacts the MCP server to select a relevant tool.
4-6) The tool output is returned to the client to generate a response.
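The fallback logic at the heart of this setup can be sketched in plain Python (`search_vector_db` and `web_search` are hypothetical stand-ins for the Qdrant and Firecrawl tools; the `min_score` threshold is an assumption):

```python
def answer(query, search_vector_db, web_search, min_score=0.75):
    """Try the vector DB first; fall back to web search if retrieval is weak."""
    hits = search_vector_db(query)  # -> list of (score, text) pairs
    good = [text for score, text in hits if score >= min_score]
    if good:
        return {"source": "vector_db", "context": good}
    # nothing relevant enough in the index -> fall back to the web
    return {"source": "web_search", "context": web_search(query)}
```

In the actual setup this routing lives behind MCP tools that Cursor calls, but the decision rule is the same: use retrieval when it's confident, search the web when it isn't.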
Function calling & MCP for LLMs, clearly explained (with visuals):
Before MCPs became popular, AI workflows relied on traditional Function Calling for tool access. Now, MCP is standardizing it for Agents/LLMs.
The visual below explains how Function Calling and MCP work under the hood.
Let's learn more!
In Function Calling:
- The LLM receives a prompt.
- The LLM decides the tool.
- The programmer implements a procedure to accept a tool call request and prepare a function call.
- A backend service executes the tool.
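The four steps above can be sketched as a minimal loop (the `fake_llm` decision step stands in for a real model call; the schema follows the common JSON-schema style used by function-calling APIs):

```python
import json

# Tool schema advertised to the LLM (illustrative)
TOOLS = [{
    "name": "get_weather",
    "description": "Get current weather for a city",
    "parameters": {"type": "object",
                   "properties": {"city": {"type": "string"}},
                   "required": ["city"]},
}]

def get_weather(city):
    # stand-in for the backend service that executes the tool
    return f"22C and sunny in {city}"

DISPATCH = {"get_weather": get_weather}

def fake_llm(prompt, tools):
    # stand-in for step 2: the model picks a tool and emits JSON arguments
    return {"tool": "get_weather", "arguments": json.dumps({"city": "Paris"})}

def run(prompt):
    call = fake_llm(prompt, TOOLS)        # the LLM decides the tool
    args = json.loads(call["arguments"])  # programmer parses the tool-call request
    return DISPATCH[call["tool"]](**args) # backend service executes it
```

MCP standardizes the middle two steps: instead of each programmer hand-rolling the request parsing and dispatch, the client and server speak a shared protocol.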
Let's build an MCP server that connects to 200+ data sources (100% local):
Before we dive in, here's a quick demo of what we're building!
Tech stack:
- @MindsDB to power our unified MCP server
- @cursor_ai as the MCP host
- @Docker to self-host the server
Let's go! 🚀
Here's the workflow:
- User submits a query
- Agent connects to MindsDB MCP server to find tools
- Agent selects the appropriate tool based on the user query and calls it
- Finally, the agent returns a contextually relevant response
KV caching in LLMs, clearly explained (with visuals):
KV caching is a technique used to speed up LLM inference.
Before understanding the internal details, look at the inference speed difference in the video:
- with KV caching → 9 seconds
- without KV caching → 42 seconds (~5x slower)
Let's dive in!
To understand KV caching, we must know how LLMs output tokens.
- Transformer produces hidden states for all tokens.
- Hidden states are projected to vocab space.
- The logits of the last token are used to generate the next token.
- Repeat for subsequent tokens.
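Here's a bare-bones sketch of why caching helps: at each decoding step, only the new token's key and value need to be computed; everything from earlier steps is reused (toy single-head attention in plain Python, not a real transformer):

```python
import math

def attend(q, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(q))
              for k in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # weighted sum of value vectors
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

class KVCache:
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, k_new, v_new, q_new):
        # Only the NEW token's K/V are computed this step;
        # all past keys/values come straight from the cache.
        self.keys.append(k_new)
        self.values.append(v_new)
        return attend(q_new, self.keys, self.values)
```

Without the cache, every step would recompute K and V for the entire sequence so far, which is where the ~5x slowdown in the video comes from.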