MCP is like a USB-C port for your AI applications.
Just as USB-C offers a standardized way to connect devices to various accessories, MCP standardizes how your AI apps connect to different data sources and tools.
Let's dive in! 🚀
At its core, MCP follows a client-server architecture where a host application can connect to multiple servers.
Key components include:
- Host
- Client
- Server
Here's an overview before we dig deep 👇
The Host and Client:
Host: An AI app (Claude Desktop, Cursor) that provides an environment for AI interactions, accesses tools and data, and runs the MCP Client.
MCP Client: Operates within the host to enable communication with MCP servers.
Next up, MCP server...👇
The Server
A server exposes specific capabilities and provides access to data.
3 key capabilities:
- Tools: Enable LLMs to perform actions through your server
- Resources: Expose data and content from your servers to LLMs
- Prompts: Create reusable prompt templates and workflows
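To make this concrete, here's a minimal sketch of a server exposing all three capabilities, using the official MCP Python SDK's FastMCP helper (decorator names per the SDK version I know; the weather logic is a stub):

```python
# Minimal MCP server sketch (official Python SDK, FastMCP helper).
# The forecast logic is stubbed; a real server would call an actual API.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")

@mcp.tool()
def get_forecast(city: str) -> str:
    """Tool: lets the LLM perform an action through the server."""
    return f"Forecast for {city}: sunny, 24°C"  # stub

@mcp.resource("docs://weather-api")
def api_docs() -> str:
    """Resource: exposes data/content (here, API docs) to the LLM."""
    return "GET /forecast?city=<name> -> JSON forecast"

@mcp.prompt()
def weather_report(city: str) -> str:
    """Prompt: a reusable prompt template."""
    return f"Write a short weather report for {city}."

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```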
The Client-Server Communication
Understanding client-server communication is essential for building your own MCP client-server.
Let's begin with this illustration and then break it down step by step... 👇
1️⃣ & 2️⃣: Capability exchange
The client sends an initialize request to learn the server's capabilities.
The server responds with its capability details.
e.g., a Weather API server exposes `tools` for calling API endpoints, `prompts`, and its API documentation as a `resource`.
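On the wire, this is plain JSON-RPC 2.0. A sketch of the two messages, shown as Python dicts (the protocolVersion string varies by spec revision):

```python
# 1️⃣ Client -> server: initialize request.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",  # varies by spec revision
        "capabilities": {},
        "clientInfo": {"name": "my-client", "version": "0.1.0"},
    },
}

# 2️⃣ Server -> client: its capability details.
initialize_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "protocolVersion": "2024-11-05",
        "capabilities": {"tools": {}, "resources": {}, "prompts": {}},
        "serverInfo": {"name": "weather", "version": "0.1.0"},
    },
}
```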
3️⃣ Notification
The client then acknowledges the successful connection, and further message exchange continues.
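In practice, the SDK handles this whole handshake for you. A client-side sketch with the official Python SDK (assuming the weather server above lives in weather_server.py):

```python
# Client sketch: connect over stdio, run the init handshake, list tools.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    params = StdioServerParameters(command="python", args=["weather_server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()          # steps 1️⃣-3️⃣ above
            tools = await session.list_tools()  # discover server capabilities
            print([tool.name for tool in tools.tools])

asyncio.run(main())
```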
Before we wrap, one more key detail...👇
Unlike traditional APIs, the MCP client-server communication is two-way.
Sampling, if needed, allows servers to leverage the client's AI capabilities (LLM completions or generations) without requiring API keys, while clients maintain control over model access and permissions.
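For intuition, here's the shape of a sampling request as the server sends it to the client, per the spec's `sampling/createMessage` method (field names hedged to the spec revision I know):

```python
# Server -> client sampling request: "please run this through your LLM".
sampling_request = {
    "jsonrpc": "2.0",
    "id": 42,
    "method": "sampling/createMessage",
    "params": {
        "messages": [
            {"role": "user",
             "content": {"type": "text", "text": "Summarize this forecast."}}
        ],
        "maxTokens": 200,  # the client still controls model choice and approval
    },
}
```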
I hope this clarifies what MCP does.
In the future, I'll explore creating custom MCP servers and building hands-on demos around them.
Over to you! What is your take on MCP and its future?
That's a wrap!
If you enjoyed this breakdown:
Follow me → @akshay_pachaar ✔️
Every day, I share insights and tutorials on LLMs, AI Agents, RAGs, and Machine Learning!
You're in a Research Scientist interview at OpenAI.
The interviewer asks:
"How would you expand the context length of an LLM from 2K to 128K tokens?"
You: "I will fine-tune the model on longer docs with 128K context"
Interview over.
Here's what you missed:
Extending the context window isn't just about larger matrices.
In a traditional transformer, expanding tokens by 8x increases memory needs by 64x due to the quadratic complexity of attention. Refer to the image below!
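Quick back-of-envelope check (the attention score matrix has n² entries per head, so memory scales with n²):

```python
# Memory for attention scores scales with n^2.
def attn_entries(n_tokens: int) -> int:
    return n_tokens * n_tokens

n_short, n_long = 2_048, 131_072                       # 2K vs 128K tokens
print(n_long // n_short)                               # 64x more tokens
print(attn_entries(n_long) // attn_entries(n_short))   # 4096x (= 64^2) more entries
```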
So, how do we manage it?
continue...👇
1) Sparse Attention
It limits the attention computation to a subset of tokens by:
- Using local attention (tokens attend only to their neighbors).
- Letting the model learn which tokens to focus on.
But this comes with a trade-off: you cut computational complexity at the cost of some modeling performance.
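A minimal sketch of the local-attention idea: a causal mask where token i can only attend to the last `window` tokens (NumPy, illustrative only):

```python
import numpy as np

def local_causal_mask(n: int, window: int) -> np.ndarray:
    """True where attention is allowed: causal + within `window` positions."""
    i = np.arange(n)[:, None]  # query positions
    j = np.arange(n)[None, :]  # key positions
    return (j <= i) & (i - j < window)

print(local_causal_mask(n=8, window=3).astype(int))
```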
Everyone's using Cursor, Claude Desktop, and ChatGPT as MCP hosts, but if you're building your own apps that support MCP, you need custom clients.
Here's the problem: Writing MCP clients from scratch is painful and time-consuming.
Today, I'm showing you how to build custom MCP clients in minutes, not hours.
To prove this, I built a fully private, ultimate AI assistant that can:
- Connect to any MCP server
- Automate browser usage
- Scrape web data seamlessly
- Control the terminal of my computer
- Process images, audio, and documents
- Remember everything with knowledge graphs
The secret? mcp-use — a 100% open-source framework that makes MCP integration trivial.
Building custom MCP agents takes 3 steps:
1. Define your MCP server configuration
2. Connect any LLM with the MCP client
3. Deploy your agent
That's it. No complex setup, no proprietary dependencies.
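Here's what those 3 steps look like in code, adapted from mcp-use's docs (the server command and model name are placeholders; swap in your own):

```python
import asyncio
from langchain_openai import ChatOpenAI
from mcp_use import MCPAgent, MCPClient

async def main():
    # Step 1: define your MCP server configuration.
    config = {
        "mcpServers": {
            "playwright": {"command": "npx", "args": ["@playwright/mcp@latest"]}
        }
    }
    client = MCPClient.from_dict(config)

    # Step 2: connect any LLM with the MCP client.
    agent = MCPAgent(llm=ChatOpenAI(model="gpt-4o"), client=client, max_steps=30)

    # Step 3: deploy/run your agent.
    print(await agent.run("Open example.com and summarize the page"))

asyncio.run(main())
```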
The best part? Everything runs locally. Your data stays private, and you control the entire stack.
Full breakdown with code...👇
Let's break this down by exploring each integration and understanding how it works, using code and illustrations:
1️⃣ Stagehand MCP server
We begin by allowing our Agent to control a browser, navigate web pages, take screenshots, etc., using @Stagehanddev MCP.
Below, I asked a weather query, and the Agent autonomously responded to it by initiating a browser session.
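To wire Stagehand into the same setup, the config swap looks roughly like this (the npm package name below is my guess, not confirmed; check @Stagehanddev's docs for the actual server command):

```python
from mcp_use import MCPClient

config = {
    "mcpServers": {
        "stagehand": {
            "command": "npx",
            "args": ["-y", "@browserbasehq/mcp-stagehand"],  # hypothetical package name
        }
    }
}
client = MCPClient.from_dict(config)
```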
There's one rule you cannot miss if you want to do the same!
Here's the full breakdown (with code):
There are primarily 2 factors that determine how well an MCP app works:
- Is the model selecting the right tool?
- Is it correctly preparing the tool call?
Today, let's learn how to evaluate any MCP workflow using @deepeval's MCP evaluations (open-source).
Let's go!
Here's the workflow:
- Integrate the MCP server with the LLM app.
- Send queries and log the tool calls and tool outputs in DeepEval.
- Once done, run the eval to get insights on the MCP interactions.
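A hedged sketch of the logging and eval steps using DeepEval's tool-calling test case API (the dedicated MCP integration may expose different helpers; the names below follow DeepEval's general docs):

```python
from deepeval import evaluate
from deepeval.metrics import ToolCorrectnessMetric
from deepeval.test_case import LLMTestCase, ToolCall

# Log one MCP interaction: the query, the answer, and the tool calls made.
test_case = LLMTestCase(
    input="What's the weather in Paris?",
    actual_output="It's 24°C and sunny in Paris.",
    tools_called=[ToolCall(name="get_forecast", input_parameters={"city": "Paris"})],
    expected_tools=[ToolCall(name="get_forecast")],
)

# Run the eval: did the model select (and prepare) the right tool call?
evaluate(test_cases=[test_case], metrics=[ToolCorrectnessMetric()])
```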