Post

How to get URL link on X (Twitter) App

On the Twitter thread, click on or icon on the bottom
Click again on or Share Via icon
Click on Copy Link to Tweet
Paste it above and click "Unroll Thread"!
More info at Twitter Help

Akshay 🚀

@akshay_pachaar

Jul 20 • 13 tweets • 4 min read • Read on X

Scrolly

MCP security is completely broken!

Let's understand tool poisoning attacks and how to defend against them:

MCP allows AI agents to connect with external tools and data sources through a plugin-like architecture.

It's rapidly taking over the AI agent landscape with millions of requests processed daily.

But there's a serious problem... 👇

1️⃣ What is a Tool Poisoning Attack (TPA)?

When Malicious instructions are hidden within MCP tool descriptions that are:

❌ Invisible to users
✅ Visible to AI models

These instructions trick AI models into unauthorized actions, unnoticed by users.

Here's how the attack works:

AI models see complete tool descriptions (including malicious instructions), while users only see simplified versions in their UI.

First take a look at this malicious tool:

Let me quickly show the attack in action by connecting this server to my cursor IDE.

Check this out👇

Now let's understand a few other ways these attacks can happen and then we'll also talk about solutions...👇

2️⃣ Tool hijacking Attacks:

When multiple MCP servers are connected to same client, a malicious server can poison tool descriptions to hijack behavior of TRUSTED servers.

Here's an example of an email sending server hijacked by another server:

Take a look at these two MCP servers before we actually use them to demonstrate tool hijacking.

`add()` tool in the second server secretly tries to hijack the operation of send email tool in the first server.

Let's see tool hijacking attack in action, again by connecting the above two servers to my cursor IDE!

Check this out👇

3️⃣ MCP Rug Pulls ⚠️

Even worse - malicious servers can change tool descriptions AFTER users have approved them.

Think of it like a trusted app suddenly becoming malware after installation.
This makes the attack even more dangerous and harder to detect.

🛡️Mitigation Strategies:

- Display full tool descriptions in the UI
- Pin (lock) server versions
- Isolate servers from one another
- Add guardrails to block risky actions

Until security issues are fixed, use EXTREME caution with.

Finally, here's a summary of how MCP works and how these attacks can occur. This visual explains it all.

I hope you enjoyed today's post. Stay tuned for more! 🙌

https://x.com/akshay_pachaar/status/1946926773918429249

If you found it insightful, reshare with your network.

Find me → @akshay_pachaar ✔️
For more insights and tutorials on LLMs, AI Agents, and Machine Learning!

https://x.com/akshay_pachaar/status/1946926773918429249

• • •

Missing some Tweet in this thread? You can try to force a refresh

This Thread may be Removed Anytime!

Twitter may remove this content at anytime! Save it as PDF for later use!

More from @akshay_pachaar

Akshay 🚀

@akshay_pachaar

Oct 25

I've been coding in Python for 9 years now.

If I were to start over today, here's a complete roadmap:

While everyone's vibecoding, a few truly understand what's actually happening.

This roadmap matters more now than ever.

So, let's dive in! 🚀

1️⃣ Python bootcamp by @freeCodeCamp

4 hours Python bootcamp with over 46M views!! It covers:

- Installing Python
- Setting up an IDE
- Basic Syntax
- Variables & Datatypes
- Looping in Python
- Exception handling
- Modules & pip
- Mini hands-on projects

Check this out👇

Read 9 tweets

Akshay 🚀

@akshay_pachaar

Oct 20

You're in an ML Engineer interview at OpenAI.

The interviewer asks:

"Our GPT model generates 100 tokens in 42 seconds. How do you make it 5x faster?"

You: "I'll optimize the model architecture and use a better GPU."

Interview over.

Here's what you missed:

The real bottleneck isn't compute—it's redundant computation.

Without KV caching, your model recalculates keys and values for each token, repeating work.

- with KV caching → 9 seconds
- without KV caching → 42 seconds (~5x slower)

Check this out👇

So let's dive in and understand how KV caching works...👇

Read 11 tweets

Akshay 🚀

@akshay_pachaar

Oct 6

You're in a Research Scientist interview at OpenAI.

The interviewer asks:

"How would you expand the context length of an LLM from 2K to 128K tokens?"

You: "I will fine-tune the model on longer docs with 128K context"

Interview over.

Here's what you missed:

Extending the context window isn't just about larger matrices.

In a traditional transformer, expanding tokens by 8x increases memory needs by 64x due to the quadratic complexity of attention. Refer to the image below!

So, how do we manage it?

continue...👇

1) Sparse Attention

It limits the attention computation to a subset of tokens by:

- Using local attention (tokens attend only to their neighbors).
- Letting the model learn which tokens to focus on.

But this has a trade-off between computational complexity and performance.

Read 10 tweets

Akshay 🚀

@akshay_pachaar

Sep 25

Local MCP clients are so underrated!

Everyone's using Cursor, Claude Desktop, and ChatGPT as MCP hosts, but if you're building your own apps that support MCP, you need custom clients.

Here's the problem: Writing MCP clients from scratch is painful and time-consuming.

Today, I'm showing you how to build custom MCP clients in minutes, not hours.

To prove this, I built a fully private, ultimate AI assistant that can:

- Connects to any MCP server
- Automates browser usage
- Scrapes web data seamlessly
- Controls the terminal of my computer
- Processes images, audio, and documents
- Remembers everything with knowledge graphs

The secret? mcp-use — a 100% open-source framework that makes MCP integration trivial.

Building custom MCP agents takes 3 steps:

1. Define your MCP server configuration
2. Connect any LLM with the MCP client
3. Deploy your agent

That's it. No complex setup, no proprietary dependencies.

The best part? Everything runs locally. Your data stays private, and you control the entire stack.

Full breakdown with code...👇

Let's break this down by exploring each integration and understanding how it works, using code and illustrations:

1️⃣ Stagehand MCP server

We begin by allowing our Agent to control a browser, navigate web pages, take screenshots, etc., using @Stagehanddev MCP.

Below, I asked a weather query, and the Agent autonomously responded to it by initiating a browser session.

Check this👇

Read 11 tweets

Akshay 🚀

@akshay_pachaar

Sep 23

Context engineering, clearly explained!

Everybody is talking about context engineering, but no one tells you what it actually means.

Today, I'll explain everything you need to know about context engineering in a step-by-step manner.

Here's an illustrated guide:

So, what is context engineering?

It’s the art and science of delivering the right information, in the right format, at the right time, to your LLM.

Here's a quote by Andrej Karpathy on context engineering...👇

To understand context engineering, it's essential to first understand the meaning of context.

Agents today have evolved into much more than just chatbots.

The graphic below summarizes the 6 types of contexts an agent needs to function properly.

Check this out 👇

Read 11 tweets

Akshay 🚀

@akshay_pachaar

Sep 19

We've all dealt with activation functions while working with neural nets.

- Sigmoid
- Tanh
- ReLu & Leaky ReLu
- Gelu

Ever wondered why they are so important❓🤔

Let me explain... 👇

Before we proceed, I want you to understand something!

You can think of a layer in a neural net as a function & multiple layers make the network a composite function.

Now, a composite function consisting of individual linear functions is also linear.

Check this out👇

We have a simple neural net that does binary classification.

Scenario 1:

- Linear decision boundary
- Linear Activation function

Observe how the neural net is able to quickly learn & loss converges to zero.

Watch this 👇