Akshay 🚀
Simplifying LLMs, AI Agents, RAGs and Machine Learning for you! • Co-founder @dailydoseofds_• BITS Pilani • 3 Patents • ex-AI Engineer @ LightningAI
Jun 13 11 tweets 3 min read
Model Context Protocol (MCP), clearly explained: MCP is like a USB-C port for your AI applications.

Just as USB-C offers a standardized way to connect devices to various accessories, MCP standardizes how your AI apps connect to different data sources and tools.

Let's dive in! 🚀
Jun 11 11 tweets 3 min read
Object-oriented programming in Python, clearly explained: We break it down to 6 important concepts:

- Object 🚘
- Class 🏗️
- Inheritance 🧬
- Encapsulation 🔐
- Abstraction 🎭
- Polymorphism 🌀

Let's take them one-by-one... 🚀
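The six concepts above can be sketched in one short Python example (the `Vehicle`/`Car` names are illustrative, not from the thread):

```python
class Vehicle:                      # Class 🏗️: a blueprint
    def __init__(self, brand):
        self._brand = brand         # Encapsulation 🔐: underscore marks it internal

    def describe(self):             # Abstraction 🎭: callers use this interface,
        return f"{self._brand} goes {self.sound()}"  # not the internals

    def sound(self):
        return "forward"

class Car(Vehicle):                 # Inheritance 🧬: Car reuses Vehicle
    def sound(self):                # Polymorphism 🌀: same method, new behavior
        return "vroom"

car = Car("Tesla")                  # Object 🚘: an instance of a class
print(car.describe())               # → Tesla goes vroom
```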
Jun 5 9 tweets 3 min read
Self-attention in LLMs, clearly explained: Before we start, a quick primer on tokenization!

Raw text → Tokenization → Embedding → Model

Embedding is a meaningful representation of each token (roughly a word) using a bunch of numbers.

This embedding is what we provide as an input to our language models.

Check this👇
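The pipeline above (raw text → tokens → embeddings) can be sketched with a toy vocabulary; the vocabulary, embedding size, and random table here are illustrative, since real models learn these values:

```python
import numpy as np

# Toy vocabulary and embedding table (real tokenizers use subwords, e.g. BPE,
# and real embedding tables are learned during training).
vocab = {"the": 0, "cat": 1, "sat": 2}
rng = np.random.default_rng(0)
embedding_table = rng.standard_normal((len(vocab), 4))  # 4-dim embeddings

def tokenize(text):
    # Raw text → token IDs
    return [vocab[w] for w in text.lower().split()]

token_ids = tokenize("The cat sat")
embeddings = embedding_table[token_ids]   # token IDs → one vector per token
print(embeddings.shape)                   # (3, 4)
```

These per-token vectors are exactly what the model receives as input.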
Jun 3 12 tweets 4 min read
Let's build an MCP-powered Agentic RAG (100% local): Below, we have an MCP-powered Agentic RAG that searches a vector database and falls back to web search if needed.

To build this, we'll use:
- @firecrawl_dev search endpoint for web search.
- @qdrant_engine as the vector DB.
- @cursor_ai as the MCP client.

Let's build it!
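The fallback pattern described above can be sketched in a few lines; the helper functions here are hypothetical stand-ins for the Qdrant and Firecrawl calls, and the threshold is an assumed tuning knob:

```python
# Hypothetical sketch: query the vector DB first, fall back to web
# search when retrieval confidence is low.

def vector_search(query):
    # Stand-in for a Qdrant lookup; returns (docs, best_score)
    return ["local doc about MCP"], 0.42

def web_search(query):
    # Stand-in for the Firecrawl search endpoint
    return ["fresh web result about " + query]

def agentic_retrieve(query, threshold=0.5):
    docs, score = vector_search(query)
    if score >= threshold:
        return docs           # vector DB context is good enough
    return web_search(query)  # otherwise fall back to the web

print(agentic_retrieve("What is MCP?"))
```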
Jun 1 10 tweets 4 min read
Function calling & MCP for LLMs, clearly explained (with visuals): Before MCPs became popular, AI workflows relied on traditional Function Calling for tool access. Now, MCP is standardizing it for Agents/LLMs.

The visual below explains how Function Calling and MCP work under the hood.

Let's learn more!
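The traditional Function Calling flow mentioned above can be sketched in a few lines: the app shows the LLM a tool schema, the LLM replies with a structured call, and the app executes it. The tool name and schema here are illustrative:

```python
import json

# Tool schema shown to the LLM (illustrative):
tools = [{
    "name": "get_weather",
    "parameters": {"city": {"type": "string"}},
}]

def get_weather(city):
    return f"Sunny in {city}"  # stand-in for a real weather API call

# Pretend the LLM returned this structured call after seeing the schema:
llm_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'

call = json.loads(llm_output)
result = {"get_weather": get_weather}[call["name"]](**call["arguments"])
print(result)  # → Sunny in Paris
```

MCP keeps this same idea but standardizes how tools are discovered and invoked, so each app no longer wires up its own schemas.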
May 30 12 tweets 4 min read
Let's build an MCP server that connects to 200+ data sources (100% local): Before we dive in, here's a quick demo of what we're building!

Tech stack:

- @MindsDB to power our unified MCP server
- @cursor_ai as the MCP host
- @Docker to self-host the server

Let's go! 🚀
May 29 11 tweets 4 min read
KV caching in LLMs, clearly explained (with visuals): KV caching is a technique used to speed up LLM inference.

Before understanding the internal details, look at the inference speed difference in the video:

- with KV caching → 9 seconds
- without KV caching → 42 seconds (~5x slower)

Let's dive in!
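The speedup comes from not recomputing keys and values for past tokens. A toy sketch of the idea, with made-up dimensions and random weights (single head, no batching):

```python
import numpy as np

# Toy KV caching: keys/values for past tokens are stored, so each
# decode step only computes K and V for the newest token.

d = 8
rng = np.random.default_rng(1)
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
k_cache, v_cache = [], []

def decode_step(x):              # x: embedding of the newest token, shape (d,)
    q = x @ Wq
    k_cache.append(x @ Wk)       # only the new token's K/V are computed
    v_cache.append(x @ Wv)
    K, V = np.stack(k_cache), np.stack(v_cache)
    scores = q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()     # softmax over all cached positions
    return weights @ V           # attention output for the new token

for _ in range(3):               # cache grows by one entry per step
    out = decode_step(rng.standard_normal(d))
print(len(k_cache))              # → 3
```

Without the cache, every step would recompute K and V for the entire prefix, which is where the ~5x slowdown comes from.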
May 27 14 tweets 5 min read
Let's build an MCP-powered financial analyst (100% local): Before we dive in, here's a quick demo of what we're building!

Tech stack:

- @crewAIInc for multi-agent orchestration
- @Ollama to locally serve DeepSeek-R1 LLM
- @cursor_ai as the MCP host

Let's go! 🚀
May 20 9 tweets 3 min read
5 levels of Agentic AI systems, clearly explained (with visuals): Agentic AI systems don't just generate text; they can make decisions, call functions, and even run autonomous workflows.

The visual explains 5 levels of AI agency—from simple responders to fully autonomous agents.

Let's dive in to learn more about them.
May 17 11 tweets 5 min read
9 MCP, LLM, and AI Agent cheat sheets for AI engineers (with visuals): 1️⃣ Model Context Protocol

MCP is like a USB-C port for your AI applications.

Just as USB-C standardizes device connections, MCP standardizes AI app connections to data sources and tools.

Here's my detailed thread about it👇
May 16 14 tweets 4 min read
Let's build an MCP-powered synthetic data generator (100% local): Today, we're building an MCP server that every data scientist will love to have.

Tech stack:

- @cursor_ai as the MCP host
- @datacebo's SDV to generate realistic tabular synthetic data

Let's go! 🚀
May 15 14 tweets 5 min read
Let's build a multi-agent book writer, powered by Qwen3 (100% local): Today, we are building an Agentic workflow that writes a 20k word book from a 3-5 word book title.

Tech stack:
- @firecrawl_dev for web scraping.
- @crewAIInc for orchestration.
- @ollama to serve Qwen 3 locally.
- @LightningAI for development and hosting

Let's go! 🚀
May 9 7 tweets 3 min read
Traditional RAG vs. Agentic RAG, clearly explained (with visuals): Traditional RAG has many issues:

- It retrieves once and generates once. If the context isn't enough, it cannot dynamically search for more info.

- It cannot reason through complex queries.

- The system can't modify its strategy based on the problem.
May 5 13 tweets 4 min read
How LLMs work, clearly explained: Before diving into LLMs, we must understand conditional probability.

Let's consider a population of 14 individuals:

- Some of them like Tennis 🎾
- Some like Football ⚽️
- A few like both 🎾 ⚽️
- And a few like neither

Here's how it looks 👇
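A worked example of conditional probability with an assumed split of the 14 individuals (the exact counts below are illustrative, not from the thread):

```python
# Assumed split: 6 like tennis, 8 like football, 3 like both, 14 total.
tennis, football, both, total = 6, 8, 3, 14

# P(Football | Tennis): of the people who like tennis,
# what fraction also like football?
p_football_given_tennis = both / tennis
print(p_football_given_tennis)  # → 0.5

# LLMs apply the same idea to words: they predict the next token
# given the tokens seen so far, i.e. P(next token | context).
```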
May 4 7 tweets 3 min read
5 amazing Jupyter Notebook tricks not known to many: 1️⃣ Retrieve a cell's output in Jupyter

If you often forget to assign the results of a Jupyter cell to a variable, you can use the `Out` dictionary to retrieve the output.
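A quick sketch of what this looks like in a session (cell numbers are illustrative):

```
In [1]: 2 + 3
Out[1]: 5

In [2]: Out[1] * 10   # retrieve cell 1's result without re-running it
Out[2]: 50

In [3]: _1            # shorthand: _N also refers to Out[N]
Out[3]: 5
```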
Apr 30 9 tweets 3 min read
Let's fine-tune DeepMind's latest Gemma 3 (100% locally): Before we begin, here's what we'll be doing.

We'll fine-tune our private and locally running Gemma 3.

To do this, we'll use:
- @UnslothAI for efficient fine-tuning.
- @ollama to run it locally.

Let's begin!
Apr 26 16 tweets 5 min read
Let's build an MCP-powered multi-agent deep researcher (100% local): Before we dive in, here's a quick demo of what we're building!

Tech stack:

- @Linkup_platform for deep web research
- @crewAIInc for multi-agent orchestration
- @Ollama to locally serve DeepSeek
- @cursor_ai as MCP host

Let's go! 🚀
Apr 21 10 tweets 4 min read
Transformer vs. Mixture of Experts in LLMs, clearly explained (with visuals): Mixture of Experts (MoE) is a popular architecture that uses different "experts" to improve Transformer models.

The visual below explains how they differ from Transformers.

Let's dive in to learn more about MoE!
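The core MoE idea, a gate that routes each token to only a few experts, can be sketched in a toy example; the dimensions, expert count, and random weights here are illustrative:

```python
import numpy as np

# Toy MoE routing: a gate scores the experts per token and only the
# top-k experts run, so most parameters stay idle on each forward pass.

d, n_experts, top_k = 4, 4, 2
rng = np.random.default_rng(2)
gate_W = rng.standard_normal((d, n_experts))
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]

def moe_layer(x):  # x: one token's hidden state, shape (d,)
    logits = x @ gate_W
    chosen = np.argsort(logits)[-top_k:]          # pick the top-k experts
    probs = np.exp(logits[chosen])
    probs /= probs.sum()                          # renormalize over chosen
    return sum(p * (x @ experts[i]) for p, i in zip(probs, chosen))

out = moe_layer(rng.standard_normal(d))
print(out.shape)  # → (4,)
```

A dense Transformer layer would instead push every token through one big feed-forward block; MoE trades that for many smaller experts plus a router.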
Apr 17 12 tweets 5 min read
10 MCP, AI Agents, and RAG projects for AI Engineers (with code): 1️⃣ Real-time Voice RAG Agent

In this project you'll learn how to build a real-time Voice RAG Agent.

You will also learn how to clone your voice in just 5 seconds.

Check the full breakdown (with code) below 👇
Apr 15 8 tweets 3 min read
MCP vs A2A (Agent2Agent) protocol, clearly explained: Agentic applications require both A2A and MCP.

- MCP provides agents with access to tools.
- A2A allows agents to connect with other agents and collaborate in teams.

Today, I'll clearly explain what A2A is and how it can work with MCP.
Apr 13 8 tweets 3 min read
Traditional RAG vs. Graph RAG, clearly explained (with visuals): Top-k retrieval in RAG rarely works.

Imagine you want to summarize a biography where each chapter details a specific accomplishment of an individual.

Traditional RAG struggles here because it retrieves only the top-k chunks, while the task needs the entire context.