MCP is like a USB-C port for your AI applications.
Just as USB-C offers a standardized way to connect devices to various accessories, MCP standardizes how your AI apps connect to different data sources and tools.
Let's dive in! 🚀
At its core, MCP follows a client-server architecture where a host application can connect to multiple servers.
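To make that architecture concrete, here's a toy sketch in Python: one host connected to multiple servers, exchanging JSON-RPC 2.0 messages (MCP's wire format). The method names `tools/list` and `tools/call` follow the real protocol, but the `ToyServer`/`ToyHost` classes and everything else here are simplified illustrations, not the official MCP SDK.

```python
import json

class ToyServer:
    """One MCP-style server exposing a set of tools."""
    def __init__(self, name, tools):
        self.name = name
        self.tools = tools  # {tool_name: callable}

    def handle(self, request):
        # Dispatch a JSON-RPC 2.0 request to the right capability.
        req = json.loads(request)
        if req["method"] == "tools/list":
            result = {"tools": sorted(self.tools)}
        elif req["method"] == "tools/call":
            params = req["params"]
            result = {"content": self.tools[params["name"]](**params["arguments"])}
        else:
            return json.dumps({"jsonrpc": "2.0", "id": req["id"],
                               "error": {"code": -32601, "message": "method not found"}})
        return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})

class ToyHost:
    """One host application connected to multiple servers."""
    def __init__(self):
        self.servers = {}

    def connect(self, server):
        self.servers[server.name] = server

    def call_tool(self, server_name, tool, arguments):
        request = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/call",
                              "params": {"name": tool, "arguments": arguments}})
        response = json.loads(self.servers[server_name].handle(request))
        return response["result"]["content"]

# The "USB-C" idea: the host speaks one protocol to very different servers.
host = ToyHost()
host.connect(ToyServer("weather", {"get_temp": lambda city: f"{city}: 21C"}))
host.connect(ToyServer("math", {"add": lambda a, b: a + b}))

print(host.call_tool("weather", "get_temp", {"city": "Paris"}))  # Paris: 21C
print(host.call_tool("math", "add", {"a": 2, "b": 3}))           # 5
```

Note how neither server knows anything about the other: each just answers the standard methods, which is what lets one host plug into many data sources.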
Let's build a Multimodal RAG app over complex webpages using DeepSeek's Janus-Pro (running locally):
The video shows the multimodal RAG app we'll build with a SOTA tech stack.
We'll use:
- ColiVara's SOTA document understanding and retrieval to index webpages.
- @firecrawl_dev for reliable scraping.
- @huggingface transformers to locally run DeepSeek Janus.
Let's build it!
Here's an overview of our app:
1-2) Generate a PDF of web page screenshots with Firecrawl. 3) Index it on ColiVara for SOTA retrieval.
4-5) Query the ColiVara client to retrieve context.
6-7) Use DeepSeek Janus Pro as the LLM to generate a response.
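The seven steps above can be sketched end-to-end in Python. To keep it self-contained, the external services are stubbed out: `scrape_to_pdf`, `index_pdf`, `retrieve`, and `generate` are hypothetical stand-ins for the Firecrawl, ColiVara, and Janus-Pro calls, not their real APIs.

```python
def scrape_to_pdf(urls):
    # 1-2) Firecrawl would screenshot each page into a PDF; we fake page images.
    return [f"screenshot_of_{u}" for u in urls]

def index_pdf(pages):
    # 3) ColiVara would index the PDF for visual retrieval; we keep a list.
    return list(pages)

def retrieve(index, query, k=2):
    # 4-5) ColiVara would rank pages against the query; we match substrings.
    hits = [p for p in index if query.lower() in p.lower()]
    return hits[:k] or index[:k]

def generate(query, context):
    # 6-7) DeepSeek Janus-Pro would answer directly from the page images.
    return f"Answer to '{query}' using {len(context)} retrieved page(s)"

index = index_pdf(scrape_to_pdf(["example.com/pricing", "example.com/docs"]))
print(generate("pricing", retrieve(index, "pricing")))
```

The key design point survives the stubbing: retrieval happens over page *screenshots*, so tables, charts, and layout reach the multimodal LLM intact instead of being flattened to text.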
Before we dive in, here's a quick demo of our agentic workflow!
Tech stack:
- @Llama_Index workflows for orchestration
- @Linkup_platform for deep web search
- @Cometml's Opik to trace and monitor
- @Qdrant_engine to self-host vectorDB
Let's go! 🚀
Here's an overview of what the app does:
- First, search the docs with the user query
- Use an LLM to evaluate whether the retrieved context is relevant
- Keep only the relevant context
- Do a web search if needed
- Aggregate the context & generate the response
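The loop above can be sketched as plain Python. The vector store, the relevance-judging LLM, and the Linkup web search are all stubbed out with simple functions; these are placeholders for illustration, not the real LlamaIndex workflow API.

```python
DOCS = ["qdrant supports payload filtering", "opik traces llm calls"]

def search_docs(query):
    # Step 1: retrieve from the (stubbed) vector DB — Qdrant in the real app.
    return [d for d in DOCS if any(w in d for w in query.lower().split())]

def is_relevant(query, chunk):
    # Step 2: an LLM would judge relevance; we approximate with word overlap.
    return any(w in chunk for w in query.lower().split())

def web_search(query):
    # Step 4: fallback deep web search — Linkup in the real app.
    return [f"web result for '{query}'"]

def answer(query):
    context = [c for c in search_docs(query) if is_relevant(query, c)]  # steps 1-3
    if not context:                                                     # step 4
        context = web_search(query)
    return f"response from {len(context)} source(s): {context[0]}"      # step 5

print(answer("qdrant filtering"))
print(answer("latest release date"))
```

The point of the relevance-check step is that it gates the fallback: web search only fires when the docs genuinely have nothing useful, which keeps answers grounded in local knowledge first.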