Let's build a hybrid agentic workflow (Text2SQL + RAG):
Before we dive in, here's a quick demo of what we're building!
Tech stack:
- @Llama_Index for orchestration
- @Milvusio to self-host a vectorDB
- @CleanlabAI to validate the response
- @OpenRouterAI to access the latest Qwen3
Let's go! 🚀
Here's how our app works:
- LLM processes the query to select a tool
- Converts the query into the right format (text/SQL)
- Executes the tool and fetches the output
- Generates a response with enriched context
- Validates the response using Cleanlab's Codex
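The steps above can be sketched in plain Python. This is a minimal sketch of the routing logic only; the tool functions, keyword-based tool selector, and validation check below are hypothetical stand-ins for the real LlamaIndex tools, Milvus retriever, and Cleanlab Codex validator:

```python
def vector_search(query: str) -> str:
    """Stand-in for a Milvus-backed semantic search (RAG) tool."""
    return f"[docs retrieved for: {query}]"

def text2sql(query: str) -> str:
    """Stand-in for a Text2SQL tool: converts the question to SQL and runs it."""
    sql = f"SELECT * FROM sales  -- toy SQL for: {query}"
    return f"[rows returned by: {sql}]"

TOOLS = {"rag": vector_search, "sql": text2sql}

def select_tool(query: str) -> str:
    """Stand-in for the LLM's tool-selection step (here: a keyword heuristic)."""
    sql_hints = ("revenue", "count", "total", "table")
    return "sql" if any(w in query.lower() for w in sql_hints) else "rag"

def answer(query: str) -> str:
    tool = select_tool(query)                # 1. select a tool
    context = TOOLS[tool](query)             # 2-3. format query, execute, fetch output
    response = f"Answer based on {context}"  # 4. generate with enriched context
    assert response                          # 5. stand-in for Codex validation
    return response

print(answer("What was total revenue last quarter?"))
```

In the real app, the LLM itself performs step 1 by reasoning over tool descriptions; the heuristic here just keeps the sketch runnable.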
I have been fine-tuning LLMs for more than 2 years now!
Here are the top 5 LLM fine-tuning techniques, explained with visuals:
Traditional full fine-tuning is impractical for LLMs: they have billions of parameters and weigh hundreds of GBs, so updating every weight demands serious compute.
Since that kind of compute isn't accessible to everyone, parameter-efficient fine-tuning (PEFT) came into existence.
Today, we’ll cover the top 5 PEFT techniques, step by step.
Some background!
LLM weights are matrices of numbers adjusted during finetuning.
Most PEFT techniques involve finding a lower-rank adaptation of these matrices: a much smaller pair of matrices whose product can still capture the update the original matrix needs.
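A quick back-of-the-envelope calculation shows why this matters. For a weight matrix W of shape (d, k), full fine-tuning updates d*k parameters; a low-rank adapter (as in LoRA) instead learns B of shape (d, r) and A of shape (r, k) with r much smaller than d and k, and uses W + BA as the adapted weight. The shapes below are illustrative, not tied to any specific model:

```python
# One attention projection in a 7B-class model (illustrative shapes).
d, k = 4096, 4096
r = 8  # adapter rank, a typical hyperparameter choice

full_params = d * k          # parameters touched by full fine-tuning
lora_params = d * r + r * k  # parameters in the B and A adapter matrices

print(f"full fine-tuning: {full_params:,} params")   # 16,777,216
print(f"low-rank adapter: {lora_params:,} params")   # 65,536
print(f"{full_params // lora_params}x fewer")        # 256x fewer
```

The frozen base weights stay untouched; only the tiny B and A matrices are trained, which is why PEFT fits on hardware that full fine-tuning never could.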
Let's build a "Chat with your Code" RAG app using Qwen3-Coder:
Before we begin, take a look at what we're about to create!
Tech stack:
- @Llama_Index for orchestration
- @Milvusio to self-host a vectorDB
- @CleanlabAI codex to validate the response
- @OpenRouterAI to access @Alibaba_Qwen 3 Coder
Let's go! 🚀
The architecture diagram below illustrates the key components and how they interact with each other!
Detailed descriptions and code for each component follow:
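Before the component-by-component walkthrough, here is a toy sketch of the retrieve-then-generate loop at the heart of the app. Real embeddings come from an embedding model and live in Milvus, and generation goes through Qwen3-Coder via OpenRouter; the tiny hand-written vectors and stand-in functions below are assumptions made purely so the flow is runnable anywhere:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Pretend each code chunk has already been embedded (vectors are made up).
INDEX = [
    ("def parse_config(path): ...", [0.9, 0.1, 0.0]),
    ("class Retriever: ...",        [0.1, 0.9, 0.1]),
    ("def train_loop(model): ...",  [0.0, 0.2, 0.9]),
]

def retrieve(query_vec, top_k=1):
    """Return the top_k most similar chunks (Milvus does this at scale)."""
    ranked = sorted(INDEX, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]

def generate(query, chunks):
    """Stand-in for the LLM call: answer grounded in retrieved code."""
    return f"Answering '{query}' using context: {chunks}"

print(generate("How is retrieval implemented?", retrieve([0.1, 0.95, 0.05])))
```

Swapping the fake index for a Milvus collection and `generate` for an OpenRouter chat call turns this sketch into the app's core loop.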