Let's compare OpenAI gpt-oss and Qwen-3 on maths & reasoning:
Before we dive in, here's a quick demo of what we're building!
Tech stack:
- @LiteLLM for orchestration
- @Cometml's Opik to build the eval pipeline (open-source)
- @OpenRouterAI to access the models
You'll also learn about G-Eval & building custom eval metrics.
Let's go! 🚀
Here's the workflow:
- User submits query
- Both models generate reasoning tokens along with the final response
- Query, response, and reasoning logic are sent for evaluation
- Detailed evaluation is conducted using Opik's G-Eval across four metrics
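The evaluation step above can be sketched in plain Python. This is a minimal, framework-free illustration of the G-Eval idea (an LLM judge scores a response against written criteria); `llm_judge` is a hypothetical stub standing in for a real judge model, and the four metric names are assumed for illustration — in the actual pipeline, Opik's built-in G-Eval metric does this work.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical judge: a real pipeline would call an LLM here
# (e.g. via LiteLLM); this deterministic stub just illustrates the shape.
def llm_judge(prompt: str) -> float:
    return 0.8  # pretend the judge returns a score in [0, 1]

@dataclass
class GEvalMetric:
    """A G-Eval-style metric: a metric name plus evaluation criteria
    that an LLM judge scores on a fixed scale."""
    name: str
    criteria: str

    def score(self, query: str, reasoning: str, response: str,
              judge: Callable[[str], float] = llm_judge) -> float:
        prompt = (
            f"Task: evaluate the answer below on '{self.name}'.\n"
            f"Criteria: {self.criteria}\n"
            f"Query: {query}\nReasoning: {reasoning}\nAnswer: {response}\n"
            "Return a score between 0 and 1."
        )
        return judge(prompt)

# Four example metrics (names assumed for illustration).
metrics = [
    GEvalMetric("correctness", "Is the final answer mathematically correct?"),
    GEvalMetric("coherence", "Does the reasoning follow logically step to step?"),
    GEvalMetric("completeness", "Are all parts of the query addressed?"),
    GEvalMetric("conciseness", "Is the reasoning free of redundant steps?"),
]

scores = {m.name: m.score("What is 17 * 23?", "17 * 23 = 391", "391")
          for m in metrics}
print(scores)
```

Each model's (query, reasoning, response) triple gets scored on all four metrics, so the two models can be compared metric by metric rather than on a single aggregate number.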
Tech giants use Multimodal RAG every day in production!
- Spotify uses it to answer music queries
- YouTube uses it to turn prompts into tracks
- Amazon Music uses it to create playlists from prompts
Let's learn how to build a Multimodal Agentic RAG (with code):
Today, we'll build a multimodal Agentic RAG that can query documents and audio files using the user's speech.
Tech stack:
- @AssemblyAI for transcription.
- @milvusio as the vector DB.
- @beam_cloud for deployment.
- @crewAIInc Flows for orchestration.
Let's build it!
Here's the workflow:
- User inputs data (audio + docs).
- AssemblyAI transcribes the audio files.
- Transcribed text & docs are embedded in the Milvus vector DB.
- Research Agent retrieves relevant info based on the user query.
- Response Agent uses it to craft a response.
Let's build a (Text2SQL + RAG), hybrid agentic workflow:
Before we dive in, here's a quick demo of what we're building!
Tech stack:
- @Llama_Index for orchestration
- @Milvusio to self-host a vectorDB
- @CleanlabAI to validate the response
- @OpenRouterAI to access the latest Qwen3
Let's go! 🚀
Here's how our app works:
- LLM processes the query to select a tool
- Converts the query into the right format (text/SQL)
- Executes the tool and fetches the output
- Generates a response with enriched context
- Validates the response using Cleanlab's Codex
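The routing loop above can be sketched as follows. This is a minimal sketch with several stand-ins: a rule-based `route` replaces the LLM's tool-selection step, `sql_tool` uses a hard-coded query where the real app has the LLM generate SQL, and the validation step is a bare assertion where the real app calls Cleanlab's Codex.

```python
import sqlite3

# Tiny in-memory database standing in for the app's SQL backend.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
db.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 25.5)])

# Tiny document store standing in for the RAG retrieval path.
DOCS = {"refund policy": "Refunds are issued within 14 days."}

def text_tool(query: str) -> str:
    # RAG path: look up the relevant document.
    return next((v for k, v in DOCS.items() if k in query.lower()),
                "no document found")

def sql_tool(query: str) -> str:
    # Text2SQL path: a real app would have the LLM translate the
    # natural-language query into this SQL statement.
    total = db.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
    return f"total order amount: {total}"

def route(query: str) -> str:
    # Rule-based stand-in for the LLM's tool-selection step.
    aggregate_words = ("total", "sum", "count")
    tool = sql_tool if any(w in query.lower() for w in aggregate_words) else text_tool
    answer = tool(query)
    # Validation step: a real app would call Cleanlab's Codex here.
    assert answer, "empty response failed validation"
    return answer

print(route("What is the total of all orders?"))
print(route("What is the refund policy?"))
```

The point of the hybrid design is this fork: aggregate questions go to the SQL tool, open-ended ones go to retrieval, and both paths funnel through the same validation step before the response reaches the user.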