Let's build a Browser Automation Agent using gpt-oss (100% local):
The browser is still the most universal interface, with 4.3 billion pages visited every day!
Here's a quick demo of how we can completely automate it!
Tech stack:
- @stagehanddev for open-source AI browser automation
- @crewAIInc for orchestration
- @ollama to run gpt-oss
Let's go!
System overview:
- User enters an automation query.
- Planner Agent creates an automation plan.
- The Browser Automation Agent executes it using the Stagehand tool.
- The Response Agent generates a response.
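Here's a minimal sketch of the crew wiring, assuming gpt-oss is served locally by Ollama (the model tag and the Stagehand helper import are placeholders, not the exact project code):
```python
from crewai import Agent, Task, Crew, LLM
from crewai.tools import tool

# gpt-oss runs 100% locally via Ollama (run `ollama pull gpt-oss:20b` first;
# the exact model tag is an assumption).
llm = LLM(model="ollama/gpt-oss:20b", base_url="http://localhost:11434")

@tool("Stagehand Browser Tool")
def stagehand_tool(instruction: str) -> str:
    """Execute one natural-language browser step and report the result."""
    # Hypothetical wrapper: in the real build this would call Stagehand's
    # act()/extract() on a live browser session.
    from browser_helper import run_step  # assumed helper, not a real package
    return run_step(instruction)

planner = Agent(
    role="Planner Agent",
    goal="Turn the user's automation query into an ordered list of browser steps",
    backstory="You design reliable web automation plans.",
    llm=llm,
)
executor = Agent(
    role="Browser Automation Agent",
    goal="Carry out each planned step in the browser",
    backstory="You operate the browser through the Stagehand tool.",
    tools=[stagehand_tool],
    llm=llm,
)
responder = Agent(
    role="Response Agent",
    goal="Summarize what happened and answer the original query",
    backstory="You write clear, concise final reports.",
    llm=llm,
)

crew = Crew(
    agents=[planner, executor, responder],
    tasks=[
        Task(description="Plan browser steps for: {query}",
             expected_output="A numbered list of steps", agent=planner),
        Task(description="Execute the planned steps one by one",
             expected_output="The outcome of each step", agent=executor),
        Task(description="Write the final answer for the user",
             expected_output="A short summary", agent=responder),
    ],
)
print(crew.kickoff(inputs={"query": "Open Hacker News and grab the top story's title"}))
```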
Let's compare GPT-5 and Claude Opus 4.1 for code generation:
Today, we're building a CodeArena, where you can compare any two code-gen models side-by-side.
Tech stack:
- @LiteLLM for orchestration
- @Cometml's Opik to build the eval pipeline
- @OpenRouterAI to access cutting-edge models
- @LightningAI for hosting CodeArena
Let's go!
Here's the workflow:
- Choose models for code generation comparison
- Import a GitHub repository and provide it as context to both LLMs
- Use context + query to generate code from both models
- Evaluate generated code using Opik's G-Eval
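A minimal sketch of that loop in Python; the OpenRouter model slugs below are assumptions (check openrouter.ai for the exact IDs):
```python
import os
from litellm import completion
from opik.evaluation.metrics import GEval

os.environ["OPENROUTER_API_KEY"] = "sk-or-..."  # your key

MODELS = {  # model slugs are assumptions; verify on openrouter.ai
    "GPT-5": "openrouter/openai/gpt-5",
    "Claude Opus 4.1": "openrouter/anthropic/claude-opus-4.1",
}

def generate_code(model_id: str, repo_context: str, query: str) -> str:
    resp = completion(
        model=model_id,
        messages=[
            {"role": "system", "content": f"Repository context:\n{repo_context}"},
            {"role": "user", "content": query},
        ],
    )
    return resp.choices[0].message.content

# One G-Eval judge scores both candidates on the same rubric.
judge = GEval(
    task_introduction="You are judging code generated for a user query, given repo context.",
    evaluation_criteria="Score correctness, readability, and faithful use of the repo context.",
)

repo_ctx = "..."  # flattened files from the imported GitHub repo
query = "Add a retry wrapper around the HTTP client."

for name, model_id in MODELS.items():
    code = generate_code(model_id, repo_ctx, query)
    result = judge.score(output=f"Query: {query}\n\nCandidate code:\n{code}")
    print(f"{name}: {result.value:.2f} ({result.reason})")
```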
Let's compare OpenAI gpt-oss and Qwen-3 on maths & reasoning:
Before we dive in, here's a quick demo of what we're building!
Tech stack:
- @LiteLLM for orchestration
- @Cometml's Opik to build the eval pipeline (open-source)
- @OpenRouterAI to access the models
You'll also learn about G-Eval & building custom eval metrics.
Let's go!
Here's the workflow:
- User submits query
- Both models generate reasoning tokens along with the final response
- Query, response and reasoning logic are sent for evaluation
- Detailed evaluation is conducted using Opik's G-Eval across four metrics
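The four metrics aren't spelled out here, but each is a custom G-Eval metric built the same way. A minimal sketch with one illustrative metric (model slugs are assumptions; how litellm surfaces reasoning tokens can vary by provider):
```python
from litellm import completion
from opik.evaluation.metrics import GEval

# OpenRouter slugs are assumptions; check openrouter.ai for the exact IDs.
MODELS = ["openrouter/openai/gpt-oss-120b", "openrouter/qwen/qwen3-235b-a22b"]

def solve(model_id: str, query: str) -> tuple[str, str]:
    resp = completion(model=model_id, messages=[{"role": "user", "content": query}])
    msg = resp.choices[0].message
    # Reasoning tokens show up as `reasoning_content` when the provider
    # returns them; fall back to an empty string otherwise.
    return msg.content, getattr(msg, "reasoning_content", "") or ""

# One illustrative custom metric; the other three follow the same pattern.
logical_soundness = GEval(
    task_introduction="You are evaluating a model's answer to a math/reasoning query.",
    evaluation_criteria="Is every reasoning step logically valid, and does the chain support the final answer?",
)

query = "A train covers 60 km in 45 minutes. What is its average speed in km/h?"
for model_id in MODELS:
    answer, reasoning = solve(model_id, query)
    score = logical_soundness.score(
        output=f"Query: {query}\nReasoning: {reasoning}\nFinal answer: {answer}"
    )
    print(model_id, score.value)
```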
Tech giants use Multimodal RAG every day in production!
- Spotify uses it to answer music queries
- YouTube uses it to turn prompts into tracks
- Amazon Music uses it to create playlists from prompts
Let's learn how to build a Multimodal Agentic RAG (with code):
Today, we'll build a multimodal Agentic RAG that can query documents and audio files using the user's speech.
Tech stack:
- @AssemblyAI for transcription.
- @milvusio as the vector DB.
- @beam_cloud for deployment.
- @crewAIInc Flows for orchestration.
Let's build it!
Here's the workflow:
- User inputs data (audio + docs).
- AssemblyAI transcribes the audio files.
- Transcribed text & docs are embedded in the Milvus vector DB.
- The Research Agent retrieves relevant info for the user's query.
- Response Agent uses it to craft a response.
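Here's a minimal sketch of the ingest + retrieve core, assuming a local Milvus Lite file and a sentence-transformers embedder (both are my stand-ins, not confirmed from the build):
```python
import assemblyai as aai
from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer

aai.settings.api_key = "YOUR_ASSEMBLYAI_KEY"

# 1) Transcribe the audio input with AssemblyAI.
transcript = aai.Transcriber().transcribe("meeting.mp3")  # example file
chunks = [transcript.text]  # in practice: chunk the docs + transcripts

# 2) Embed and index everything in Milvus (Milvus Lite local file;
#    the embedding model is an assumption).
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim vectors
client = MilvusClient("rag.db")
client.create_collection(collection_name="knowledge", dimension=384)
client.insert(
    collection_name="knowledge",
    data=[
        {"id": i, "vector": vec.tolist(), "text": chunk}
        for i, (chunk, vec) in enumerate(zip(chunks, embedder.encode(chunks)))
    ],
)

# 3) Retrieval step the Research Agent runs with the (transcribed) user query.
def retrieve(query: str, k: int = 3) -> list[str]:
    hits = client.search(
        collection_name="knowledge",
        data=[embedder.encode([query])[0].tolist()],
        limit=k,
        output_fields=["text"],
    )
    return [hit["entity"]["text"] for hit in hits[0]]

print(retrieve("What did the speaker say about pricing?"))
```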