There's one rule you cannot miss if you want to do the same!
Here's the full breakdown (with code):
There are primarily 2 factors that determine how well an MCP app works:
- Is the model selecting the right tool?
- Is it correctly preparing the tool call?
Today, let's learn how to evaluate any MCP workflow using @deepeval's MCP evaluations (open-source).
Let's go!
Here's the workflow:
- Integrate the MCP server with the LLM app.
- Send queries and log the tool calls and tool outputs in DeepEval.
- Once done, run the eval to get insights on the MCP interactions.
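To make the workflow concrete, here's a minimal sketch of the logging-and-eval loop in plain Python. The names (`ToolCall`, `Interaction`, `evaluate_interaction`, `get_weather`) are hypothetical stand-ins for illustration, not DeepEval's actual API; they just capture the two factors above (right tool, right arguments):

```python
# Illustrative sketch of an MCP-style eval loop.
# All names here are hypothetical, not DeepEval's real API.
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    name: str
    args: dict

@dataclass
class Interaction:
    query: str
    expected_tool: str
    expected_args: dict
    calls: list = field(default_factory=list)  # tool calls logged during the run

def evaluate_interaction(it: Interaction) -> dict:
    # Factor 1: did the model select the right tool at all?
    selected = any(c.name == it.expected_tool for c in it.calls)
    # Factor 2: did it also prepare the call with the right arguments?
    prepared = any(
        c.name == it.expected_tool and c.args == it.expected_args
        for c in it.calls
    )
    return {"tool_selected": selected, "args_correct": prepared}

# Send a query, log the tool call the model made, then run the eval.
it = Interaction(
    query="What's the weather in Paris?",
    expected_tool="get_weather",
    expected_args={"city": "Paris"},
)
it.calls.append(ToolCall("get_weather", {"city": "Paris"}))
result = evaluate_interaction(it)
```

In a real setup, DeepEval handles the logging and scoring for you; the point of the sketch is only that both checks run over the logged interaction, not per query in isolation.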
Let's build a reasoning LLM, from scratch (100% local):
Today, we're going to learn how to turn any model into a reasoning powerhouse.
We'll do so without any labeled data or human intervention, using Reinforcement Finetuning (GRPO)!
Tech stack:
- @UnslothAI for efficient fine-tuning
- @HuggingFace TRL to apply GRPO
Let's go! 🚀
What is GRPO?
Group Relative Policy Optimization is a reinforcement learning method that fine-tunes LLMs for math and reasoning tasks using deterministic reward functions, eliminating the need for labeled data.
Here's a brief overview of GRPO before we jump into code:
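The core idea is easy to sketch: sample a group of completions per prompt, score each with a deterministic reward, and compute each completion's advantage relative to its own group's mean and standard deviation (no value model, no labeled preference data). A minimal pure-Python sketch, with an exact-match reward as an assumed example:

```python
import statistics

def exact_match_reward(completion: str, answer: str) -> float:
    # Deterministic reward: 1 if the final answer matches, else 0.
    # (Assumed example; any verifiable reward function works.)
    return 1.0 if completion.strip() == answer.strip() else 0.0

def group_relative_advantages(rewards: list[float]) -> list[float]:
    # GRPO's core trick: normalize each reward against its own
    # group's statistics instead of training a separate value model.
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        # All completions scored the same: no learning signal.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]

# Example: 4 completions sampled for one math prompt, 2 are correct.
completions = ["42", "41", "42", "40"]
rewards = [exact_match_reward(c, "42") for c in completions]
advantages = group_relative_advantages(rewards)
```

Correct completions get positive advantages and incorrect ones negative, so the policy update pushes probability toward the answers the reward function verifies.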