Prashanth Rao
Oct 13 · 9 tweets · 4 min read
Just published my next @DSPyOSS blog post! I explored the performance of two optimizers: Bootstrap fewshot and GEPA 🔥 on an information extraction task. A small model like gemini-2.5-flash-lite does *really* well after optimization (esp. with GEPA)



1/9 thedataquarry.com/blog/learning-…
It's clear that GEPA has huge potential in improving prompts (and maybe even full programs) for nearly every kind of task. Bootstrap fewshot optimizers are a decent start to improve results, but you can only go so far with adding fewshot examples. GEPA goes *much* deeper.

2/9
First, optimization in @DSPyOSS is more akin to the process of *compilation* in programming languages. DSPy's goal is to translate high-level instructions (signatures + modules) to a lower level that the LM can work with (i.e., weights).

Optimization = compilation

3/9
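The "compilation" analogy can be made concrete with a toy sketch: a high-level signature (an instruction plus named input/output fields) gets lowered into the literal prompt text the LM sees. This is my own simplification of the idea, not DSPy's actual internals or API.

```python
# Toy illustration of "optimization = compilation": lower a declarative
# signature into concrete prompt text. NOT DSPy's real implementation.

def compile_signature(instruction: str, inputs: list[str], outputs: list[str]) -> str:
    lines = [instruction, ""]
    lines += [f"{name}: ${{{name}}}" for name in inputs]   # placeholders filled at call time
    lines += [f"{name}:" for name in outputs]              # fields the LM must complete
    return "\n".join(lines)

prompt = compile_signature(
    "Extract all named entities from the text.",
    inputs=["text"],
    outputs=["entities"],
)
```

An optimizer then rewrites pieces of this lowered form (the instruction, the examples) the same way a compiler's optimization passes rewrite intermediate code.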
In choosing an optimizer, it's useful to understand the "surface area" of what's being optimized in the prompt. Automatic fewshot optimizers target *only* the user/assistant demo messages, and do NOT modify the instructions (unlike GEPA).

4/9
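That "surface area" difference can be sketched in a few lines: a fewshot optimizer only inserts bootstrapped demos as extra user/assistant message pairs, leaving the instruction untouched. The message shapes below are illustrative, not DSPy's internal representation.

```python
# Sketch of what an automatic fewshot optimizer changes in the prompt:
# bootstrapped demos become user/assistant pairs, while the instruction
# (system message) is left exactly as written.

def add_fewshot_demos(messages: list[dict], demos: list[dict]) -> list[dict]:
    system, final_user = messages[0], messages[-1]
    demo_msgs = []
    for d in demos:
        demo_msgs.append({"role": "user", "content": d["input"]})
        demo_msgs.append({"role": "assistant", "content": d["output"]})
    # Only examples are added; the instruction itself is never rewritten.
    return [system, *demo_msgs, final_user]
```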
GEPA is designed to operate on *any* textual traces, starting with the user instructions in the prompt. The example here shows how the baseline instructions (terse, little detail) are improved: GEPA discovers good ways of phrasing the details (extractor module).

5/9
The GEPA paper is worth reading many times - in a nutshell, GEPA works so well because it exploits the biggest strengths of LMs - their ability to generate & reason over natural language (instructions, error msgs, reasoning traces, etc.) without relying on gradients

6/9
The key part of creating a GEPA optimization workflow in @DSPyOSS is in incorporating feedback (alongside the score) as part of the metric function. The feedback is a natural language trace that's used by the reflection LM in GEPA to propose new instructions.

7/9
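A minimal sketch of such a metric for entity extraction (the task, entity names and scoring choices are my own illustration, not the blog post's exact code):

```python
# Sketch of a GEPA-style metric: score the prediction AND explain, in
# natural language, what went wrong, so the reflection LM has concrete
# feedback to reason over when proposing new instructions.

def extraction_metric(gold_entities, pred_entities):
    gold, pred = set(gold_entities), set(pred_entities)
    missed, spurious = gold - pred, pred - gold
    score = len(gold & pred) / max(len(gold | pred), 1)  # Jaccard overlap
    notes = []
    if missed:
        notes.append(f"Missed entities: {sorted(missed)}.")
    if spurious:
        notes.append(f"Spurious entities: {sorted(spurious)}.")
    if not notes:
        notes.append("All entities extracted correctly.")
    return score, " ".join(notes)
```

In an actual DSPy GEPA run, the score and feedback are returned together from the metric (recent DSPy versions accept a `dspy.Prediction(score=..., feedback=...)` return value) so the reflection LM can read the feedback text.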
Key lesson learned: It's always worth spending time on curating *high-quality* training, validation and test examples for GEPA (or any other optimizer). The *distribution* of examples matters, as do the train/val/test splits. Similar to working in traditional ML.

8/9
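For the splitting step, even a tiny helper keeps things reproducible. The 50/25/25 proportions and fixed seed below are arbitrary choices of mine, not recommendations from the post.

```python
import random

# Minimal sketch of a reproducible train/val/test split over curated
# examples. Proportions and seed are illustrative, not prescriptive.

def split_examples(examples, train_frac=0.5, val_frac=0.25, seed=42):
    ex = list(examples)
    random.Random(seed).shuffle(ex)          # deterministic shuffle
    n_train = int(len(ex) * train_frac)
    n_val = int(len(ex) * val_frac)
    return ex[:n_train], ex[n_train:n_train + n_val], ex[n_train + n_val:]
```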
I had a LOT of fun describing the details in the blog post. I know I'm only scratching the surface of what's possible with optimization in @DSPyOSS, and the future looks bright! Looking forward to writing about it in many other use cases. 😁

9/9

thedataquarry.com/blog/learning-…
