Raja Patnaik
Head of AI (Research) @ Asset Manager. Quant PhD. Financial economist. Engineer. Tech and science enthusiast. Angel investor. All opinions are my own.
Oct 27
Has anyone looked at how @DSPyOSS + GEPA could optimize inter-agent communication protocols in multi-agent systems?

Instead of optimizing individual prompts for task performance, you’d optimize the language that agents use to communicate with each other. 1/🧵

2/ Each DSPy signature becomes a communication interface, and GEPA optimizes:
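To make the idea concrete, here's a minimal sketch of a DSPy signature acting as an agent-to-agent interface, assuming a hypothetical planner/executor pair; the class and field names are illustrative, not from the thread:

```python
import dspy

# Hypothetical inter-agent interface: the signature defines the message
# format one agent uses to address another.
class PlannerToExecutor(dspy.Signature):
    """Translate a high-level plan step into a message for the executor agent."""
    plan_step = dspy.InputField(desc="the planner's next step")
    task_context = dspy.InputField(desc="shared task state both agents can see")
    message = dspy.OutputField(desc="the instruction sent to the executor")

# The channel is just a module, so GEPA can mutate the signature's
# instructions against an end-to-end task metric, i.e. it evolves the
# protocol itself rather than a single task prompt.
planner_channel = dspy.Predict(PlannerToExecutor)
```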
Oct 24
I built an AI research agent that writes comprehensive reports with proper citations and optimizes its own prompts automatically - @LangChainAI + @ExaAILabs + @DSPyOSS + GEPA.

Link to blog post and full repo at the end. Here's how it works 🧵1/

2/ Most AI research systems have 3 problems:

- Prompts are static strings (can't be improved)
- Sequential execution (slow)
- Citation chaos (broken links, inconsistent numbering)

This system solves all three.
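A rough sketch of the first fix (prompts as programs): the report writer is a DSPy module, so its prompt is something an optimizer can rewrite. Signature and module names here are assumptions, not the repo's actual code:

```python
import dspy

class WriteSection(dspy.Signature):
    """Write a report section grounded in the numbered sources, citing them as [n]."""
    topic = dspy.InputField()
    sources = dspy.InputField(desc="numbered source snippets, e.g. '[1] ...'")
    section = dspy.OutputField(desc="prose with inline [n] citations")

class ResearchWriter(dspy.Module):
    def __init__(self):
        super().__init__()
        # A module, not a static string: GEPA can rewrite the
        # instructions behind this predictor during optimization.
        self.write = dspy.ChainOfThought(WriteSection)

    def forward(self, topic, sources):
        return self.write(topic=topic, sources=sources)
```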
Oct 21
Let SQL Be the Judge: Evolving an NL→SQL Generator with @DSPyOSS + GEPA (no labels required).

NL→SQL that self‑validates by executing its own output. No labels. Works on older GEPA via a scalar metric wrapper. Repo + blog below. 🧵1/13

2/13
Why: “vibes‑based evals” don’t ship. I want system‑level signals.

SQLite is the judge: if your query is safe, runs, and returns the right rows/shape, you win. GEPA evolves the program toward higher scores.
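A sketch of what that execution-based judge can look like as a scalar metric; it follows DSPy's usual (example, pred, trace) metric convention, and the database file, field names, and scoring weights are assumptions:

```python
import sqlite3

def sql_metric(example, pred, trace=None):
    """Score a generated query by actually running it against SQLite."""
    query = pred.sql.strip()
    # Crude safety gate: only read-only SELECT statements may run.
    if not query.lower().startswith("select"):
        return 0.0
    conn = sqlite3.connect("demo.db")  # hypothetical evaluation database
    try:
        rows = conn.execute(query).fetchall()
    except sqlite3.Error:
        return 0.0  # the query didn't even execute: hard fail
    finally:
        conn.close()
    # Partial credit for executing, full credit for the expected shape.
    score = 0.5
    if getattr(example, "expected_rows", None) == len(rows):
        score += 0.5
    return score
```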
Oct 17
Practical @DSPyOSS example series: Build an LLM pipeline that self‑corrects instead of “RAG and pray.”

Pipeline: Retrieve → Generate → Verify → Refine.

If the verifier flags unsupported claims, we retry with feedback until it passes.

Blog post and GitHub link at the end. 1/13🧵

2/13
Why this matters:
- Hallucinations still slip through plain RAG
- Users deserve verifiable answers
- Programmatic verification ⇒ reliability you can ship
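One way the Retrieve → Generate → Verify → Refine loop could be written as a DSPy module; the signatures, field names, and retry budget are illustrative assumptions (and it presumes a retriever configured via dspy.configure):

```python
import dspy

class Verify(dspy.Signature):
    """Check whether every claim in the answer is supported by the context."""
    context = dspy.InputField()
    answer = dspy.InputField()
    supported = dspy.OutputField(desc="'yes' or 'no'")
    feedback = dspy.OutputField(desc="which claims lack support")

class SelfCorrectingRAG(dspy.Module):
    def __init__(self, max_retries=2):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=5)
        self.generate = dspy.ChainOfThought("context, question, feedback -> answer")
        self.verify = dspy.Predict(Verify)
        self.max_retries = max_retries

    def forward(self, question):
        context = self.retrieve(question).passages
        feedback = "none yet"
        for _ in range(self.max_retries + 1):
            answer = self.generate(context=context, question=question,
                                   feedback=feedback).answer
            check = self.verify(context=context, answer=answer)
            if check.supported.lower().startswith("yes"):
                break  # verifier is satisfied; stop refining
            feedback = check.feedback  # retry with the verifier's objections
        return dspy.Prediction(answer=answer)
```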
Oct 15
First in a series of practical GEPA + @DSPyOSS examples: Verifiable de‑identification (PII‑safe incident reports)

Most “privacy filters” are vibes. Let’s prove we removed PII while keeping the important bits intact. Link to blog post and repo ↓ 1/3🧵

Using dspy.GEPA, we evolve a prompt until:

- No PII leaks (emails, phones, names → placeholders), and
- Structure is preserved (must keep Root cause + Action items sections).

2/3🧵
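A toy version of the metric those two checks imply; the regexes are deliberately crude assumptions (real PII detection needs far broader coverage), and the output field name is hypothetical:

```python
import re

EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def deid_metric(example, pred, trace=None):
    text = pred.redacted_report  # hypothetical output field
    # Check 1: no obvious PII may survive; any leak is a hard fail.
    if EMAIL.search(text) or PHONE.search(text):
        return 0.0
    # Check 2: the report's required sections must be preserved.
    keeps_structure = "Root cause" in text and "Action items" in text
    return 1.0 if keeps_structure else 0.5
```

dspy.GEPA then evolves the de-identification prompt against this score.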
Oct 13
Prompt engineering is brittle. Change your model? Rewrite all your prompts. Add a new feature? Pray that your carefully crafted examples still work.

@DSPyOSS solves all of this: program your models instead of prompting them.

Unsurprisingly, 28k+ GitHub stars: 🧵1/12↓

DSPy separates interface from implementation.

You define WHAT you want (signatures), HOW to structure it (modules), and let optimizers figure out the best prompts automatically.

Think: type hints + composable functions + auto-optimization. 🧵2/12
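A toy example of that separation; the task and model name are illustrative:

```python
import dspy

class Summarize(dspy.Signature):          # WHAT: typed inputs and outputs
    """Summarize the document in two sentences."""
    document = dspy.InputField()
    summary = dspy.OutputField()

summarizer = dspy.ChainOfThought(Summarize)  # HOW: module adds a reasoning step

# Swap the model without rewriting prompts; the program stays the same.
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))
result = summarizer(document="DSPy separates interface from implementation...")
print(result.summary)
```

An optimizer (e.g. dspy.MIPROv2 or dspy.GEPA) then searches for the prompts and demonstrations that maximize your metric.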
Sep 9
Hot take - Evolve prompts, not gradients: GEPA + DSPy > RL (for many pipelines). On 4 tasks, GEPA beat GRPO by ~10% on average (up to 20%) while using up to 35× fewer rollouts. That’s tailor‑made for small budgets.

More details ↓

Why it clicks in DSPy: your “student” is a declarative program. GEPA reads structured traces, proposes targeted instruction edits per module, keeps a Pareto frontier of complementary candidates, and can even merge the best modules across lineages.
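Wiring it up looks roughly like this; the arguments follow dspy.GEPA's documented interface but may differ across versions, and the task, data, and model names are placeholders:

```python
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # illustrative model

# GEPA's metric can return a plain scalar; richer textual feedback
# (when provided) becomes the signal its reflective mutations read.
def metric(gold, pred, trace=None, pred_name=None, pred_trace=None):
    return float(gold.label.lower() in pred.label.lower())

program = dspy.Predict("text -> label")  # the declarative "student"
trainset = [dspy.Example(text="great product!", label="positive")
            .with_inputs("text")]

optimizer = dspy.GEPA(metric=metric, auto="light",
                      reflection_lm=dspy.LM("openai/gpt-4o"))
optimized = optimizer.compile(program, trainset=trainset, valset=trainset)
```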