Charly Wargnier
Feb 11 · 11 tweets
OpenAI is getting Deepseek’d again.

@Convergence_ai_, a tiny London startup just built one of the most capable AI agents for the web.

Proxy is outperforming Operator on every benchmark handpicked by @OpenAI.



Let’s dive in! 🧵↓ proxy.convergence.ai
Let’s start with this: proof that funding isn’t everything.

@OpenAI raised $18 billion.

@Convergence_ai_? Just $12 million! That’s 0.067% of OpenAI’s budget.



And yet, Proxy is faster, smarter, fully autonomous.

Keep scrolling for comparison videos! ↓ tech.eu/2024/09/27/con…
Task 1: Find chicken recipes.

❌ Operator gives up after 4 minutes.
✔️ Proxy completes it twice before Operator finishes.
✔️ Not only that, Proxy also delivers complete results.

3/
Task 2: Get the latest basketball news.

❌ Operator gets stuck on a CAPTCHA and needs human help.
✔️ Proxy bypasses it instantly with no manual input.
❌ Operator provides just a link with minimal info.
✔️ Proxy delivers full details in under a minute.

4/
Task 3: Find top-rated @Tripadvisor stays.

❌ Operator takes over 2 minutes and delivers less information.
✔️ Proxy finds 5 top-rated stays in 56 seconds with prices, ratings, and reviews.

5/
Task 4: Get the latest US economy news.

❌ Operator is slow and less informative.
✔️ Proxy provides a detailed, high-quality summary faster.

6/
What’s more, Proxy is ranked #1 globally on the WebVoyager benchmark, which assesses agentic capabilities across 600 web-based tasks.

7/
… and when it comes to cost, it’s not even a question.

→ Operator costs $200/month.
→ Proxy is free, with a $20/month Pro option.

Check it out:

8/ convergence.ai/#:~:text=Our%2…
But wait… there’s more.

Proxy does stuff that Operator simply can’t.

→ Schedule automations to run on repeat: your own AI agent on autopilot.
→ Instantly share automations on X/Twitter so anyone can run them... with one click! 🤯

9/
That’s a wrap!

Proxy beats Operator where it matters most: speed, autonomy, features, and cost.

Try it for free or go Pro for just $20/month.



10/ proxy.convergence.ai
If this was useful, a quick RT would go a long way in giving this London startup a voice and some more oomph against the industry giants! 💪

And if you’re into AI agents and LLMs, don’t forget to follow me @DataChaz for more insights! :)

• • •

More from @DataChaz

Nov 21
Wild.

Postman's AI Agent Builder lets you turn any API (from over 100,000!) into an MCP server in seconds, no code required 🤯

Your custom MCP server, ready to use in Cursor, Windsurf, Claude Desktop, Docker, plus a lot more! 🧵↓
1/

First, start here →

You’ve got literally 100,000+ APIs to check out.

1. mix and match any endpoints you want
2. download your custom zip file
3. that’s it! postman.com/explore/mcp-ge…
Mind = blown.

That zip file has EVERYTHING:
↳ a readme with setup instructions
↳ your selected endpoints
↳ all the files to run your MCP server locally, on Cursor, Windsurf… even Docker!

You also get an .env file with your prefilled variables → just add your API keys! 🔥
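For orientation, hooking an unzipped server into a client like Cursor typically comes down to one entry in its MCP config. A minimal sketch, assuming the generated server is Node-based — the server name, script path, and variable name here are illustrative, so treat the README in your own zip as the authoritative steps:

```json
{
  "mcpServers": {
    "my-postman-apis": {
      "command": "node",
      "args": ["./mcpServer.js"],
      "env": {
        "API_KEY": "<your API key>"
      }
    }
  }
}
```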
Nov 17
MIT and Oxford just released their $2,500 agentic AI curriculum on GitHub at no cost.

15,000 people already paid for it.

Now it's on GitHub!

It covers patterns, orchestration, memory, coordination, and deployment.
A strong roadmap to production ready systems.

Repo in 🧵 ↓
10 chapters:

Part 1. What agents are and how they differ from plain generative AI.
Part 2. The four agent types and when to use each.
Part 3. How tools work and how to build them.
Part 4. RAG vs agentic RAG and key patterns.
Part 5. What MCP is and why it matters.
Part 6. How agents plan with reasoning models.
Part 7. Memory systems and architecture choices.
Part 8. Multi-agent coordination and scaling.
Part 9. Real-world production case studies.
Part 10. Industry trends and what is coming next.
Here's the repo:
github.com/aishwaryanr/aw…
Nov 13
If you’re still sending raw JSON into your LLMs, you’re burning tokens, latency, and budget!

Try TOON (Token-Oriented Object Notation).

Clear like YAML, compact like CSV:

• 30–60% fewer tokens
• Up to 50% lower costs
• Shines for tabular data.

Free and open source 🧵↓
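To make the shape concrete, here’s a minimal Python sketch comparing plain JSON with a TOON-style encoding of the same rows. The encoder below is a hand-rolled approximation of the format for illustration only (use the actual TOON library in practice), and character counts are just a rough stand-in for tokens:

```python
import json

rows = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "viewer"},
    {"id": 3, "name": "Cara", "role": "editor"},
]

# Plain JSON repeats every key on every row
as_json = json.dumps(rows)

# TOON-style: declare the row count and field names once,
# then emit bare comma-separated values per row
header = f"rows[{len(rows)}]{{{','.join(rows[0])}}}:"
body = "\n".join("  " + ",".join(str(v) for v in r.values()) for r in rows)
as_toon = header + "\n" + body

print(as_toon)
print(f"JSON: {len(as_json)} chars, TOON-style: {len(as_toon)} chars")
```

The saving grows with row count: JSON pays the key overhead on every row, while the TOON header is paid once.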
💡 Benchmark tip.

Check out @curiouslychase’s ace Format Tokenization Playground:


It lets you compare token counts for various formats:
- CSV
- JSON
- YAML
- and TOON

... all with your own sample data 🔥 curiouslychase.com/playground/for…
Nov 13
You underestimate the power of good prompts.

Here are 5 frameworks to copy and paste 🧵 ↓
1 ‑ R‑A‑I‑N

- Act as a (ROLE)
- State the (AIM)
- Use the provided (INPUT)
- Hit the (NUMERIC TARGET)
- In this (FORMAT)

Example:
- ROLE: Senior product designer
- AIM: Redesign our fitness‑app onboarding to cut time‑to‑first‑workout by 30%
- INPUT: Attached funnel metrics
- NUMERIC TARGET: 30% improvement
- FORMAT: Mobile UI wireframe + KPI table
2 ‑ C‑L‑A‑R

- Given the (CONTEXT)
- List any (LIMITS)
- Describe the (ACTION)
- Define the expected (RESULT)

Example:
- CONTEXT: Raw Q1 sales data
- LIMITS: Focus on the 3 biggest churn drivers
- ACTION: Quantify impact & propose 2 fixes
- RESULT: Two‑slide executive brief
Nov 5
Cut your LLM costs by 50% 🤯

Stop using JSON → switch to TOON (Token-Oriented Object Notation).

It blends YAML’s readability with CSV’s compactness:
↳ 30–60% fewer tokens
↳ Built-in field validation
↳ Works with GPT-5, Claude, and Gemini.

Ace for tabular data.

Free and open-source 🧵↓
Oct 31
this guy literally put in 1000 hours of prompt engineering to nail down the 6 patterns that actually matter.
He calls it KERNEL, and it's transformed how his entire team uses AI.

Here's the framework:

----

K - Keep it simple

Bad: 500 words of context

Good: One clear goal

Example: Instead of "I need help writing something about Redis," use "Write a technical tutorial on Redis caching"

Result: 70% less token usage, 3x faster responses

----

E - Easy to verify

Your prompt needs clear success criteria

Replace "make it engaging" with "include 3 code examples"

If you can't verify success, AI can't deliver it

My testing: 85% success rate with clear criteria vs 41% without

----

R - Reproducible results

Avoid temporal references ("current trends", "latest best practices")

Use specific versions and exact requirements

Same prompt should work next week, next month

94% consistency across 30 days in my tests

----

N - Narrow scope

One prompt = one goal

Don't combine code + docs + tests in one request

Split complex tasks

Single-goal prompts: 89% satisfaction vs 41% for multi-goal

----

E - Explicit constraints

Tell AI what NOT to do

"Python code" → "Python code. No external libraries. No functions over 20 lines."

Constraints reduce unwanted outputs by 91%

----

L - Logical structure

Format every prompt like:

Context (input)

Task (function)

Constraints (parameters)

Format (output)

----

Real example from my work last week:

Before KERNEL: "Help me write a script to process some data files and make them more efficient"

Result: 200 lines of generic, unusable code

After KERNEL:

Task: Python script to merge CSVs
Input: Multiple CSVs, same columns
Constraints: Pandas only, <50 lines
Output: Single merged.csv
Verify: Run on test_data/

Result: 37 lines, worked on first try

----

Actual metrics from applying KERNEL to 1000 prompts:

First-try success: 72% → 94%
Time to useful result: -67%
Token usage: -58%

Accuracy improvement: +340%

Revisions needed: 3.2 → 0.4

----

Advanced tip from this user:

Chain multiple KERNEL prompts instead of writing complex ones.

Each prompt does one thing well, feeds into the next.
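The chaining idea can be sketched in a few lines of Python. `llm` here is just a stub standing in for whatever completion client you actually call — the point is the wiring: each prompt has one narrow goal, and prompt 2 consumes prompt 1’s output:

```python
def llm(prompt: str) -> str:
    # Stub: echoes the task line so the chain's wiring is visible.
    # Swap in a real completion call in practice.
    return f"[model output for: {prompt.splitlines()[0]}]"

def run_chain(csv_dir: str) -> str:
    # Prompt 1: one narrow goal, explicit constraints, verifiable output
    script = llm(
        "Task: Python script to merge CSVs\n"
        f"Input: files in {csv_dir}, same columns\n"
        "Constraints: pandas only, <50 lines\n"
        "Output: code only, writes merged.csv"
    )
    # Prompt 2: consumes prompt 1's output, again a single goal
    return llm(
        "Task: pytest tests for the script below\n"
        "Constraints: stdlib + pytest only\n"
        f"Input:\n{script}"
    )

print(run_chain("test_data/"))
```

Because each link is a single-goal KERNEL prompt, a bad intermediate output is easy to spot and retry without redoing the whole task.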
