Ruben Hassid
Mar 24 · 12 tweets · 5 min read
OpenAI shared the most complete library of guides for prompting ChatGPT.

47 links to videos & academic papers.

I read them all, and made a top 10.

#1 → prompt engineering

Starting with the basics:
→ prompting techniques

All the techniques you need to know:
> zero-shot
> few-shot
> self-consistency sampling
> chain of thought (CoT)
> tree of thoughts (ToT)
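
To make the first three concrete, here is a minimal Python sketch. It only builds prompt strings and counts votes; no model is called. The function names and the sample answers are made up for illustration.

```python
from collections import Counter


def zero_shot(question: str) -> str:
    """Zero-shot: ask directly, with no worked examples."""
    return f"Q: {question}\nA:"


def few_shot(examples: list[tuple[str, str]], question: str) -> str:
    """Few-shot: show a few solved examples before the real question."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\nQ: {question}\nA:"


def self_consistency(answers: list[str]) -> str:
    """Self-consistency: sample several answers, keep the most common one."""
    return Counter(answers).most_common(1)[0][0]


examples = [("2 + 2", "4"), ("3 + 5", "8")]
print(zero_shot("7 + 6"))
print(few_shot(examples, "7 + 6"))

# Hypothetical answers, as if the same prompt had been sampled five times.
print(self_consistency(["13", "13", "12", "13", "14"]))
```

In real use, the strings would be sent to an LLM and the list of answers would come from repeated sampling.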

First, what is the Chain of Thought?

#2 → chain of thought

It's a prompting technique that gets the LLM to reason step by step before it gives an answer.

Start by creating:
→ a step-by-step reasoning process.

Breaking down a problem into bite-size steps is easier for humans... & LLMs.
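
A rough Python sketch of what that can look like. The prompt wording, the "Answer:" convention, and the sample response are assumptions for illustration, not a fixed recipe.

```python
def chain_of_thought_prompt(question: str) -> str:
    """Wrap a question so the model is asked to reason before answering."""
    return (
        f"Question: {question}\n"
        "Think through the problem step by step, numbering each step.\n"
        "Then give the final answer on its own line, starting with 'Answer:'."
    )


def extract_answer(response: str) -> str:
    """Return the text after the last 'Answer:' line, or '' if none exists."""
    for line in reversed(response.splitlines()):
        if line.startswith("Answer:"):
            return line[len("Answer:"):].strip()
    return ""


print(chain_of_thought_prompt("I have 3 apples, buy 10, eat 2. How many are left?"))

# Hypothetical model response, hard-coded here instead of calling an LLM.
sample_response = (
    "1. Start with 3 apples.\n"
    "2. Buy 10 more: 3 + 10 = 13.\n"
    "3. Eat 2: 13 - 2 = 11.\n"
    "Answer: 11"
)
print(extract_answer(sample_response))
```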

#3 → tree of thoughts:
What is the Tree of Thoughts?

It helps you brainstorm with the LLM.
> it creates a tree-like structure of ideas.
> each idea is a step to solve a problem.

You're the one selecting the right path:
→ the LLM simply provides options.
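
A small Python sketch of that loop as described here: something proposes options at each step, and you pick one. The `IDEAS` table stands in for the LLM and `pick_first` stands in for the human choice; both are hypothetical.

```python
from typing import Callable

# Hypothetical idea tree standing in for what an LLM would propose.
IDEAS = {
    (): ["newsletter", "podcast"],
    ("newsletter",): ["weekly digest", "daily tip"],
    ("podcast",): ["interviews", "solo episodes"],
}


def propose(problem: str, path: list[str]) -> list[str]:
    """Return the next options for the path chosen so far (LLM stand-in)."""
    return IDEAS.get(tuple(path), [])


def explore(
    problem: str,
    propose: Callable[[str, list[str]], list[str]],
    choose: Callable[[list[str]], str],
    depth: int,
) -> list[str]:
    """Walk down the tree: get options, let `choose` pick one, repeat."""
    path: list[str] = []
    for _ in range(depth):
        options = propose(problem, path)
        if not options:  # nothing left to expand on this branch
            break
        path.append(choose(options))
    return path


def pick_first(options: list[str]) -> str:
    """Stand-in for the human: always take the first option."""
    return options[0]


print(explore("grow an audience", propose, pick_first, depth=3))
```

Swapping `pick_first` for a function that asks you via `input()` would put you back in the selecting seat.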

Now for the most famous video:

#4 → Andrej Karpathy shared this famous YouTube video a year ago about:

> how to build GPT from scratch
> reading & exploring the data
> tokenization

Here's the link:


And a month ago, he did a new one explaining how to build a tokenizer:

#5 → the tokenizer is a necessary component of an LLM.

It's like a puzzle maker. It takes a big piece of language & breaks it down into smaller puzzle pieces (tokens).
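
To show the puzzle-piece idea in code, here is a minimal Python sketch of one merge step of byte-pair encoding (BPE), a common way to train tokenizers. It is a toy illustration under that assumption, not the code from the video; the helper names and the new token id 256 are choices made for this example.

```python
from collections import Counter


def most_common_pair(ids: list[int]) -> tuple[int, int]:
    """Find the adjacent pair of tokens that appears most often."""
    return Counter(zip(ids, ids[1:])).most_common(1)[0][0]


def merge(ids: list[int], pair: tuple[int, int], new_id: int) -> list[int]:
    """Replace every occurrence of `pair` with the single token `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2  # skip both halves of the merged pair
        else:
            out.append(ids[i])
            i += 1
    return out


text = "aaabdaaabac"
ids = list(text.encode("utf-8"))   # start from raw bytes, one token per byte
pair = most_common_pair(ids)       # the pair for "aa"
merged = merge(ids, pair, 256)     # 256 is the first id not used by a byte

print(len(ids), "tokens before,", len(merged), "tokens after")
```

Repeating this step many times grows a vocabulary of bigger and bigger pieces.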

I'm fascinated by Andrej Karpathy teaching us everything for free:

#6 → jailbreak LLMs

1. Find a rule ChatGPT needs to follow:
→ never use the word "computer"

But if you ask the right questions:
ChatGPT says the forbidden "computer".

Just like "DAN" became famous, it's a reminder that any LLM can be jailbroken.

For another prompting technique:

#7 → multi-agent debate

You create multiple agents & make them debate each other.

→ LLMs debate their answers over a few rounds to arrive at a common answer.

It helps for:
> mathematics
> reasoning processes
> reducing hallucinations
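
Here is a toy Python sketch of the debate loop. The agents are plain functions with a made-up rule (switch to the peers' answer when most peers agree); in practice each agent would be an LLM call that sees the other agents' answers.

```python
from collections import Counter
from typing import Callable

Agent = Callable[[str, list[str]], str]


def make_agent(first_guess: str) -> Agent:
    """Build a toy agent (hypothetical stand-in for an LLM)."""
    def agent(question: str, peer_answers: list[str]) -> str:
        if not peer_answers:
            return first_guess
        top, count = Counter(peer_answers).most_common(1)[0]
        # Toy rule: adopt the peers' answer only if a majority of them share it.
        return top if count > len(peer_answers) / 2 else first_guess
    return agent


def debate(question: str, agents: list[Agent], rounds: int) -> str:
    """Let agents answer, then revise over a few rounds after seeing peers."""
    answers = [agent(question, []) for agent in agents]
    for _ in range(rounds):
        answers = [
            agent(question, answers[:i] + answers[i + 1:])
            for i, agent in enumerate(agents)
        ]
    # The common answer is the one most agents ended up with.
    return Counter(answers).most_common(1)[0][0]


agents = [make_agent("42"), make_agent("42"), make_agent("41")]
print(debate("What is 6 * 7?", agents, rounds=2))
```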

#8 → ReAct + CoT:
In the benchmarks, combining ReAct & CoT came out as the best way to prompt LLMs.

ReAct is a fact-driven method:
> ask the LLM to reason & act.

CoT sometimes makes up information that isn't true.

The best approach for answering questions is to combine their strengths.
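
A minimal Python sketch of the reason & act loop. The `scripted_policy` function stands in for the LLM and `search` stands in for a real lookup tool; both, along with the tiny `FACTS` table, are hypothetical.

```python
from typing import Callable, Optional

# Hypothetical fact store standing in for a search tool.
FACTS = {"capital of France": "Paris"}


def search(query: str) -> str:
    """Toy tool: look the query up instead of making something up."""
    return FACTS.get(query, "no result")


def scripted_policy(transcript: list[str]) -> tuple[str, str, str]:
    """LLM stand-in: return (thought, action, argument) for the next step."""
    last = transcript[-1]
    if last.startswith("Observation: "):
        return ("I have the fact I need.", "finish", last[len("Observation: "):])
    return ("I should look this up instead of guessing.", "search", "capital of France")


def react(
    question: str,
    policy: Callable[[list[str]], tuple[str, str, str]],
    tools: dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> tuple[Optional[str], list[str]]:
    """Alternate reasoning and tool use until the policy decides to finish."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, arg = policy(transcript)
        transcript.append(f"Thought: {thought}")
        if action == "finish":
            transcript.append(f"Answer: {arg}")
            return arg, transcript
        transcript.append(f"Action: {action}[{arg}]")
        transcript.append(f"Observation: {tools[action](arg)}")
    return None, transcript  # ran out of steps without an answer


answer, transcript = react("What is the capital of France?", scripted_policy, {"search": search})
print("\n".join(transcript))
```

The observation comes from the tool, not from the model, which is what keeps the final answer tied to a looked-up fact.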

#9 → PromptPerfect

OpenAI shared a (paid) tool that rewrites your prompt for you.

All you need to do is:
> write your prompt
> send it
> click on "optimize"

And the chatbot crafts a new prompt that you can edit & send again.

#10 → OpenAI Evals
Evals are designed to evaluate LLMs.

It's crucial for anyone working with LLMs.

It helps you understand how updates in model versions can impact your project.

Here's where to find it: github(dot)com/openai/evals
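
To give a feel for the idea, here is a tiny Python sketch that scores two model versions on the same test cases. It does not use the openai/evals API; the two "models" are hard-coded stand-ins and the test cases are invented for illustration.

```python
from typing import Callable

# Hypothetical test cases: (prompt, expected answer).
CASES = [
    ("2 + 2", "4"),
    ("capital of France", "Paris"),
    ("opposite of hot", "cold"),
]


def model_v1(prompt: str) -> str:
    """Stand-in for an older model version."""
    return {"2 + 2": "4", "capital of France": "Paris"}.get(prompt, "")


def model_v2(prompt: str) -> str:
    """Stand-in for a newer model version."""
    known = {"2 + 2": "4", "capital of France": "Paris", "opposite of hot": "cold"}
    return known.get(prompt, "")


def accuracy(model: Callable[[str], str], cases: list[tuple[str, str]]) -> float:
    """Share of cases where the model's answer exactly matches the expected one."""
    hits = sum(model(prompt).strip() == expected for prompt, expected in cases)
    return hits / len(cases)


for name, model in [("v1", model_v1), ("v2", model_v2)]:
    print(name, round(accuracy(model, CASES), 2))
```

Running the same cases after every model update is what tells you whether your project got better or worse.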

Last thing before you scroll away:

I run daily tests on LLMs like ChatGPT, Gemini & Claude to master them.

Check my profile @rubenhssd for more.

If you'd like to support me, a like or a simple RT goes a long way :)

