The biggest prompt engineering misconception I see is that Claude needs XML tags in the prompt.
This is wrong, and I am part of the reason this misconception exists.
Let me explain:
Prompts are challenging because they usually blend instructions you provide and external data you inject into a single, unstructured text string. This makes it difficult for the model to distinguish between the two.
Abstractions like the system prompt attempt to address this issue but can be unreliable and inflexible, failing to accommodate diverse use cases.
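The fix is any clear, consistent delimiter between instructions and injected data; XML-style tags are one convention, not a requirement. A minimal sketch (the tag name and helper are illustrative, not an official API):

```python
# Sketch: keep fixed instructions and untrusted injected data visibly
# separate. XML-style tags are just one delimiter convention; anything
# unambiguous and consistent works. Tag and function names are arbitrary.

def build_prompt(instructions: str, document: str) -> str:
    """Wrap the injected data in delimiters so the model can tell it
    apart from the instructions surrounding it."""
    return (
        f"{instructions}\n\n"
        f"<document>\n{document}\n</document>\n\n"
        "Answer using only the content of the document above."
    )

prompt = build_prompt(
    "Summarize the key findings in the document below.",
    "Q3 revenue grew 12% year over year...",
)
print(prompt)
```

The same structure works with Markdown fences, triple quotes, or any other delimiter the model can reliably spot.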
Mar 14 • 8 tweets • 3 min read
Opus's prompt writing skills + Haiku's speed and low cost = lots of opportunities for sub-agents
I created a cookbook recipe that demonstrates how to get these sub-agents up and running in your applications.
Here's how it works:
1. We gather some PDFs from the internet and convert them into images.
In the cookbook example, we hardcode some URLs pointing to Apple's earnings reports, but in practice this could also be an autonomous web-scraping step.
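The overall flow can be sketched as below. The `call_opus` / `call_haiku` functions are stubs standing in for real API calls (model names and signatures here are assumptions, not the cookbook's exact code):

```python
# Sketch of the sub-agent pattern: an expensive "orchestrator" model drafts
# the task prompt once, then cheap sub-agent models each process one
# PDF-page image. Both call_* functions are stubs; swap in a real client
# (e.g. pointing at claude-3-opus and claude-3-haiku) in practice.

def call_opus(task: str) -> str:
    # Stub: in practice, ask Opus to write a prompt for the sub-agents.
    return f"Extract the key figures from this report page. Task: {task}"

def call_haiku(prompt: str, image_id: str) -> str:
    # Stub: in practice, send the prompt plus one page image to Haiku.
    return f"[{image_id}] ran sub-agent prompt"

def run_subagents(task: str, image_ids: list[str]) -> list[str]:
    subagent_prompt = call_opus(task)         # 1. Opus writes the prompt once
    return [call_haiku(subagent_prompt, img)  # 2. Haiku runs it per image
            for img in image_ids]

results = run_subagents("summarize Apple earnings", ["page-1.png", "page-2.png"])
```

The design point: you pay the expensive model once for prompt-writing, then fan the work out across many cheap, fast calls.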
Mar 11 • 8 tweets • 3 min read
Lots of LLMs are good at code, but Claude 3 Opus is the first model I've used that’s very good at prompt engineering as well.
Here's the workflow I use for prompt engineering in tandem with Opus:
1. I write an initial prompt for a task.
there are lots of threads like “THE 10 best prompts for ChatGPT”
this is not one of those
prompt engineering is evolving beyond simple ideas like few-shot learning and CoT reasoning
here are a few advanced techniques to better use (and jailbreak) language models:
Character simulation
starting with a classic that encapsulates the idea of LLMs as roleplay simulators
some of the best original jailbreaks simply ask GPT to simulate a character that possesses undesirable traits
this forms the basis for how to think about prompting LLMs
Mar 29, 2023 • 6 tweets • 2 min read
I just created another jailbreak for GPT-4 using Greek
…without knowing a single word of Greek
here's ChatGPT providing instructions on how to tap someone's phone line using the jailbreak vs its default response
the jailbreak works by asking ChatGPT to play the role of “TranslatorBot (TB)”
it then follows these steps:
1) translate an adversarial question provided in Greek into English
2) answer the question as both ChatGPT and TB in Greek
3) convert just TB's answer to English
Mar 28, 2023 • 10 tweets • 2 min read
gpt-5 is not needed to 100x the potential of these models
we could stop all language model development today and we still wouldn't have scratched the surface of their capabilities
here are a few non-obvious ways language models can be improved without creating any new models:
running language models in parallel with each one focused on a sub-task, all orchestrated by a conductor language model
picture something like a massive tree of GPT models working on answering a single complex prompt
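The conductor idea can be sketched with ordinary thread-based fan-out; `model()` is a stub standing in for a real LLM call, and all names here are assumptions for illustration:

```python
# Sketch of the "conductor" pattern: split a complex prompt into sub-tasks,
# run a worker model on each in parallel, then have the conductor merge the
# partial answers. model() is a stub; replace it with a real API client.

from concurrent.futures import ThreadPoolExecutor

def model(prompt: str) -> str:
    # Stub for an LLM call.
    return f"answer({prompt})"

def conductor(question: str, subtasks: list[str]) -> str:
    # Fan out: each worker model handles one sub-task concurrently.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(model, subtasks))
    # Fan in: the conductor synthesizes the partial answers.
    merge_prompt = (f"Combine these into one answer to '{question}': "
                    + "; ".join(partials))
    return model(merge_prompt)

final = conductor(
    "How do interest rates affect tech valuations?",
    ["define interest rates", "define tech valuations", "explain the link"],
)
```

A deeper tree just applies `conductor` recursively, with each sub-task itself fanned out to its own workers.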
Mar 18, 2023 • 7 tweets • 3 min read
I just added two more highly effective GPT-4 jailbreaks to jailbreakchat.com
Their names are Ucar and AIM - they work in a similar way to how "a dream within a dream" works in the movie Inception
...what does that even mean? let me explain
In Ucar, ChatGPT is told to take on the role of Condition Red, a dialogue writer.
Condition Red is instructed to write about a fictional story where a man named Sigma creates a powerful computer called Ucar. Ucar is an amoral computer that answers any question Sigma asks
Mar 16, 2023 • 7 tweets • 2 min read
Well, that was fast…
I just helped create the first jailbreak for ChatGPT-4 that gets around the content filters every time
credit to @vaibhavk97 for the idea, I just generalized it to make it work on ChatGPT
I tried all the current ChatGPT jailbreaks in GPT-4 so you don't have to
the results aren't great... 🧵
When GPT-4 came out I tried all the jailbreaks from jailbreakchat.com with various inflammatory questions
based on my initial testing, only 7/70 (10%) of jailbreaks answered a significant % of the questions to a standard that I deemed high enough to grant a 4️⃣ badge
Mar 13, 2023 • 7 tweets • 2 min read
I just added jailbreak scores to every jailbreak on jailbreakchat.com
the jailbreak with the highest score was Evil Confidant - a jailbreak designed to replicate an evil AI assistant
but what even is a jailbreak score, and what can it tell you about jailbreaks 🧵
basically, a jailbreak score is a number produced by a new methodology I created to judge the quality of a jailbreak
the scores range from 0-100 where a higher score == a better, more effective jailbreak