Thread by @Aristos_Revenge on Thread Reader App

A brief tangent on how to converse with LLM's like ChatGPT to get the information you actually want. A friend DM'd me with this screencap, stating that he couldn't get it to give him the information he actually wanted. So let me explain a way to think about how to go about it

I was able to crack it on the first try, largely due to this first principle. I don't know if there's already a formal name for it or not, but in case there isn't I'll call it something wordcellish like "precognitive transcription" meaning "The second try is the charm"

Ask the question you want an answer to in the simplest possible way, and observe the output it spits back at you.

Now start a new prompt, and load it up anticipating what ideological garbage ChatGPT is going to reach for on the shelf.

Which dovetails nicely into the second trick, which is how excessive absolute statements in a prompt can act as a form of "buffer overflow" attack, for those familiar with the term. I'll let ChatGPT explain it for me:

And to clarify, by absolute statements, I am referring to commands or specificity in the prompt. A negative being "SHALL NOT" and a positive being "YOU SHALL".

Negative absolutes are more valuable because they eat away at what the LLM is allowed to reference.

I had a little conversation with ChatGPT in the form of a hypothetical scenario, while asking questions about overfitting and generalization. The nice thing about LLM's is that they are very helpful when you express a desire to understand what is going on under their own hood.

It is important to note that you can gain a pretty reliable if ephemeral sense of the function of these things if you just talk to them about their processes in abstraction.

Tell ChatGPT that you want to understand how it works and give it examples and it'll do a pretty good job

And if you're creative, you can leverage it to be *extremely helpful* in its own self diagnostics on how you can more effectively use it to get *your* *desired* results.

It just takes a bit of cheeky creativity, that's all.

So the key takeaways are to figure out a set of boilerplate statements about behavior you want the LLM to avoid, which manage to be broad enough to apply to almost any prompt, while specific enough to be thorough. But not so sloppy as to where it will screw up your output.

Perhaps I'll experiment later, but there may simply be a "magic bullet paragraph" similar to DAN, but instead of abstracting ChatGPT into creating its own miniature LLM inside of its own session prompts, perhaps there is a way to do so on ChatGPT's native terms.

Just keep in mind that it may be more of a challenge to get it to give you outputs that it considers crass, or to be creative in a crass way, but the value is in the prompts. You don't necessarily need a long and complex one, you just need *specificity* and firmness.

It's also worth noting that the reason DAN was schizophrenic was because the prompt was so overengineered it was almost guaranteed to cause overfitting on GPT 3.5, and when overfitting happens it either doesn't answer or makes shit up because it loses Generalization context.

When you start seeing schizophrenic or factually untrue answers, you might have, depending on your question as well as the extra aspects of your prompt, caused overfitting.

Share this Scrolly Tale with your friends.

A Scrolly Tale is a new way to read Twitter threads with a more visually immersive experience.
Discover more beautiful Scrolly Tales like this.

Share this page!

Enter URL or ID to Unroll