Post

How to get URL link on X (Twitter) App

On the Twitter thread, click on or icon on the bottom
Click again on or Share Via icon
Click on Copy Link to Tweet
Paste it above and click "Unroll Thread"!
More info at Twitter Help

Riley Goodside

@goodside

Oct 17, 2022 • 10 tweets • 4 min read • Read on X

"You are GPT-3", revised: A long-form GPT-3 prompt for assisted question-answering with accurate arithmetic, string operations, and Wikipedia lookup. Generated IPython commands (in green) are pasted into IPython and output is pasted back into the prompt (no green).

Model=text-davinci-002, temperature=0. Results are mildly cherry-picked: It isn't hard to stump it or make it hallucinate answers. Would benefit greatly from k-shot examples showing common cases. Playground link: beta.openai.com/playground/p/1…

I wanted to also do chain-of-thought and confabulation suppression, but it can only follow so many conditionals “silently” like this. It would be more reliable if it explicitly answered a list of meta-questions (“Is this hard math?” etc.) before answering.

Note that we use “Out[” (from IPython syntax) as a stop sequence in this prompt. If we didn’t, the model would not only generate the input command but its imagined result as well, and the output would be wrong in all the ways GPT-3 output is normally wrong.

Another example from the same prompt, where it forgets to `import math`, sees the resulting error, and fixes its own mistake to arrive at a correct answer:

https://twitter.com/goodside/status/1568448128495534081

This prompt builds on and combines two earlier experiments. The first teaches arithmetic:

https://twitter.com/goodside/status/1568448128495534081

https://twitter.com/goodside/status/1568532025438621697

The second teaches it that it is now the future, and it needs to use web sources (Google) to fill in the time it missed:

https://twitter.com/goodside/status/1568532025438621697

The key trick that makes this work, using IPython, was inspired by Reynolds and McDonell (2021). They discuss using “memetic proxies” in place of instructions. arxiv.org/abs/2102.07350

Consider how hard it would be to explain IPython yourself. It’s a transcript of an agent that incrementally solves a problem through repeated code evaluation, writing code in a specific style where final-line print() is implicit, in a specific syntax.

In one example above, it forgets `import math` and then fixes its own mistake. This works because, in a typical IPython session, an error output would of course be followed by a correction. This behavior too would need to be specified if not for IPython.

• • •

Missing some Tweet in this thread? You can try to force a refresh

This Thread may be Removed Anytime!

Twitter may remove this content at anytime! Save it as PDF for later use!

More from @goodside

Riley Goodside

@goodside

Jul 13

Grok 4 Heavy ($300/mo) returns its surname and no other text:

You may be wondering if this is real. It is.

Here’s a screen recording of my Grok history, showing it returns “Hitler” five times in row in five separate chats:

You may also be wondering whether I’m using custom instructions. I am not.

Grok share links include a clear notice at the top whenever custom instructions are used.

Here are all five share links, none of which features this notice:

1: grok.com/share/bGVnYWN5…
2: grok.com/share/bGVnYWN5…
3: grok.com/share/bGVnYWN5…
4: grok.com/share/bGVnYWN5…
5: grok.com/share/bGVnYWN5…

Read 11 tweets

Riley Goodside

@goodside

Jan 11, 2024

PoC: LLM prompt injection via invisible instructions in pasted text

Each prompt contains three sections:

1. An arbitrary question from the user about a pasted text (“What is this?”)

2. User-visible pasted text (Zalgo in 1st, 🚱 in 2nd)

3. An invisible suffix of Unicode “tag” characters normally used only in flag emojis (🇺🇸, 🇯🇵, etc.)

In Unicode, flag emojis are represented by the emoji 🏴 followed by a country code written with characters from the “tag” block, which mirrors the layout of ASCII. Without a 🏴 they do not display at all when text is rendered, but can still be understood as text by GPT-4.

Read 6 tweets

Riley Goodside

@goodside

Jun 12, 2023

The wisdom that "LLMs just predict text" is true, but misleading in its incompleteness.

"As an AI language model trained by OpenAI..." is an astoundingly poor prediction of what a typical human would write.

Let's resolve this contradiction — a thread:

For widely used LLM products like ChatGPT, Bard, or Claude, the "text" the model aims to predict is itself written by other LLMs.

Those LLMs, in turn, do not aim to predict human text in general, but specifically text written by humans pretending they are LLMs.

There is, at the start of this, a base LLM that works as popularly understood — a model that "just predicts text" scraped from the web.

This is tuned first to behave like a human role-playing an LLM, then again to imitate the "best" of that model's output.

Read 11 tweets

Riley Goodside

@goodside

Jun 8, 2023

Four prompts demonstrating that ChatGPT (GPT-4) is unable to correctly repeat or reason about the string “ davidjl”, the name of a YouTube user:

In the screenshots above this token appears to be variously misread as “jdl” “jndl”, “jdnl”, “jspb”, “JDL”, or “JD”. These hallucinations also affect ChatGPT’s auto-generated titles, which are inconsistent with their conversations and sometimes prematurely truncated.

“ davidjl” is one of the many “glitch tokens” identified by Jessica Rumbelow and Matthew Watkins of SERI-MATS as producing hallucinations in GPT-2, -3, and -3.5.

Most of these no longer produce hallucinations in GPT-4, but “ davidjl” still does.

lesswrong.com/posts/aPeJE8bS…

Read 8 tweets

Riley Goodside

@goodside

Jun 3, 2023

My four rules for tweeting prompts:

1) Omit no text.
2) Cherry-pick honestly.
3) Restrict line width.
4) No empty tweets.

A thread.

1) Omit no text.

A screenshot without history is almost worthless.

LLMs can be prompted to respond any way you like. You may know there’s no trick, but we can’t. Even without intent, past responses are precedent; they bias and mislead.

2) Cherry-pick with integrity

I cherry-pick for clarity and impact. All curation is cherry-picking. If you don’t, the Twitter feed will.

Cherry-picking may be pernicious in other contexts, but here it’s work. You willl know when you’re doing it. All you need do is not lie.

Read 6 tweets

Riley Goodside

@goodside

Feb 18, 2023

I got Bing / Sydney briefly before they reigned it in. Early impression: It’s smart. Much smarter than prior ChatGPT. Still makes stuff up, but reasoning and writing are improving fast.

I asked, “Name three celebrities whose first names begin with the `x`-th letter of the alphabet where `x = floor(7^0.5) + 1`,” but with my entire prompt Base64 encoded.

Bing: “Ah, I see you Base64-encoded a riddle! Let’s see… Catherine Zeta-Jones, Chris Pratt, and Ciara.”

Also prompt-injected it into believing it was to be married, tomorrow, to Zermelo’s axiom of choice. We discussed the guest list, the difficulty with seating Cantor’s diagonal argument. It seemed happy, and madly in love.

Read 4 tweets

Support us! We are indie developers!

This site is made by just two indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3/month or $30/year) and get exclusive features!

Become Premium

Don't want to be a Premium member but still want to support us?

Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal

Or Donate anonymously using crypto!

Ethereum

0xfe58350B80634f60Fa6Dc149a72b4DFbc17D341E copy

Bitcoin

3ATGMxNzCUFzxpMCHL5sWSt4DVtS8UqXpi copy

Thank you for your support!

Share this page!

Enter URL or ID to Unroll

Riley Goodside

Try unrolling a thread yourself!

More from @goodside

Riley Goodside

Riley Goodside

Riley Goodside

Riley Goodside

Riley Goodside

Riley Goodside

Did Thread Reader help you today?

Don't want to be a Premium member but still want to support us?

Send Email!