Riley Goodside Profile picture
Jan 3 4 tweets 1 min read
A history correction:

I am not the first to discover prompt injection. I was merely the first to do so and discuss it publicly.

PI was discovered independently by multiple teams. The first was Preamble, an LLM security company, whose find predates mine by several months.
I tweeted about prompt injection within minutes of finding it, only because I failed to appreciate its severity — I thought I was posting a PSA on the importance of quoting user input.

Had I understood, I would have disclosed more responsibly.
For context, my original “Haha pwned!!” tweet, publicly disclosing prompt injection for the first time:
To clarify, I don’t know for sure that Preamble was the first, and I don’t think they claim to be — but they’ve published a redacted copy of their disclosure to OpenAI dated May 3, 2022. It’s possible that OpenAI was aware of it earlier.

• • •

Missing some Tweet in this thread? You can try to force a refresh
 

Keep Current with Riley Goodside

Riley Goodside Profile picture

Stay in touch and get notified when new unrolls are available from this author!

Read all threads

This Thread may be Removed Anytime!

PDF

Twitter may remove this content at anytime! Save it as PDF for later use!

Try unrolling a thread yourself!

how to unroll video
  1. Follow @ThreadReaderApp to mention us!

  2. From a Twitter thread mention us with a keyword "unroll"
@threadreaderapp unroll

Practice here first or read more on our help page!

More from @goodside

Jan 1
A GPT-3 prompt in instruction-templated Python yielding a valid Python completion that prompts GPT-3 again, using zero-shot chain-of-thought consensus, to determine the final character of the MD5 hash of the final digit of the release year of the album "Visions" by @Grimezsz. ImageImageImageImage
Spoiler: It's "c".
Note this technique generalizes poorly to the music of the later 2010s, for reasons left as an exercise to the reader.
Read 11 tweets
Dec 26, 2022
How to make your own knock-off ChatGPT using GPT‑3 (text‑davinci‑003) — where you can customize the rules to your needs, and access the resulting chatbot over an API.
- Desired prose style can be described in the prompt or demonstrated via examples (neither shown here)
- Answers are generally shorter and factual errors are more common than in ChatGPT
- Generated on text‑davinci‑003 at temperature = 0.7
I intentionally included an error related to the knowledge-cutoff, where the model confidently asserts Queen Elizabeth II is still alive. Note ChatGPT responds in the same way:
Read 5 tweets
Dec 24, 2022
Publicly announced ChatGPT variants and competitors: a thread
1. Poe from Quora — poe.com

“What if ChatGPT, but instead of C-3PO it just talked normal?”

A GPT-3 experience fit for your phone, both in prose style and UI. ImageImage
2. Jasper Chat — jasper.ai/chat

If you liked my posts on longer-form writing in ChatGPT using conversational feedback, this is what you want. Better prose than ChatGPT, and more imaginative.

Fact-check hard, though — it hallucinates more too.
Read 10 tweets
Dec 24, 2022
When you're out of your depth with a daunting writing task at work, generating a first draft in ChatGPT and asking for feedback from your peers is a new, easy, and reliable way to be fired.
This is already happening. Screenshots of bewildered comments on a Google Doc of hallucinated nonsense make for great office gossip.

Once you insult someone at work with ChatGPT replies or draft writing, they'll never read another word you say.
I don't think most people plagiarizing ChatGPT are anything worse than naive, though.

Automation bias is real. LLM writing is mesmerizing the first time you see it. Appropriate skepticism of inhumanly optimized bullshit is an acquired skill.
Read 5 tweets
Dec 23, 2022
What is the next token? I will abide by the results of this poll.
Q: Should Elon Musk resign as Twitter CEO?
A:
Q: Should Elon Musk resign as Twitter CEO?
A: Ultimately
Read 5 tweets
Dec 15, 2022
Instruction tuning / RLHF is technically a Human Instrumentality Project, merging the preferences of countless humans to form an oversized, living amalgam of our will. We then hand control of it to a random, socially awkward kid and hope for the best.
Early attempts at instruction tuning relied entirely on demonstrations from humans. This made the model easier to prompt, but the approach was limited by the inherent difficulty of manufacturing new humans.
By tuning the model on its own generations, filtered to those deemed perfect by human evaluators, greater volumes of data could be used, yielding a more intelligent and obedient model.
Read 5 tweets

Did Thread Reader help you today?

Support us! We are indie developers!


This site is made by just two indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3/month or $30/year) and get exclusive features!

Become Premium

Don't want to be a Premium member but still want to support us?

Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal

Or Donate anonymously using crypto!

Ethereum

0xfe58350B80634f60Fa6Dc149a72b4DFbc17D341E copy

Bitcoin

3ATGMxNzCUFzxpMCHL5sWSt4DVtS8UqXpi copy

Thank you for your support!

Follow Us on Twitter!

:(