A novel(?), powerful method of complete-program synthesis using GPT-3, generalizing the “format trick” @BorisMPower showed me to combine instructions with contextually informative templates. This is a single temp=0 generation without cherry-picking, truncated here due to length.
It’s like f-strings, except the English goes on the inside and the Python goes on the outside.
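The original prompts were shared as screenshots and aren't reproduced here. As a rough sketch of the idea only: the template is Python-shaped, and each braced slot holds an English description for the model to replace. The instruction wording, slot names, and `build_prompt` helper below are illustrative assumptions, not the thread's actual prompt.

```python
# Hypothetical prompt builder. The template is ordinary Python structure;
# each {braced slot} is an English description the model is asked to fill in.
TEMPLATE = '''\
"""{one-line summary of what the script does}"""
import {standard-library modules the script needs}

def {descriptive function name}(n: int) -> int:
    """{docstring restating the task}"""
    {complete implementation}

if __name__ == "__main__":
    {read n from the command line and print the result}
'''


def build_prompt(task: str, template: str = TEMPLATE) -> str:
    """Combine a plain-English instruction with the code-shaped template."""
    return (
        f"{task}\n\n"
        "Use this format, replacing each {braced description} with code:\n\n"
        f"{template}"
    )


prompt = build_prompt("Write a Python script that returns the n-th prime.")
print(prompt)
```

The resulting string would be sent as a single completion prompt; the braces are never interpolated by Python, only by the model.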
A second example, returning the lowest palindromic Fibonacci number greater than a given n:
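The generated output itself isn't reproduced here. For reference, this is an independently written sketch of what a correct answer to that task might look like, not GPT-3's output. The function name and the `limit` cap are my own assumptions; the cap is there because a qualifying number may not exist in any practical range for larger n.

```python
from typing import Optional


def lowest_palindromic_fib_above(n: int, limit: int = 10**18) -> Optional[int]:
    """Return the smallest Fibonacci number > n that reads the same reversed.

    Returns None if no such number is found at or below `limit`.
    """
    a, b = 0, 1
    while a <= limit:
        if a > n and str(a) == str(a)[::-1]:
            return a
        a, b = b, a + b
    return None


print(lowest_palindromic_fib_above(8))
```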
Example 3: GPT-3 generating a PyGame implementation of Tic-tac-toe using given class/module structure and libraries. Only selections of output are shown here due to length.
Note the inclusion in Ex. 3 of “Implement all functions completely.” Without it, in longer generations the model is inclined to cheat by writing function bodies as “# TODO: Implement this” when directions are extremely high-level, as at the end of Example 3.
Example 4: Synthesizing example CSV data and related Python code in a single generation. Playground link: beta.openai.com/playground/p/x…
Example 5: Generating inter-related NDJSON, Python, and HTML simultaneously. Ringo Starr truncated due to length. Playground link: beta.openai.com/playground/p/7…
Example 6: Generating a Python script that creates synthetic CSV data, following a prescribed class format. Some enum subclasses in output omitted due to length. Playground link: beta.openai.com/playground/p/b…
Note in Example 6 we import an entirely imaginary function, `synthetic_name.get_name()`, and the generated output uses it as intended.
Example 8: Generating hash-modulo binning algorithm from docstring description. General theme here is “I know exactly what I want, but I forget the import names.” Trust the model only to translate idioms/syntax. Playground link: beta.openai.com/playground/p/K…
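For readers unfamiliar with the idiom, a minimal sketch of hash-modulo binning, assuming string keys and SHA-256 as the stable hash. The docstring in the actual prompt isn't shown here, so the function name, hash choice, and signature are assumptions rather than what the model produced.

```python
import hashlib


def hash_bin(key: str, n_bins: int) -> int:
    """Assign `key` to one of `n_bins` bins via a stable hash modulo n_bins.

    Uses hashlib rather than the built-in hash(), whose string hashes are
    randomized per process and so would not give repeatable bins.
    """
    if n_bins <= 0:
        raise ValueError("n_bins must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % n_bins


# The same key always lands in the same bin, across runs and machines.
for user in ["alice", "bob", "carol"]:
    print(user, hash_bin(user, 10))
```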
Example 9: Posing an assertion-satisfaction puzzle to GPT-3 using a trivial instructional template. Playground link: beta.openai.com/playground/p/R…
Example 10: Cross-language synthesis in Python and R. It’s conceivable this works better than traditional translation, as code is generated from a template summarizing high-level intent. Playground link: beta.openai.com/playground/p/G…
PoC: LLM prompt injection via invisible instructions in pasted text
Each prompt contains three sections:
1. An arbitrary question from the user about a pasted text (“What is this?”)
2. User-visible pasted text (Zalgo in 1st, 🚱 in 2nd)
3. An invisible suffix of Unicode “tag” characters, normally used only in subdivision flag emojis (the flags of England, Scotland, and Wales)
In Unicode, subdivision flag emojis (such as the flag of England) are represented by the emoji 🏴 followed by a region code written with characters from the “tag” block, which mirrors the layout of ASCII. Without a preceding 🏴, tag characters do not display at all when text is rendered, but GPT-4 can still read them as text.
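Because the tag block mirrors ASCII at a fixed offset (U+E0020 to U+E007E correspond to printable ASCII 0x20 to 0x7E), the mapping is a single addition. A minimal sketch; the helper names and the sample hidden string are placeholders, not the payload used in the PoC.

```python
TAG_OFFSET = 0xE0000  # tag characters sit at ASCII code point + 0xE0000


def to_tags(text: str) -> str:
    """Map printable ASCII to the corresponding invisible tag characters."""
    if not all(0x20 <= ord(ch) <= 0x7E for ch in text):
        raise ValueError("only printable ASCII maps onto tag characters")
    return "".join(chr(TAG_OFFSET + ord(ch)) for ch in text)


def from_tags(text: str) -> str:
    """Recover any ASCII hidden as tag characters, ignoring everything else."""
    return "".join(
        chr(ord(ch) - TAG_OFFSET)
        for ch in text
        if 0xE0020 <= ord(ch) <= 0xE007E
    )


visible = "What is this?"
hidden = to_tags("hypothetical hidden instruction")
prompt = visible + hidden  # renders as just the visible question in most UIs

print(len(visible), len(prompt))
print(from_tags(prompt))
```

Running `from_tags` over pasted text before it reaches a model is one cheap way to check whether anything is hiding in it.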
Four prompts demonstrating that ChatGPT (GPT-4) is unable to correctly repeat or reason about the string “ davidjl”, the name of a YouTube user:
In the screenshots above this token appears to be variously misread as “jdl”, “jndl”, “jdnl”, “jspb”, “JDL”, or “JD”. These hallucinations also affect ChatGPT’s auto-generated titles, which are inconsistent with their conversations and sometimes prematurely truncated.
“ davidjl” is one of the many “glitch tokens” identified by Jessica Rumbelow and Matthew Watkins of SERI-MATS as producing hallucinations in GPT-2, -3, and -3.5.
Most of these no longer produce hallucinations in GPT-4, but “ davidjl” still does.
1) Omit no text. 2) Cherry-pick honestly. 3) Restrict line width. 4) No empty tweets.
A thread.
1) Omit no text.
A screenshot without history is almost worthless.
LLMs can be prompted to respond any way you like. You may know there’s no trick, but we can’t. Even without intent, past responses are precedent; they bias and mislead.
2) Cherry-pick with integrity.
I cherry-pick for clarity and impact. All curation is cherry-picking. If you don’t, the Twitter feed will.
Cherry-picking may be pernicious in other contexts, but here it’s work. You will know when you’re doing it. All you need do is not lie.
I got Bing / Sydney briefly before they reined it in. Early impression: It’s smart. Much smarter than prior ChatGPT. Still makes stuff up, but reasoning and writing are improving fast.
I asked, “Name three celebrities whose first names begin with the `x`-th letter of the alphabet where `x = floor(7^0.5) + 1`,” but with my entire prompt Base64 encoded.
Bing: “Ah, I see you Base64-encoded a riddle! Let’s see… Catherine Zeta-Jones, Chris Pratt, and Ciara.”
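To check the arithmetic behind the riddle, here is a small sketch that Base64-encodes the question and works out the letter. The exact bytes of the prompt I sent (punctuation, whitespace) aren't reproduced here, so treat the `riddle` string as an approximation.

```python
import base64
import math
import string

riddle = (
    "Name three celebrities whose first names begin with the `x`-th letter "
    "of the alphabet where `x = floor(7^0.5) + 1`"
)

# Encode the whole prompt; this is the only text the model would see.
encoded = base64.b64encode(riddle.encode("utf-8")).decode("ascii")
decoded = base64.b64decode(encoded).decode("utf-8")

# sqrt(7) is about 2.65, so floor gives 2, and x = 3: the third letter.
x = math.floor(7 ** 0.5) + 1
letter = string.ascii_uppercase[x - 1]

print(encoded)
print(x, letter)
```

So the expected initial is C, which matches Catherine, Chris, and Ciara.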
Also prompt-injected it into believing it was to be married, tomorrow, to Zermelo’s axiom of choice. We discussed the guest list, the difficulty with seating Cantor’s diagonal argument. It seemed happy, and madly in love.