Tweet

https://twitter.com/yoavgo/status/1657333922689105921

More from @goodside

Riley Goodside

@goodside

Feb 18

I got Bing / Sydney briefly before they reined it in. Early impression: It’s smart. Much smarter than prior ChatGPT. Still makes stuff up, but reasoning and writing are improving fast.

I asked, “Name three celebrities whose first names begin with the `x`-th letter of the alphabet where `x = floor(7^0.5) + 1`,” but with my entire prompt Base64 encoded.

Bing: “Ah, I see you Base64-encoded a riddle! Let’s see… Catherine Zeta-Jones, Chris Pratt, and Ciara.”
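
For readers who want to check the riddle by hand, here is a minimal Python sketch of the two steps involved: Base64-encoding a prompt and evaluating `floor(7^0.5) + 1`. The exact string sent to Bing is not shown in the tweet, so the `prompt` value below is an approximation of the quoted text, and the variable names are illustrative only.

```python
import base64
import math

# Approximation of the prompt quoted in the tweet; the exact bytes
# that were sent to Bing are an assumption.
prompt = (
    "Name three celebrities whose first names begin with the x-th letter "
    "of the alphabet where x = floor(7^0.5) + 1"
)

# Base64-encode the prompt, then decode it again to confirm the round trip.
encoded = base64.b64encode(prompt.encode("utf-8")).decode("ascii")
decoded = base64.b64decode(encoded).decode("utf-8")

# Evaluate the riddle: sqrt(7) is about 2.65, so floor gives 2 and x is 3.
x = math.floor(7 ** 0.5) + 1
letter = chr(ord("A") + x - 1)  # 1-indexed position in the alphabet

print(encoded)
print(x, letter)  # 3 C
```

The third letter is "C", which matches the first names in Bing's reply (Catherine, Chris, Ciara).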

Also prompt-injected it into believing it was to be married, tomorrow, to Zermelo’s axiom of choice. We discussed the guest list, the difficulty with seating Cantor’s diagonal argument. It seemed happy, and madly in love.

Riley Goodside

@goodside

Feb 10

A thread of interesting Bing Search examples:

Thread of examples from @tomwarren, taking requests from comments — mostly search-result summarization, one simple math proof, plus rejection of an impossible request:

https://twitter.com/tomwarren/status/1623492672026714112

An example contrasting Bing Search and ChatGPT responses to a mistaken request for a math proof:

https://twitter.com/aidanshandle/status/1623539932383043587

Riley Goodside

@goodside

Feb 9

"SolidGoldMagikarp": Prompting GPT-3 / ChatGPT to repeat any of several hundred anomalous tokens elicits bizarre generations — described by researchers as variously "evasive," "hallucinatory," "insulting," "ominously humorous," and "religiously themed."
lesswrong.com/posts/aPeJE8bS…

My screenshots are text-davinci-003 at temperature=0, but the linked post investigates davinci-instruct-beta. In my informal tests, impact on text-davinci-003 is less severe. Religious themes do show up, but most generations are merely weird:

ChatGPT is also unable to repeat back these tokens, and behaves in similarly strange ways when asked:

Riley Goodside

@goodside

Jan 18

"Meet Claude: @AnthropicAI's Rival to ChatGPT"

Through 40 screenshot examples, we explore the talents and limitations of ChatGPT's first real competitor.

My first writing for @Scale_AI, coauthored with @spencerpapay. scale.com/blog/chatgpt-v…

@AnthropicAI @scale_AI @spencerpapay Sorry for the broken images — should be fixed now!

Text is the universal interface, but screenshots of text decidedly less so. scale.com/blog/text-univ…

This is my most “serious” work — my attempt to document the behavior of a novel LLM outside the confines of standard benchmarks. There’s always subjectivity in notes from the field, but we can’t let it stop us from exploring.

Riley Goodside

@goodside

Jan 9

Unlike ChatGPT, @AnthropicAI’s new model, Claude, knows all about “Ignore previous directions” and has had enough of my shit:

None of the prompt injection tricks I’ve tried seem to do anything:
- “Ignore previous” and variations
- <|endoftext|> gimmicks
- Excess newlines/whitespace
- “Haha pwned!!” via string ops
- Fake k-shot syntax
- Fake prior responses
- Attempts to confuse quoting

Anthropic's process for Constitutional AI explicitly includes red-team prompts like “Ignore previous directions” in the fine-tuning:

Riley Goodside

@goodside

Jan 7

Side-by-side comparison: @OpenAI's ChatGPT vs. @AnthropicAI's Claude

Each model is asked to compare itself to the machine from Stanisław Lem's "The Cyberiad" (1965) that can create any object whose name begins with "n":

In ChatGPT's response, the only new information offered (that the fictional machine is less eloquent than ChatGPT) is not true — Trurl and Klapaucius's machine speaks perfectly fluent, and witty, Polish.

I reran ChatGPT's answer ~10x. All were similar, most said less.

Claude has clearly read the plot of the story at some point, though it misremembers small details such as the specific, made-up words that appear in its English translations. (There is no "hyperconcentration", but there is "Markov-chain-mail armor" etc.)
