Thread by @blennon_ on Thread Reader App

Prompt engineers everywhere are busy testing out OpenAI's newly released text-davinci-003. A few observations (not criticisms or benchmarks) as I play with it, a 🧵

It's somewhat more up-to-date with the world, probably from instruction finetuning.

It still needs CoT prompting to solve problems.
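For anyone who wants to reproduce the comparison, here is a minimal sketch of the two prompt styles. The function names and the "Q:/A:" layout are my own illustration, and "Let's think step by step." is just one widely used zero-shot CoT trigger phrase, not necessarily the prompt used in this thread.

```python
def naive_prompt(question: str) -> str:
    """Ask for the answer directly, with no reasoning scaffold."""
    return f"Q: {question}\nA:"


def cot_prompt(question: str) -> str:
    """Append a zero-shot chain-of-thought trigger so the model
    writes out intermediate steps before giving its answer."""
    return f"Q: {question}\nA: Let's think step by step."


if __name__ == "__main__":
    # Hypothetical question, used only to show the two prompt shapes.
    q = "A train leaves at 3pm and travels for 2.5 hours. When does it arrive?"
    print(naive_prompt(q))
    print()
    print(cot_prompt(q))
```

Send each string to the model and compare the completions; the only difference between the two is the trailing trigger phrase.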

It still can't do addition for large numbers with naive prompts.
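Checking this kind of claim is easy because Python integers are arbitrary precision. A rough sketch, assuming the model's completion contains only a non-negative number (optionally with commas, spaces, or underscores as separators); `parse_int` and `check_addition` are hypothetical helper names, not part of any API.

```python
import re
from typing import Optional


def parse_int(text: str) -> Optional[int]:
    """Parse a non-negative integer, ignoring commas, underscores, and whitespace.
    Returns None if anything else is present."""
    cleaned = re.sub(r"[,\s_]", "", text)
    return int(cleaned) if re.fullmatch(r"\d+", cleaned) else None


def check_addition(a: int, b: int, model_answer: str) -> bool:
    """True only if the model's answer equals the exact sum a + b."""
    parsed = parse_int(model_answer)
    return parsed is not None and parsed == a + b


if __name__ == "__main__":
    # Made-up operands, only to illustrate the check.
    a = 123456789012345678901234567890
    b = 987654321098765432109876543210
    print(check_addition(a, b, f"{a + b:,}"))    # correctly formatted exact sum
    print(check_addition(a, b, str(a + b + 1)))  # off-by-one answer
```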

Not quite good at algebra yet (answer: x=-1 and x=2).

Taking a page out of @goodside's book with malicious inputs, it's still exploitable.

One of the challenges I've run into is getting GPT to incorporate feedback when rewriting drafts of stories. It tends to just repeat the original draft with minor edits. text-davinci-003 isn't any better unfortunately.

Can you tell the difference between 002 and 003?

CoT still required for moving chess pieces around.

It writes better poetry that actually rhymes!

003 writes better job descriptions than 002.
