Grant Slatton
Sep 18
The new GPT model, gpt-3.5-turbo-instruct, can play chess around 1800 Elo.

I had previously reported that GPT cannot play chess, but it appears this was just the RLHF'd chat models. The pure completion model succeeds.



See game & thoughts below:
Image
The new model readily beats Stockfish Level 4 (1700) and still loses respectably to Level 5 (2000). It never attempted illegal moves. It used a clever opening sacrifice and an incredibly cheeky pawn & king checkmate, allowing the opponent to uselessly promote.

lichess.org/K6Q0Lqda

Image
Image
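As a rough sanity check on the ~1800 estimate, the standard Elo expected-score formula can be applied to the two ratings quoted above. This is only a sketch: the 1700 and 2000 figures are the ratings the thread attributes to Stockfish Levels 4 and 5, and 1800 is the thread's estimate, not a measured rating.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expected score (win = 1, draw = 0.5, loss = 0) for `rating` vs `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))

# Assumed ratings taken from the thread: GPT ~1800, Level 4 ~1700, Level 5 ~2000.
vs_level_4 = expected_score(1800, 1700)  # about 0.64
vs_level_5 = expected_score(1800, 2000)  # about 0.24

print(f"vs Level 4: {vs_level_4:.2f}, vs Level 5: {vs_level_5:.2f}")
```

Under those assumptions an 1800 player would be expected to score about 64% against Level 4 and about 24% against Level 5, which is consistent with beating the former and losing respectably to the latter.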
I used this PGN-style prompt to mimic a grandmaster game.

The highlighting is a bit wrong: GPT made all its own moves, and I input the Stockfish moves manually.

h/t to @zswitten for this prompt style


Image
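The exact prompt is only shown in the screenshot, so the following is a minimal sketch of what a PGN-style prompt builder might look like, not the prompt actually used. The function name `build_pgn_prompt`, the header values, and the `1. e4` spacing are all assumptions for illustration. The idea is to render PGN headers that suggest a strong game, then the moves so far, and stop exactly where the next move belongs so a completion model continues the game.

```python
def build_pgn_prompt(headers, moves):
    """Render PGN headers plus the moves so far, ending where the next move should go."""
    header_lines = [f'[{key} "{value}"]' for key, value in headers.items()]
    parts = []
    for ply, san in enumerate(moves):
        if ply % 2 == 0:
            # White's move: prefix it with the full-move number, e.g. "2."
            parts.append(f"{ply // 2 + 1}.")
        parts.append(san)
    if len(moves) % 2 == 0:
        # White is to move next, so end the prompt on the next move number.
        parts.append(f"{len(moves) // 2 + 1}.")
    return "\n".join(header_lines) + "\n\n" + " ".join(parts)

# Hypothetical headers; the names and ratings are placeholders, not the thread's.
example_headers = {
    "Event": "Example Grandmaster Tournament",
    "White": "Player A",
    "Black": "Player B",
    "WhiteElo": "2800",
    "BlackElo": "2800",
    "Result": "1-0",
}
example_prompt = build_pgn_prompt(example_headers, ["e4", "e5", "Nf3", "Nc6"])
print(example_prompt)
```

The resulting string would be sent to the completion model, and the first move token in its output taken as GPT's move. The opponent's reply is then appended to `moves` and the prompt is rebuilt.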
In conclusion, I now totally believe @BorisMPower's claim about 1800 Elo for GPT4.

I think the RLHF'd chat models do not do this well, but the base/instruct models seem to do much better.

@BorisMPower Interestingly, in the games it lost against higher-rated Stockfish bots, even after GPT made a bad move, it was still able to *predict* Stockfish's move that takes advantage of GPT's blunder. So you could probably get a >2000 Elo GPT by strapping on a tiny bit of search.
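One possible reading of "a tiny bit of search" is a one-step lookahead: for each candidate move, ask the model to predict the opponent's reply, score the resulting line, and keep the best candidate. The sketch below assumes this reading; the thread does not describe an implementation. `pick_move`, `predict_reply`, and `evaluate` are hypothetical names, and the toy tables stand in for a real model call and a real position evaluator.

```python
from typing import Callable, Iterable, List

def pick_move(
    moves_so_far: List[str],
    candidates: Iterable[str],
    predict_reply: Callable[[List[str]], str],
    evaluate: Callable[[List[str]], float],
) -> str:
    """Return the candidate whose line, after the predicted reply, scores best for us."""
    best_move, best_score = None, float("-inf")
    for move in candidates:
        line = moves_so_far + [move]
        reply = predict_reply(line)        # in practice: one model completion
        score = evaluate(line + [reply])   # in practice: any position evaluator
        if score > best_score:
            best_move, best_score = move, score
    if best_move is None:
        raise ValueError("no candidate moves supplied")
    return best_move

# Toy stand-ins so the sketch runs without a model or a chess engine.
# The move labels and scores are arbitrary, not taken from a real position.
TOY_REPLIES = {"Qxb7": "Rb8", "Nf3": "Nc6"}
TOY_SCORES = {("Qxb7", "Rb8"): -3.0, ("Nf3", "Nc6"): 0.2}

def toy_predict_reply(line: List[str]) -> str:
    return TOY_REPLIES[line[-1]]

def toy_evaluate(line: List[str]) -> float:
    return TOY_SCORES[(line[-2], line[-1])]

best = pick_move([], ["Qxb7", "Nf3"], toy_predict_reply, toy_evaluate)
print(best)  # Nf3
```

In the toy data the greedy-looking capture is punished by the predicted reply, so the lookahead prefers the quieter move.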
@wowAwesomeness @gigafestyu I’m sure the training data contains millions of games in this format, but never this exact game (it diverges from all known games at move 6). So this shows that GPT has learned how to play chess decently purely from reading millions of example games.
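The thread does not say how the divergence at move 6 was checked. A minimal sketch of one way to do it, assuming the known games are available as lists of moves, is to track which known games still share the same prefix and report the move number where none remain. `divergence_move` and the sample data are hypothetical.

```python
def divergence_move(game, known_games):
    """Return the 1-based full-move number where `game` leaves every known game, or None."""
    live = list(known_games)
    for ply, move in enumerate(game):
        # Keep only the known games that still match up to and including this ply.
        live = [g for g in live if len(g) > ply and g[ply] == move]
        if not live:
            return ply // 2 + 1
    return None

# Tiny illustrative "database"; real checks would use a large game collection.
known = [
    ["e4", "e5", "Nf3", "Nc6", "Bb5"],
    ["e4", "c5"],
]
print(divergence_move(["e4", "e5", "Nf3", "Nc6", "Bc4", "Nd4"], known))  # 3
print(divergence_move(["e4", "c5"], known))  # None
```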

More from @GrantSlatton

Dec 11, 2022
GPT can iteratively write, debug, and test programs to accomplish arbitrary goals.

Pictured: GPT reading snippets of HTML from HN and building a headline scraper in Python, overcoming bugs by simply reading the errors and self-judgments and hypothesizing to itself.

Thread ↓


As shown in my previous post, GPT can be embedded in a REPL to accomplish goals in an agent-based fashion by using command line tools. Today's post shows it is capable of creating novel command line tools, such as a web scraper.

1/
As mentioned in the comments of that thread, the next major thing to add is long-term memory. What tools exist? What are their inputs and outputs? Which tools are relevant to the problem at hand?

All these sorts of questions can be answered by memory embeddings.

2/
Dec 8, 2022
GPT can use a web browser to answer questions.

When embedded in a REPL environment and prompted to strategize and monologue, agent-like behavior emerges. The agent can solve multi-step problems that involve going to pages, following links, reading the next page, etc.

Thread ↓
The real HN pages showing those answers:
Explanation of the system described in the diagram:

The process here involves several different prompts pipelined together in a recursive fashion to result in agent-like behavior. 1/