The new model readily beats Stockfish Level 4 (~1700 Elo) and still loses respectably to Level 5 (~2000). It never attempted an illegal move, used a clever opening sacrifice, and pulled off an incredibly cheeky pawn-and-king checkmate while letting the opponent uselessly promote.
@BorisMPower Interestingly, in the games it lost against higher-rated Stockfish bots, even after GPT made a bad move, it was still able to *predict* Stockfish's move that takes advantage of GPT's blunder. So you could probably get a >2000 Elo GPT by strapping on a tiny bit of search.
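The "tiny bit of search" could look something like the sketch below: ask the model for candidate moves, use its own prediction of the opponent's reply to score each one, and pick the best. The function names and the toy stubs are mine, purely for illustration; a real version would call a GPT completion API and a proper evaluator.

```python
# Hypothetical sketch: wrap a move-predicting language model in one ply of search.
# `predict_moves`, `predict_reply`, and `evaluate` stand in for model/engine calls;
# here they are toy stubs so the loop is runnable.

def one_ply_search(position, predict_moves, predict_reply, evaluate):
    """Pick the candidate move whose predicted reply leaves the best position."""
    best, best_score = None, float("-inf")
    for move in predict_moves(position):          # model proposes candidate moves
        after_us = position + [move]
        reply = predict_reply(after_us)           # model predicts the refutation
        score = evaluate(after_us + [reply])      # score the resulting position
        if score > best_score:
            best, best_score = move, score
    return best

# Toy stubs standing in for GPT calls (illustrative only).
def predict_moves(pos): return ["e4", "d4", "Nf3"]
def predict_reply(pos): return {"e4": "e5", "d4": "d5", "Nf3": "Nf6"}[pos[-1]]
def evaluate(pos): return {"e4": 0.3, "d4": 0.2, "Nf3": 0.1}[pos[-2]]

print(one_ply_search([], predict_moves, predict_reply, evaluate))  # → e4
```

The point is just that the model's demonstrated ability to predict the refutation of its own blunders is exactly the signal a one-ply lookahead needs.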
@wowAwesomeness @gigafestyu I’m sure the training data contains millions of games in this format, but never this exact game (it diverges from all known games at move 6). So this shows that GPT has learned how to play chess decently purely from reading millions of example games.
GPT can iteratively write, debug, and test programs to accomplish arbitrary goals.
Pictured: GPT reading snippets of HTML from HN and building a headline scraper in Python, overcoming bugs by reading the errors, judging its own output, and hypothesizing fixes to itself.
As shown in my previous post, GPT can be embedded in a REPL to accomplish goals in an agent-based fashion by using command line tools. Today's post shows it is capable of creating novel command line tools, such as a web scraper.
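The core of the REPL embedding is a short loop: the model emits a shell command, the command's output is appended to the transcript, and the transcript becomes the next prompt. A minimal sketch, assuming a hypothetical `ask_model` completion call (the stub below just stands in for GPT):

```python
# Minimal sketch of the REPL-embedded agent loop described above.
# `ask_model` is a hypothetical stand-in for a GPT completion call;
# the loop feeds shell output back into the prompt until the model says DONE.

import subprocess

def agent_loop(goal, ask_model, max_steps=5):
    transcript = f"Goal: {goal}\n"
    for _ in range(max_steps):
        command = ask_model(transcript)           # model emits the next shell command
        if command == "DONE":
            break
        result = subprocess.run(command, shell=True,
                                capture_output=True, text=True)
        transcript += f"$ {command}\n{result.stdout}{result.stderr}"
    return transcript

# Stub model for illustration: echoes once, then stops.
def stub_model(transcript):
    return "DONE" if "$ echo hello" in transcript else "echo hello"

print(agent_loop("say hello", stub_model))
```

Because the tools are ordinary command line programs, anything the agent writes (like the scraper above) immediately becomes a tool it can invoke in later steps.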
1/
As mentioned in the comments of that thread, the next major thing to add is long-term memory. What tools exist? What are their inputs and outputs? Which tools are relevant to the problem at hand?
All these sorts of questions can be answered by memory embeddings.
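One plausible shape for that memory (my assumption, not a description of an existing system): embed each tool's description once, embed the current query, and return the nearest neighbor. A real version would call an embedding API; the toy bag-of-words `embed` below just makes the retrieval loop runnable.

```python
# Sketch of memory embeddings for tool lookup (assumed design, illustrative only).
# A real system would call an embedding model; `embed` here is a toy
# bag-of-words vectorizer so the similarity search is runnable.

import math
from collections import Counter

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical tool memory: name -> natural-language description.
TOOLS = {
    "scrape_headlines": "fetch a web page and extract headlines with a scraper",
    "run_python": "execute a python script and return its output",
}

def relevant_tool(query):
    q = embed(query)
    return max(TOOLS, key=lambda name: cosine(q, embed(TOOLS[name])))

print(relevant_tool("extract headlines from a page"))  # → scrape_headlines
```

The same lookup answers all three questions: what exists (the keys), what the inputs and outputs are (the stored descriptions), and what is relevant now (the similarity ranking).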
2/
When embedded in a REPL environment and prompted to strategize and monologue, agent-like behavior emerges. The agent can solve multi-step problems that involve going to pages, following links, reading the next page, etc.
Thread ↓
The real HN pages showing those answers:
Explanation of the system described in the diagram:
The process here involves several different prompts pipelined together in a recursive fashion to result in agent-like behavior. 1/
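The recursive pipelining might be sketched like this: a "strategize" prompt plans, an "act" prompt executes, and a "judge" prompt decides whether to feed the result back in for another pass. Prompt wording, function names, and the stub model are all my invention, just to show the shape of the recursion:

```python
# Hedged sketch of pipelined prompts producing agent-like behavior.
# `call_model` is a hypothetical stand-in for a GPT completion call.

def run_pipeline(task, call_model, depth=0, max_depth=3):
    plan = call_model(f"Strategize: {task}")       # planning prompt
    action = call_model(f"Act on plan: {plan}")    # action prompt
    verdict = call_model(f"Judge result of: {action}")  # self-judgment prompt
    if verdict != "done" and depth < max_depth:
        # Recurse: the judgment becomes the next sub-task.
        return run_pipeline(verdict, call_model, depth + 1, max_depth)
    return action

# Stub model for illustration: declares success after the first pass.
def stub(prompt):
    if prompt.startswith("Judge"):
        return "done"
    return prompt.split(": ", 1)[1]

print(run_pipeline("scrape HN headlines", stub))
```

Each recursive call corresponds to one "read the page, follow a link, reconsider" step in the multi-step behavior shown above.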