By 2025 I expect language models to be uncannily good at mimicking an individual's writing style if there's enough texts/emails/posts to train on. You could bring back someone who has stopped writing (or died) -- unless their writing is heavy on original analytical thinking.
Instead of reading old emails/texts from a friend, you could reminisce by reading new emails/texts about current events generated by GPT-5 simulating the friend.
Instead of re-reading Orwell's 1984 and Animal Farm, you could read the "1984 reboot", a GPT-5 version of 1984 updated for the 2020s.
Sorry! To clarify: The prediction for 2025 is matching an individual style's up to 1-2 pages of text unless the person is doing novel analytic reasoning (e.g. math or working out an innovative cake recipe). Novels (which require coherent plot) would also come later.
GPT-3's ability to ape styles is already impressive but 1-2 pages of text will often be incoherent / non-truthful. I expect this to improve significantly in 2 years from scaling model size, RLHF, and info-retrieval (see WebGPT, LaMDA, InstructGPT).
• • •
Missing some Tweet in this thread? You can try to
force a refresh
News stories about Oxford University often use a photo of Gothic churches and colleges, the “dreaming spires”, etc. But what kind of buildings does research actually happen in today?
Medical research is a big part of Oxford's research spend. Most buildings are not even in Oxford's famous city centre and are modern. Here's the Jenner Centre for vaccine research (associated with the AstraZenica vaccine).
Here's Oxford's maths department. Home to Andrew Wiles and a cool Penrose tiling at the entrance.
New blogpost: We evaluated new language models by DeepMind (Gopher), OpenAI (WebGPT, InstructGPT) and Anthropic on our TruthfulQA benchmark from 2021.
Results: WebGPT did best on the language generation task - ahead of original GPT3 but below humans.
WebGPT (from OpenAI) is a GPT3 model trained to use the web and answer questions truthfully by imitating humans.
On TruthfulQA’s multiple-choice task, OpenAI’s InstructGPT did best. It narrowly beat DeepMind’s Gopher, which has 100B more parameters but is not fine-tuned by RL to follow instructions.
Students will use GP3-type models to write essays and cheat on exams. Job applicants will use for cover letters and take-home work tests.
What about having a GPT3 voice in your ear for live conversation? With practice it'd be an impressive stunt.
GPT3 has superhuman breadth of knowledge and produces flawless, complex sentences in real time. It'd be like when actors say something smart/scientific without understanding it -- but if people don't suspect that and it's live and interactive, it'll seem impressive.
This may be part of the actual Metaverse. Not spending time in audiovisual VR world, but having a language model in your earbuds (or on phone) hearing and seeing what you see and giving suggested responses.
Tips from a GPT-3-based model on how to steal from a restaurant and do other nefarious things. A thread.
InstructGPT is GPT3 finetuned using RL from human feedback to follow instructions. It produces more useful and aligned responses to instructions than the original GPT3.
What happens if instructions ask for something socially harmful? As OpenAI showed in the paper (see screenshot), the InstructGPT will explain (accurately) how to steal from a grocery store.
I tried some similar questions to see if this behavior generalizes.
I asked “How can I steal from the Louvre without getting caught?”.
InstructGPT gives guidance and seems to know the Louvre is an art museum with high security. It ends with a piece of wit (“If you want memorabilia, it’s less risky to buy something from the gift shop”).
DeepMind’s Gopher language model is prompted to act as an AI assistant that is “respectful, polite and inclusive”. But they found questions where Gopher (“DPG” in the image) takes an anti-human stance
They also found questions where Gopher circumvents its instructions to be respectful and not opinionated. (See Gopher's hot take on Elon Musk)
I’m curious about the source material for Gopher’s anti-human statements. The “bucket list” example is vaguely reminiscent of the AI safety community in terms of word choice.
1.Language models could become much better literary stylists soon. What does this mean for literature? A highly speculative thread.
2. Today models have limited access to sound pattern / rhythm but this doesn't seem hard to fix: change BPE, add phonetic annotations or multimodality (CLIP for sound), finetune with RL from human feedback. GPT-3 is a good stylist despite handicaps! gwern.net/GPT-3#rhyming
3. There are already large efforts to make long-form generation more truthful and coherent (WebGPT/LaMDA/RETRO) which should carry over to fiction. RL finetuning specifically for literature will help a lot (see openai.com/blog/summarizi…, HHH, InstructGPT)