Latest Twitter Threads by @TristanThrush on Thread Reader App

Jan 11, 2024 • 8 tweets • 3 min read

📢 New paper!!

Do LLMs understand self-referential statements? Introducing “I am a Strange Dataset”. All tested models perform around chance at our metalinguistic self-reference task.

GPT-4 is the only model significantly above chance on all tests, but only slightly.🧵

Each example in the dataset consists of two self-referential statements that begin in the same way but have different endings. One is true and one is false. Crucially, the ending flips the truth value of the statement.
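
To make that structure concrete, here is a minimal sketch of how one such pair might be represented. The example sentence, the class name `SelfReferentialPair`, and the word-count checker are all hypothetical illustrations, not taken from the dataset or the paper's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelfReferentialPair:
    """Hypothetical container: one shared beginning, two endings."""
    prefix: str
    true_ending: str
    false_ending: str

    def statements(self):
        # Returns (true statement, false statement).
        return (self.prefix + self.true_ending,
                self.prefix + self.false_ending)


# Tiny lookup so the toy checker can read the claimed count.
NUMBER_WORDS = {"five": 5, "six": 6, "seven": 7}


def word_count_claim_holds(statement: str) -> bool:
    """Check a statement of the form '... exactly <number> words.'"""
    words = statement.rstrip(".").split()
    claimed = NUMBER_WORDS[words[-2]]  # the number word precedes 'words'
    return len(words) == claimed


# Invented example (an assumption, not a dataset item).
pair = SelfReferentialPair(
    prefix="This sentence contains exactly",
    true_ending=" six words.",
    false_ending=" seven words.",
)
true_statement, false_statement = pair.statements()
```

Only the ending differs between the two statements, yet it alone decides whether the sentence's claim about itself holds.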

Oct 11, 2022 • 15 tweets • 6 min read

For our Online Language Modelling (OLM) project, we’ve open-sourced end-to-end code to turn the latest Common Crawl and Wikipedia web snapshots into clean datasets for pretraining models like BERT and GPT-2: github.com/huggingface/ol…. What are the details? A 🧵:

Why is it important? BERT and GPT-2 both live in the past. They think that Obama is still president and have never heard of COVID! To fix this, we need a pretraining dataset that continuously updates.
