Research institute focused on developing forecasting methods to improve decision-making on high-stakes issues, led by chief scientist Philip Tetlock.
Oct 1, 2024 • 18 tweets • 4 min read
Today, we're excited to announce ForecastBench: a new benchmark for evaluating AI and human forecasting capabilities. Our research indicates that AI remains worse at forecasting than expert forecasters. 🧵
Arxiv: arxiv.org/abs/2409.19839
Website: forecastbench.org
Evaluating LLM forecasting ability is tricky! Prior work asks models about events that have already occurred (or failed to occur), so the outcomes may have leaked into the models' training data.
Our solution is to use questions about future events, the outcomes of which are unknowable when forecasts are made.
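To illustrate the idea, here is a minimal sketch of selecting only questions whose outcomes are still unknown on the day forecasts are collected. The `Question` class, the `unresolved_at` helper, and the sample questions are hypothetical and for illustration only; they are not taken from the ForecastBench codebase.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Question:
    """Hypothetical question record (not the ForecastBench schema)."""
    text: str
    resolution_date: date  # the day the outcome becomes known


def unresolved_at(questions, forecast_date):
    """Keep only questions whose outcome is still unknown on forecast_date."""
    return [q for q in questions if q.resolution_date > forecast_date]


# Made-up sample questions for illustration.
questions = [
    Question("Did event A happen by mid-2023?", date(2023, 6, 30)),
    Question("Will event B happen by the end of 2025?", date(2025, 12, 31)),
]

# Only the question that resolves after the forecast date is eligible,
# so its answer cannot already be in any model's training data.
eligible = unresolved_at(questions, forecast_date=date(2024, 10, 1))
```

Under this assumption, any question that resolved before the forecast date is dropped, since a model could have seen its answer during training.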