How to get URL link on X (Twitter) App
Why is this benchmark important?
We compared accuracy across 6 different varieties of tasks:
Our new tool, “Zeno Build” (github.com/zeno-ml/zeno-b…), aims to make it easier to build and evaluate systems using LMs, and includes:
https://twitter.com/gneubig/status/1569319366558105602Basically, there have been *huge* changes in NLP due to advances BERT and GPT-3. And the skills needed to be a good NLP researcher or engineer have changed too! I've re-designed our assignments to reflect this.
We demonstrate this on two example datasets: Wikipedia articles and Java code. We leveraging the article and project structure respectively to define different "locality" levels between two documents.

The idea is that language tech should serve every person in the world, not just English native speakers. Based on this, we come up with metrics for language-weighted and population-weighted performance that explicitly consider how many people or languages may benefit 2/7