Jascha (@jaschasd) · 26 Jan · 5 tweets · 2 min read
CALL FOR TASKS CAPTURING LIMITATIONS OF LARGE LANGUAGE MODELS

We are soliciting contributions of tasks to a *collaborative* benchmark designed to measure and extrapolate the capabilities and limitations of large language models. Submit tasks at github.com/google/BIG-Ben…
#BIGbench
All accepted task submitters will be co-authors on the paper releasing the benchmark. Teams at Google and OpenAI will further evaluate BIG-Bench on their best-performing model architectures, across models ranging from tens of thousands to hundreds of billions of parameters.
We encourage submission of tasks that probe the nature of language or intelligence from researchers in fields other than computer science, including linguistics, cognitive science, philosophy, neuroscience, psychology, animal intelligence, and logic.
We also encourage submission of tasks that quantify social bias in language models. Including measures of bias in a standard language model benchmark will help motivate future research on countering it.
The benchmark and results of the model evaluation will be released at the ICLR 2021 Workshop on Enormous Language Models.

See github.com/google/BIG-Ben… for more details.
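
For a concrete sense of what a submission involves: tasks are contributed to the repository as files of example input/target pairs (the repo also describes programmatic Python tasks). The sketch below writes a minimal JSON-style task definition; the field names, metric name, and the toy arithmetic task are illustrative assumptions rather than the official schema, so consult the repository docs before submitting.

# Illustrative sketch only: field names and metric are assumptions;
# see the BIG-bench repository for the authoritative task schema.
import json

task = {
    "name": "three_digit_addition",            # hypothetical task name
    "description": "Add two three-digit numbers.",
    "keywords": ["arithmetic", "zero-shot"],
    "metrics": ["exact_str_match"],            # assumed metric name
    "examples": [
        {"input": "What is 123 + 456?", "target": "579"},
        {"input": "What is 702 + 199?", "target": "901"},
    ],
}

with open("task.json", "w") as f:
    json.dump(task, f, indent=2)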

More from @jaschasd

8 Aug 20
"Finite Versus Infinite Neural Networks: an Empirical Study." arxiv.org/abs/2007.15801 This paper contains everything you ever wanted to know about infinite width networks, but didn't have the computational capacity to ask! Like really a lot of content. Let's dive in.
Infinite width Neural Network Gaussian Process (NNGP) and Neural Tangent Kernel (NTK) predictions can outperform finite networks, depending on architecture and training practices. For fully connected networks the infinite width limit reliably outperforms the finite network.
The NNGP (corresponding to infinite width Bayesian networks) typically outperforms the NTK (corresponding to infinite width networks trained by gradient descent).
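For readers who want to reproduce this kind of comparison, infinite-width NNGP and NTK predictions can be computed in closed form with the open-source neural-tangents JAX library. The snippet below is a minimal sketch on toy data with a small fully connected network; the architecture, data, and regularization are illustrative choices, not the paper's actual experimental settings.

# Minimal sketch: closed-form NNGP and NTK predictions for an
# infinite-width fully connected network (neural-tangents / JAX).
# Architecture and data below are toy examples, not the paper's setup.
import jax.numpy as jnp
from jax import random
import neural_tangents as nt
from neural_tangents import stax

# Two-hidden-layer fully connected network; kernel_fn gives the
# corresponding infinite-width NNGP and NTK kernels.
_, _, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

# Toy regression data.
x_train = random.normal(random.PRNGKey(0), (64, 8))
y_train = jnp.sin(x_train.sum(axis=1, keepdims=True))
x_test = random.normal(random.PRNGKey(1), (16, 8))

# NNGP = posterior mean of the infinite-width Bayesian network;
# NTK = infinite-width network trained to convergence by gradient
# descent on MSE. Both are exact; no finite network is trained.
predict_fn = nt.predict.gradient_descent_mse_ensemble(
    kernel_fn, x_train, y_train, diag_reg=1e-4)
nngp_mean, ntk_mean = predict_fn(x_test=x_test, get=("nngp", "ntk"))
print(nngp_mean.shape, ntk_mean.shape)  # (16, 1) (16, 1)

This NNGP-versus-NTK split is the same Bayesian-versus-gradient-descent distinction the tweet above refers to.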
