Ben and I have released GPT-J, a 6B-parameter JAX-based Transformer LM 🥳

- Performs on par with 6.7B GPT-3
- Performs better and decodes faster than GPT-Neo
- Repo + Colab + free web demo


- Trained on 400B tokens on a TPU v3-256 for five weeks
- GPT-J performs much closer to GPT-3 of similar size than GPT-Neo does
Also, big thanks to EleutherAI, @jekbradbury, Janko Prester, Laurence Golding, and @nabla_theta for their valuable assistance!
Please feel free to ask any questions about GPT-J at EleutherAI 😉

You can also discuss GPT-J and find some interesting use cases and results here!

Thread by Aran Komatsuzaki
