Aran Komatsuzaki

9 Jun, 4 tweets, 2 min read

Ben and I have released GPT-J, a 6B-parameter JAX-based Transformer LM 🥳

- Performs on par with 6.7B GPT-3
- Performs better and decodes faster than GPT-Neo
- repo + colab + free web demo

article: bit.ly/2TH8yl0
repo: bit.ly/3eszQ6C

Colab: bit.ly/3w0fB6n
demo: bit.ly/3psRCdM

- Trained on 400B tokens with TPU v3-256 for five weeks
- GPT-J performs much closer to a similarly sized GPT-3 than GPT-Neo does

Also big thanks to EleutherAI, @jekbradbury, Janko Prester, Laurence Golding and @nabla_theta for their valuable assistance!

Please feel free to ask any questions about GPT-J at EleutherAI 😉

You can also join the discussion and may find some interesting use cases and results here!

eleuther.ai
