Today we're excited to announce that we're partnering with @CarperAI of Stability AI to bring the first RLHF-trained, GPT-3-like model to the open-source community.
This will be huge. Let us explain.
RLHF – Reinforcement Learning from Human Feedback.
Models are fine-tuned using RL from human feedback. They become more helpful, less harmful, and show a huge leap in performance: in OpenAI's InstructGPT evaluations, outputs from an RLHF-tuned model were preferred over those of a base GPT-3 model 100x its size.
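At the heart of that fine-tuning is a reward model trained on human preference comparisons. Here's a minimal sketch of that step, assuming pairwise preference data and toy embeddings (illustrative only, not CarperAI's actual code):

```python
# Minimal sketch of the reward-modelling step in RLHF.
# Shapes and names are illustrative assumptions, not a real release.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Maps a response embedding to a scalar reward."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.score(x).squeeze(-1)

reward_model = RewardModel()
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

# Toy stand-ins for embeddings of human-preferred vs. rejected responses.
chosen = torch.randn(32, 128)
rejected = torch.randn(32, 128)

# Pairwise (Bradley-Terry) loss: the preferred response should score higher.
opt.zero_grad()
loss = -torch.nn.functional.logsigmoid(
    reward_model(chosen) - reward_model(rejected)
).mean()
loss.backward()
opt.step()
```

The language model is then fine-tuned with RL to maximise the reward this model assigns to its outputs.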
These models are far better at following instructions, which massively increases their usability.
We think RLHF-tuned models will ultimately be applied to every domain and task, and these systems will unlock incredible amounts of value in the real world.
Human feedback is critical to aligning these models with what you actually want them to do.
We've built the specialised tools needed for this crucial component and are delighted to be contributing to this open-source effort.
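Concretely, the human feedback these tools collect is typically a set of preference judgements between candidate responses. A hypothetical record might look like this (field names are illustrative, not the actual release format):

```python
# Hypothetical preference record; field names are illustrative only.
feedback_example = {
    "prompt": "Explain RLHF in one sentence.",
    "response_a": "RLHF fine-tunes a model with RL against a reward learned from human preferences.",
    "response_b": "RLHF is a kind of database index.",
    "human_choice": "a",  # which response the labeller preferred
}
```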
CarperAI will be building in public and will release the data, code, and weights over the coming months. Follow along in their Discord.
As with Stable Diffusion, we can't wait to see the innovation that happens once these models become available to all.