lmsys.org

@lmsysorg

Mar 30 • 4 tweets • 4 min read

Introducing Vicuna, an open-source chatbot impressing GPT-4!

🚀 Vicuna reaches 90%* quality of ChatGPT/Bard while significantly outperforming other baselines, according to GPT-4's assessment.

Blog: vicuna.lmsys.org
Demo: chat.lmsys.org

Here is a summary of the relative performance of five notable models, including Alpaca and ChatGPT. We use GPT-4 to generate a set of challenging questions and then ask it to assess the chatbots’ responses.

*DISCLAIMER: This is a fun and non-scientific experiment with GPT-4.

Through careful prompt engineering, GPT-4 is able to accurately evaluate the response quality in most cases, as shown in the example below.

More examples: vicuna.lmsys.org/eval/
Code: github.com/lm-sys/FastChat
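
The GPT-4-as-judge setup described above can be sketched roughly as follows. This is a minimal illustration, not the code from the FastChat repository: the prompt wording, the 1 to 10 scale, the reply format, and the function names (build_judge_prompt, parse_scores, relative_quality) are all assumptions made for this example. The call to GPT-4 itself is left out.

```python
import re

# Hypothetical judge prompt; the thread does not give the actual wording.
JUDGE_TEMPLATE = (
    "Question: {question}\n\n"
    "Assistant 1: {answer_1}\n\n"
    "Assistant 2: {answer_2}\n\n"
    "Rate each assistant from 1 to 10. "
    "Reply with the two scores on the first line, separated by a space."
)


def build_judge_prompt(question, answer_1, answer_2):
    """Fill the (assumed) template with one question and two chatbot answers."""
    return JUDGE_TEMPLATE.format(
        question=question, answer_1=answer_1, answer_2=answer_2
    )


def parse_scores(reply):
    """Read two numeric scores from the first line of the judge's reply."""
    lines = reply.strip().splitlines()
    if not lines:
        raise ValueError("empty judge reply")
    numbers = re.findall(r"\d+(?:\.\d+)?", lines[0])
    if len(numbers) < 2:
        raise ValueError("expected two scores on the first line")
    return float(numbers[0]), float(numbers[1])


def relative_quality(score_pairs):
    """Ratio of total model score to total baseline score.

    score_pairs is a list of (model_score, baseline_score) tuples,
    one per question. Summing before dividing is an assumption here.
    """
    total_model = sum(model for model, _ in score_pairs)
    total_baseline = sum(baseline for _, baseline in score_pairs)
    if total_baseline == 0:
        raise ValueError("baseline total is zero")
    return total_model / total_baseline
```

With made-up scores such as [(8.0, 10.0), (9.0, 10.0)], relative_quality returns the model's total as a fraction of the baseline's total, which is one plausible way to read a headline figure like "90% of ChatGPT".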

This is a joint effort by students from UC Berkeley, CMU, Stanford, and UC San Diego.
@infwinston @zhuohan123 @suzzzylin @ying11231 @Michaelvll1 @haozhangml @lm_zheng @zsy9509 @Howell53181770
