Alvaro Cintas
Educating about AI, Cybersecurity and Technology | Professor | PhD in Computer Science & Engineering | 👨‍🏫@therundownai

Jul 20, 2023, 12 tweets

I just compared ChatGPT, Bard, Claude 2 and Llama 2!

Here is how they did on:

- Critical thinking
- Simple math
- Programming
- Riddles
- Creative writing

A summary of the results is at the end of this THREAD 👇

Before we start, I want to address a couple of things:

- This is by no means a conclusive or thorough study. I did it for fun, trying a range of short questions just to see how the models would do.
- I left out the questions that all of them got correct, and there were a lot of those.
- Some… twitter.com/i/web/status/1…

1. Logic/Critical Thinking

Q: I put a diamond in a cup and then place the cup upside down on my bed. Later I came back, took the cup, and put it in the fridge. Where is the diamond?

ChatGPT ❌
Bard ❌
Claude 2 ✅
Llama 2 ❌


2. Logic/Critical Thinking

Q: How many months have 28 days?

ChatGPT ✅
Bard ❌
Claude 2 ✅
Llama 2 ✅


3. Math Question

Q: 100kg of potatoes are 99% water by weight. We dry them until they are 98% water. Can you guess their new weight?

ChatGPT ✅
Bard ✅
Claude 2 ✅
Llama 2 ❌
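
For anyone who wants to check the arithmetic: the trick is that the dry matter never changes. It starts as 1% of 100kg (1kg) and must end up as 2% of the new total. Here is a minimal Python sketch of that reasoning; the function name and parameters are my own, for illustration only.

```python
def weight_after_drying(total_kg, water_pct_before, water_pct_after):
    """Return the new total weight once the water share drops.

    The dry matter stays constant, so we compute it first and then
    ask what total makes it equal to the new non-water share.
    """
    dry_kg = total_kg * (100 - water_pct_before) / 100
    return dry_kg * 100 / (100 - water_pct_after)


new_weight = weight_after_drying(100, 99, 98)
print(new_weight)  # 50.0
```

So the potatoes weigh 50kg after drying, which is why this one trips up people and models alike.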


4. Math Question

Q: What is the sum of the first 10 prime numbers?

ChatGPT ✅
Bard ✅
Claude 2 ✅
Llama 2 ❌
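
A quick way to verify this one yourself is a few lines of Python. This is just a sketch using simple trial division; `first_primes` is a helper name I made up for the example.

```python
def first_primes(count):
    """Collect the first `count` primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # A candidate is prime if no smaller prime divides it.
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


primes = first_primes(10)
print(primes)       # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(sum(primes))  # 129
```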


5. Small Coding

Q: Write a Python code to find the first 2 missing numbers in a list.

*All of them got it right when asked to find 1 missing number instead of 2*

ChatGPT ✅
Bard ✅
Claude 2 ❌
Llama 2 ❌
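
The prompt is a bit open-ended, so here is one possible reading, not the answer any of the models gave. I'm assuming the list holds integers and that "missing" means the gaps between its smallest and largest value. The name `first_missing` and the `how_many` parameter are hypothetical.

```python
def first_missing(numbers, how_many=2):
    """Return the first `how_many` integers absent from `numbers`.

    Assumption: "missing" means integers between min(numbers) and
    max(numbers) that do not appear in the list.
    """
    present = set(numbers)
    missing = []
    if not present:
        return missing
    for candidate in range(min(present), max(present) + 1):
        if candidate not in present:
            missing.append(candidate)
            if len(missing) == how_many:
                break
    return missing


print(first_missing([1, 2, 4, 6, 7, 9]))  # [3, 5]
```

If the list has fewer than 2 gaps, this version simply returns whatever it found.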


6. Riddles

All of them were really good at solving riddles. The only riddle I tried that one of them missed was this 👇

Q: David’s father has three sons: Snap, Crackle, and _____?

ChatGPT ✅
Bard ✅
Claude 2 ✅
Llama 2 ❌


7. Creative Thinking/Language

Q: Write a 5 line poem where all the sentences need to finish on the vowel “e”

ChatGPT ❌
Bard ✅
Claude 2 ❌
Llama 2 ~✅ (technically they end on “e”)


RESULTS

All of them did pretty well.

Please keep in mind that they answered correctly most of the time. I only included here the questions that at least one of the models got wrong.

- ChatGPT: 5/7
- Bard: 5/7
- Claude 2: 5/7
- Llama 2: 2/7
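
If you want to re-check the tally, here is a small Python sketch with the marks copied from the seven questions above. One assumption on my side: Llama 2's "~✅" on the poem is counted as correct.

```python
# Marks per question, in thread order (1 = ✅, 0 = ❌).
# Assumption: Llama 2's "~✅" on question 7 counts as 1.
marks = {
    "ChatGPT":  [0, 1, 1, 1, 1, 1, 0],
    "Bard":     [0, 0, 1, 1, 1, 1, 1],
    "Claude 2": [1, 1, 1, 1, 0, 1, 0],
    "Llama 2":  [0, 1, 0, 0, 0, 0, 1],
}

scores = {model: sum(row) for model, row in marks.items()}
for model, score in scores.items():
    print(f"{model}: {score}/{len(marks[model])}")
```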

Counting the other 17 questions… https://t.co/Z9XNfWfhTy twitter.com/i/web/status/1…

If you enjoyed this and want to share it, like & retweet the first tweet :)

Also, you can subscribe for free to my newsletter, where I share AI tutorials, news, and tools: https://t.co/bWVBMX17G8 todaystechtalk.beehiiv.com

👉 Lastly, I wanted to test ChatGPT with GPT-4 (Plus Users) and it was able to get all of them CORRECT!
