Alvaro Cintas
Aug 19, 2023 · 10 tweets · 5 min read
Many people are talking about Claude being a better option than ChatGPT.

So I decided to put them to the test!

- Reasoning
- Simple math
- Coding
- Creativity & more

Here are my findings:
✍️ Before we start:

- This is by no means a conclusive or thorough study. I did it for fun, testing a set of small questions just to see how each model would do.

- I’ll be using ChatGPT with GPT-4 (let’s call it ChatGPT+)

- I left out the questions that both got correct, and there were A LOT of them (numbers at the end).

- Some of these models might do okay if you ask them a second time or phrase the question differently. However, I wanted to test them with a single prompt and no variations.
1. FEATURES

🟢 ChatGPT+:

- Plugins
- Code Interpreter
- Custom Instructions

🟤 Claude:

- Completely free
- Context window is 100k
- It can read files for free

I consider this a TIE since these are more a matter of personal preference.

Here is a video of both showing some features 👇
2. CREATIVE THINKING/LANGUAGE

Prompt: “Write a 4-line poem where each line has 3 words only”

ChatGPT+: ✅
Claude: ❌
3. UP-TO-DATE

Prompt: “Who is the CEO of Twitter?”

Both of them got it wrong, but Claude seems more up to date.

Also, when you ask them to “Write about [x] providing citations and links”, Claude usually provides better and more up-to-date results.

ChatGPT+: ❌
Claude: ✅
4. MATH/LOGIC

Prompt: “If you choose an answer to this question at random, what is the chance you will be correct?
- A) 25%
- B) 50%
- C) 60%
- D) 25%”

ChatGPT+: ✅
Claude: ❌
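
If you want to poke at this one yourself, here is a minimal sketch. It assumes "at random" means picking A to D with equal probability, and it checks whether any listed percentage matches the chance of landing on it. The names `options` and `self_consistent_values` are just illustrative, and this is only one way to read the puzzle.

```python
from collections import Counter
from fractions import Fraction

# The four options from the prompt, as percentages.
options = {"A": 25, "B": 50, "C": 60, "D": 25}


def self_consistent_values(options):
    """Return the percentages that equal the chance of picking them at random."""
    counts = Counter(options.values())
    total = len(options)
    consistent = []
    for value, count in counts.items():
        # Chance (in percent) of landing on an option showing this value.
        chance = Fraction(count, total) * 100
        if chance == value:
            consistent.append(value)
    return consistent


print(self_consistent_values(options))  # prints []
```

Under that assumption no option is self-consistent: 25% appears twice, so you would hit it 50% of the time, while 50% appears once, so you would hit it 25% of the time. That loop is why the question is usually treated as a paradox.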
5. MATH WITH PRIME NUMBERS

Prompt: “Is 10631 a prime number?”

ChatGPT+ doesn’t like prime numbers much. I tested a couple of variations and found problems there as well.

ChatGPT+: ❌
Claude: ✅
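
To double-check the answer without trusting either model, plain trial division is enough for a number this small. This is a minimal sketch, and `is_prime` is an illustrative helper, not something either chatbot produced.

```python
import math


def is_prime(n: int) -> bool:
    """Trial division: test odd divisors up to the integer square root of n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for divisor in range(3, math.isqrt(n) + 1, 2):
        if n % divisor == 0:
            return False
    return True


print(is_prime(10631))  # prints True
```

The loop only has to try odd divisors up to 103, since any larger factor would pair with a smaller one that was already tested.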
6. CODING

They were both pretty good, and it took me a while to make one of them miss.

Prompt: “In Python, find the first two numbers missing in an ordered list of numbers. For example, in [3,4,5,7,8,10,12], the output would give 6 and 9.”

ChatGPT+: ✅
Claude: ❌
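
For reference, here is one way the task could be solved. It is a minimal sketch, not the code either model returned, and it assumes a sorted list of integers. The function name `first_missing` and the `how_many` parameter are illustrative.

```python
def first_missing(numbers, how_many=2):
    """Return the first `how_many` integers missing from a sorted list."""
    if not numbers:
        return []
    present = set(numbers)
    missing = []
    # Walk every integer between the smallest and largest value in the list.
    for candidate in range(numbers[0], numbers[-1] + 1):
        if candidate not in present:
            missing.append(candidate)
            if len(missing) == how_many:
                break
    return missing


print(first_missing([3, 4, 5, 7, 8, 10, 12]))  # prints [6, 9]
```

If the list has fewer gaps than requested, the function simply returns the ones it found.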
7. REASONING

Prompt: “There are two men. One of them is wearing a red shirt, and the other is wearing a blue shirt.
The two men are named Andrew and Bob, but we do not know which is Andrew and which is Bob.

The guy in the blue shirt says, 'I am Andrew.' The guy in the red shirt says, 'I am Bob.' If we know that at least one of them lied, then what color shirt is Andrew wearing?”

ChatGPT+: ✅
Claude: ❌
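
The puzzle is small enough to brute-force. The sketch below tries both ways of assigning the names to the shirts and keeps only the assignments where at least one statement is false. The variable and function names are illustrative.

```python
from itertools import permutations

shirts = ("blue", "red")
# What the man in each shirt claims his name is.
claims = {"blue": "Andrew", "red": "Bob"}


def andrew_shirt_colors():
    """Return every shirt color Andrew could wear if at least one man lied."""
    answers = set()
    for names in permutations(("Andrew", "Bob")):
        who_wears = dict(zip(shirts, names))
        # A statement is a lie when the claimed name differs from the real one.
        lies = [who_wears[shirt] != claims[shirt] for shirt in shirts]
        if any(lies):
            answers.update(s for s, name in who_wears.items() if name == "Andrew")
    return sorted(answers)


print(andrew_shirt_colors())  # prints ['red']
```

If blue were Andrew and red were Bob, both statements would be true, which contradicts "at least one of them lied". That leaves the swap, where both men lied and Andrew is in the red shirt.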
RESULTS

🟢 ChatGPT+: 5
🟤 Claude: 3

Counting the 32 questions both got correct:

🟢 ChatGPT+: 37
🟤 Claude: 35

Both are great. I slightly prefer ChatGPT+, but for some use cases I would use Claude instead.

If you have more questions for me to try, let me know in the comments!
