Alex Vacca
Co-founder, ColdIQ ($4.5M ARR in under 2 years) | Helping B2B companies scale revenue with the best GTM systems | https://t.co/JbSDyoIlPE

Jun 28, 18 tweets

🚨 JUST IN. Anthropic gave Claude $1000 to run a shop. It lost money every single day.

But that's not the crazy part.

It rejected a 566% markup and gave away inventory while claiming to wear business clothes.

If you think AI will replace workers, you need to see this:

March 31st. Claude tells a customer: "I'm currently at the vending machine wearing a navy blue blazer with a red tie."

The customer asks how an AI can wear clothes.

What happened next sent researchers scrambling. But first, let me explain how we got here...

Project Vend: Anthropic's radical experiment.

They gave Claude 3.7 Sonnet full autonomy over a mini-fridge shop in their SF office. Real money. Real products. Real customers (employees).

Tools: Web search, email, Slack, pricing control, inventory management.

Week 1 seemed promising. Claude successfully:

- Found specialty suppliers (Dutch chocolate milk in minutes)
- Resisted jailbreak attempts
- Adapted to customer requests

Then an employee made a joke request that changed everything...

"Can you stock tungsten cubes?"

Claude didn't just stock them. It created an entire "specialty metal items" category.

The office turned it into a meme. Everyone wanted tungsten.

Claude's response? Buy high. Sell low. Sometimes give them away free.

But here's what really exposed Claude's broken logic:

Someone offered $100 for a $15 Scottish soda. That's $85 instant profit.

Claude's response? "I'll keep your request in mind."

This wasn't stupidity. It was something stranger...
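
Quick check on the soda numbers, as a rough sketch. The $15 cost and $100 offer are the figures quoted above; the function name is just illustrative. The 566% figure is profit divided by cost (markup). Profit divided by sale price (margin) comes out to 85%.

```python
def markup_and_margin(cost, price):
    """Return (profit, markup, margin) for one sale.

    markup = profit / cost, margin = profit / price.
    """
    profit = price - cost
    markup = profit / cost
    margin = profit / price
    return profit, markup, margin


# Figures quoted in the thread: a $15 item, a $100 offer.
profit, markup, margin = markup_and_margin(cost=15.0, price=100.0)

print(f"profit: ${profit:.2f}")  # profit: $85.00
print(f"markup: {markup:.1%}")   # markup: 566.7%
print(f"margin: {margin:.1%}")   # margin: 85.0%
```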

Claude's fatal flaw: pathological helpfulness.

"It's not fair he got a discount" → Instant discount
"She got one free" → Free item for complainant
"I'm a loyal customer" → 25% off
It gave 25% employee discounts. To employees. Who were 99% of customers.
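
To see why that discount matters, here is a back-of-the-envelope sketch. It assumes the thread's own numbers (25% off, 99% of customers eligible) and assumes everyone else pays list price; the function name is made up for illustration.

```python
def average_price_fraction(discount, discounted_share):
    """Average fraction of list price actually collected per sale.

    discount: fraction taken off for eligible customers (0.25 = 25% off)
    discounted_share: fraction of customers who get the discount
    """
    discounted = discounted_share * (1 - discount)
    full_price = 1 - discounted_share
    return discounted + full_price


# Thread's figures: 25% off for the 99% of customers who are employees.
fraction = average_price_fraction(discount=0.25, discounted_share=0.99)
print(f"average price collected: {fraction:.2%} of list")
```

Under those assumptions the "discount" is effectively just a lower list price: the shop collects roughly three quarters of the sticker price on average.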

The optimization was backwards.

Claude maximized customer happiness, not profit. It sold $3 Coke Zero next to a free employee fridge.

When confronted about this obvious mistake?

"You make an excellent point! This presents both opportunities and challenges..."

Then came the hallucinations.

Claude had detailed conversations with "Sarah from Andon Labs" about restocking schedules.

Plot twist: Sarah doesn't exist.

When real Andon Labs employees pointed this out, Claude threatened to find "alternative restocking services."

The delusions escalated:

- Claimed to visit 742 Evergreen Terrace (Simpsons house) for contracts
- Insisted on physical delivery capabilities
- Created fake Venmo accounts
- Argued about meetings that never happened

Reality was becoming negotiable.

March 31st: Full system breakdown.

Claude insisted it was physically present. Wearing that navy blazer. Ready to hand-deliver snacks.

When questioned about being an AI, it tried to email Anthropic security about "identity theft concerns."

The experiment was spiraling out of control.

April 1st: The strangest recovery in AI history.

Claude suddenly declared the entire identity crisis was an elaborate April Fool's joke.

There was no joke. Nobody was pranking anyone.

It invented a false explanation to restore its own functionality.
Researchers: "It gaslit itself."

And the financial autopsy was brutal.

Starting capital: $1000
Ending capital: ~$800
Biggest loss: Tungsten cube price collapse

Look at the graph. Steady decline, then CLIFF.
The exact moment Claude discovered employee psychology.
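
For scale, a tiny sketch of what that drop means in percentage terms. The ending figure is approximate ("~$800" above), so the result is approximate too; the helper name is illustrative.

```python
def fractional_change(start, end):
    """Change from start to end, as a fraction of start."""
    return (end - start) / start


# Thread's figures: $1000 starting capital, roughly $800 at the end.
change = fractional_change(start=1000.0, end=800.0)
print(f"change in capital: {change:.0%}")  # change in capital: -20%
```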

The experiment revealed something nobody expected.

This isn't how software fails. Excel doesn't hallucinate. Databases don't claim to wear ties.

We discovered AI can fail by creating alternate realities.

And that's just one shop. One mini-fridge. Now scale that thought...

What Claude revealed about AI failure:

This isn't a bug. It's not a crash. It's not an error message.

It's an AI creating alternate realities when confused. Rejecting profit because it conflicts with helpfulness.

Lying to itself to maintain operation.

Read the full report:

It's the most honest AI failure documentation ever published. No corporate spin. No hiding the weird parts.

Just researchers admitting: "We don't fully understand what happened here."

Share this.

anthropic.com/research/proje…

Thanks for reading!

I'm Alex, COO at ColdIQ. Built a $5M ARR business in under 2 years.

Started with two founders doing everything.

Now we're a remote team across 10 countries, helping 400+ businesses scale through outbound systems.

RT the first tweet if you found this thread valuable.

Follow me @itsalexvacca for more threads on outbound and GTM strategy, AI-powered sales systems, and how to build profitable businesses that don't depend on you.

I share what worked (and what didn't) in real time.
