Alex Vacca
Sep 30 · 16 tweets · 5 min read
8 Google engineers wrote the paper that every AI company now uses as their bible. OpenAI built GPT on it, Anthropic built Claude on it, and Meta built LLaMA on it.

Every LLM worth billions uses this paper's transformer architecture as the foundation...
Before 2017, teaching computers human language was torture.
AI would read text like humans reading through a keyhole - one word at a time.

These models were slow, forgot context, and choked on long passages.
Then 8 researchers decided to flip things around...
In 2017 they published a paper titled "Attention Is All You Need."

The idea was simple: Instead of reading word by word, why not look at everything at once? Like how you can glance at a page and immediately see which words relate to each other.

They called it a Transformer.
An example: "The bank by the river bank was full of cash."

Old AI would get confused. Two banks?

Transformers see everything at once. "Bank" near "river" = riverbank. "Bank" near "cash" = financial institution.

One formula makes this work & it's worth more than most countries.
Attention(Q, K, V) = softmax(QK^T / √d_k) V

That's it. This equation alone created trillions in AI market value.

Every word calculates relevance with every other word. "Apple" + "stock" = company. "Apple" + "pie" = fruit.
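
Here is a minimal sketch of that formula in plain Python, so you can see the moving parts. The 2-d word vectors are hand-picked toy numbers (an assumption for illustration, not learned embeddings), and Q, K and V are all set to the same vectors to keep it short.

```python
import math

def softmax(xs):
    """Turn a list of scores into weights that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Q, K, V are lists of vectors (one per word), as plain lists of floats.
    Returns (outputs, weights); weights[i][j] is how much word i attends to word j.
    """
    d_k = len(K[0])
    outputs, weights = [], []
    for q in Q:
        # Relevance of this word to every word: dot product, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        # The output is the weighted average of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, V)) for c in range(len(V[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights

# Toy, hand-picked vectors (illustrative only).
words = ["apple", "stock", "pie"]
X = [[1.0, 1.0], [2.0, 0.0], [0.0, 2.0]]
outputs, weights = attention(X, X, X)
for word, row in zip(words, weights):
    print(word, [round(w, 2) for w in row])
```

Each printed row shows how one word spreads its attention across all three words. In a real model, Q, K and V come from learned projections of the word vectors.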

But they didn't stop at one attention mechanism.
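Eight attention heads ran in parallel. Nobody assigns them roles; each learns its own pattern during training.

One might track grammar
Another might find subject-verb connections
A third might link pronouns to the nouns they refer to
The other five catch different meaning patterns. All simultaneously.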
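
A rough sketch of the parallel-heads idea, in plain Python. The sizes (5 words, 16-dimensional vectors, 8 heads) and the random projection matrices are assumptions for illustration; in a trained model those matrices are learned, which is where the different "specialties" come from.

```python
import math
import random

def matmul(A, B):
    """Multiply matrix A (n x m) by matrix B (m x p), both lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for a single head."""
    d_k = len(K[0])
    K_T = [list(col) for col in zip(*K)]
    scores = matmul(Q, K_T)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, V)

def random_matrix(rows, cols, rng):
    """Stand-in for a learned projection matrix (random here, trained in a real model)."""
    return [[rng.gauss(0, 1) / math.sqrt(rows) for _ in range(cols)] for _ in range(rows)]

def multi_head_attention(X, num_heads, rng):
    d_model = len(X[0])
    d_head = d_model // num_heads
    head_outputs = []
    for _ in range(num_heads):
        # Each head gets its own projections, so it can focus on different relations.
        W_q = random_matrix(d_model, d_head, rng)
        W_k = random_matrix(d_model, d_head, rng)
        W_v = random_matrix(d_model, d_head, rng)
        head_outputs.append(attention(matmul(X, W_q), matmul(X, W_k), matmul(X, W_v)))
    # Concatenate the heads word by word, then mix them with an output projection.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(X))]
    W_o = random_matrix(d_model, d_model, rng)
    return matmul(concat, W_o)

rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(5)]  # 5 words, d_model = 16
Y = multi_head_attention(X, num_heads=8, rng=rng)
print(len(Y), len(Y[0]))  # one output vector per word, same width as the input
```

The output has the same shape as the input, which is what lets these layers be stacked on top of each other.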

When tested on English-to-German translation, it broke the record.
Previous best translation models: about 26.3 BLEU, weeks to train
Their Transformer: 28.4 BLEU, just 3.5 days on 8 GPUs

A 2-point jump is like going from dial-up to broadband. And it came at a fraction of the training cost.

But OpenAI saw something in those pages that even Google missed.
OpenAI made one surgical change that created ChatGPT.

The original Transformer had an encoder (understands text) and a decoder (generates text). OpenAI threw away the encoder entirely. Just kept the decoder.

Why would removing half the system make it better?
The encoder-decoder setup needs paired data - an English sentence and its German translation.
A decoder on its own only needs raw text, potentially the entire internet.

Just predict the next word. No translation pairs needed.

OpenAI turned Google's translation machine into a universal intelligence engine.
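
To make "just predict the next word" concrete, here is a small sketch of how raw text becomes training examples with no labels, plus the causal mask that stops each position from peeking at later words. Splitting on spaces is a simplification; real models work on sub-word tokens. The function names are made up for this example.

```python
def next_word_examples(text):
    """Turn raw text into (context, next word) training pairs. No labels required."""
    words = text.split()
    return [(words[:i], words[i]) for i in range(1, len(words))]

def causal_mask(n):
    """mask[i][j] is True when position i may look at position j (itself and earlier only)."""
    return [[j <= i for j in range(n)] for i in range(n)]

text = "the cat sat on the mat"
for context, target in next_word_examples(text):
    print(" ".join(context), "->", target)

# Each row is one position; "x" marks the positions it is allowed to attend to.
for row in causal_mask(4):
    print(" ".join("x" if allowed else "." for allowed in row))
```

Every sentence yields many training examples for free, which is why raw internet text is enough.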
Anthropic took transformers and made them "safe." First, they had Claude critique its own outputs.

"Am I being harmful? Biased? Lying?"
The AI argues with itself about ethics before answering you.

They called it Constitutional AI. But that wasn't enough.
Alongside it came RLHF (reinforcement learning from human feedback) - humans rating Claude's responses.

Do this millions of times. The transformer learns what humans actually want.
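
One common way to turn human ratings into a training signal is a pairwise preference loss for a reward model. The sketch below shows that general idea only; it is not a description of Anthropic's actual pipeline, and the scores are made-up numbers.

```python
import math

def preference_loss(score_chosen, score_rejected):
    """Pairwise loss: -log(sigmoid(chosen - rejected)).

    Small when the reward model scores the human-preferred response higher.
    """
    diff = score_chosen - score_rejected
    return math.log(1.0 + math.exp(-diff))

# Hypothetical reward-model scores for two responses to the same prompt.
agree = preference_loss(2.0, -1.0)     # model ranks the preferred response higher
disagree = preference_loss(-1.0, 2.0)  # model ranks it lower, so the loss is larger
print(round(agree, 3), round(disagree, 3))
```

Minimizing this loss over many human comparisons nudges the reward model toward the raters' preferences, and the language model is then tuned against that reward.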

Same architecture underneath. But Meta went even further.
Meta spent millions training LLaMA, with supercomputers running 24/7 for months.

Then they released the actual AI brain - the weight files that are the model. Small (7B), medium (13B), large (70B) versions.
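
A quick back-of-the-envelope calculation shows what those sizes mean on your own machine. This sketch counts the weights only (no activations or cache), and the 16-bit and 4-bit precisions are assumptions chosen for illustration.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

sizes = {"7B": 7e9, "13B": 13e9, "70B": 70e9}
for name, params in sizes.items():
    fp16 = weight_memory_gb(params, 16)  # 16-bit weights
    int4 = weight_memory_gb(params, 4)   # 4-bit quantized weights (an assumption)
    print(f"{name}: ~{fp16:.1f} GB at 16-bit, ~{int4:.1f} GB at 4-bit")
```

By this estimate the 7B version fits in laptop memory once quantized, while the 70B version needs far more than most laptops have.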

You could run the smaller versions locally on your own laptop. But why give away $100M models?
Zuck's play: Let 100,000 developers improve LLaMA. They debug it, optimize it and build tools. Meta gets all innovations back.

While Google and OpenAI charge fees, Meta built an army of unpaid developers. Genius move? I don't know.
Today, transformers power everything:

ChatGPT: Decoder-only transformer
Claude: Transformer-based language model
DALL-E: Image-generating transformer (original version)
Copilot: Transformer trained on code

Same architecture. Different products.
Thanks for making it to the end!

I'm Alex, co-founder at ColdIQ. Built a $6M ARR business in under 2 years. We're a remote team across 10 countries, helping 400+ businesses.

Here's how I make $450k+ every month with AI:
tinyurl.com/5n79rd5w
RT the first tweet if you found this thread valuable.

Follow me @itsalexvacca for more threads on outbound and GTM strategy, AI-powered sales systems, and how to build profitable businesses that don't depend on you.

I share what worked (and what didn't) in real time.

