Tommy
May 28, 2023 · 22 tweets · 9 min read
Stephen Wolfram of Wolfram Alpha wrote the absolute best post on ChatGPT and Large Language Models.

It took me about two hours to read, but significantly increased my understanding of what's going on under the hood of ChatGPT.

A few of my favorite takeaways (helps my process)
The goal of a large language model is to reasonably continue the text it already has.

ChatGPT's LLM estimates the probability of each possible next word and picks from those.

Temperature is a parameter that determines how often lower-ranked words are used, adding randomness.

LLMs are trained on vast amounts of human text.
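To make the temperature idea concrete, here's a minimal sketch of my own (made-up words and scores, not anything from ChatGPT's internals) of how temperature reshapes next-word probabilities before one is picked:

```python
import numpy as np

def sample_next_word(words, logits, temperature=0.8):
    """Pick the next word from model scores, with temperature controlling randomness."""
    # Lower temperature sharpens the distribution (favors top-ranked words);
    # higher temperature flattens it (lower-ranked words get picked more often).
    scaled = np.array(logits) / max(temperature, 1e-8)
    probs = np.exp(scaled - scaled.max())   # softmax, numerically stable
    probs /= probs.sum()
    return np.random.choice(words, p=probs)

# Toy scores for continuing "The best thing about AI is its ability to ..."
words  = ["learn", "predict", "make", "understand", "do"]
logits = [4.5, 3.2, 2.5, 3.1, 1.8]          # made-up model scores
print(sample_next_word(words, logits, temperature=0.8))
```

Run it a few times: at temperature 0.2 you almost always get "learn"; at 2.0 the lower-ranked words show up far more often.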
Where do these probabilities come from?

ChatGPT is a model that estimates the probabilities with which sequences of words should occur.

Stephen adds an interesting walkthrough demonstrating how often single letters occur, then pairs of letters, and beyond.
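In that spirit, a tiny example of my own (not Stephen's code) counting letter and letter-pair frequencies in a scrap of text, the same kind of tally he builds up from:

```python
from collections import Counter

text = "this is a tiny sample of english text used to count letters and letter pairs"
letters = [c for c in text if c.isalpha()]

# Single-letter probabilities
letter_counts = Counter(letters)
total = sum(letter_counts.values())
letter_probs = {c: n / total for c, n in letter_counts.most_common(5)}

# Letter-pair (bigram) probabilities
pairs = [a + b for a, b in zip(letters, letters[1:])]
pair_counts = Counter(pairs)
pair_probs = {p: n / len(pairs) for p, n in pair_counts.most_common(5)}

print(letter_probs)
print(pair_probs)
```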
Neural nets are loosely modeled on the human brain.

The brain has ~100B neurons, each connected to ~1,000 other neurons.

A neuron fires depending on the pulses it gets from the other neurons it's connected to, each with its own connections.

Those connection strengths are the analog of the weights in the model. Voila!
Neural Net Explanation

- Neurons arranged in layers
- Each connection into a neuron carries a weight (significance)
- ML is used to find those weights in the first place
- Each neuron evaluates a numerical function of its inputs
- Input is fed in, neurons at each layer evaluate it and feed results to the next layer
- The end result comes out of the final layer (rough sketch below)
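A rough sketch of that layered evaluation, assuming a made-up 2 -> 3 -> 1 network with random, untrained weights:

```python
import numpy as np

def layer(x, weights, biases):
    """One layer: weighted sum of inputs per neuron, then a nonlinearity."""
    return np.tanh(weights @ x + biases)

rng = np.random.default_rng(0)

# Tiny 2 -> 3 -> 1 network with random (untrained) weights
w1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)
w2, b2 = rng.normal(size=(1, 3)), rng.normal(size=1)

x = np.array([0.5, -1.0])        # input point
hidden = layer(x, w1, b1)        # first layer evaluates and feeds results forward
output = layer(hidden, w2, b2)   # end result pops out of the last layer
print(output)
```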
Larger networks do better at landing on the right result.

In the image Stephen shares, the goal is to take in a point and classify it into one of three regions.

I laughed when Stephen said at the boundaries it has trouble "making up its mind". Much human.

Unsure results could be dangerous.
Training Neural Nets

The goal is to feed in a zillion examples and find weights that reproduce those examples.

Every time an example is used, the weights are adjusted throughout the model.

Training is really expensive and computationally intensive.
Now how are the weights adjusted?

Stephen describes that the model uses a loss function.

The goal of adjusting the weights is to reduce the loss function, i.e. how far the output is from the intended result in the examples.

More data, lower loss. (A toy version of the whole loop is sketched below.)
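Here's that toy version (plain gradient descent on a two-parameter model I made up, nowhere near ChatGPT's scale): feed examples, measure how far off the outputs are, nudge the weights to shrink that loss:

```python
import numpy as np

# Toy examples: learn y = 2x + 1 from a handful of (input, intended result) pairs
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = 2 * xs + 1

w, b = 0.0, 0.0          # the "weights" we will adjust
lr = 0.05                # how big each adjustment is

for step in range(500):
    pred = w * xs + b
    loss = np.mean((pred - ys) ** 2)   # loss: how far outputs are from the examples
    # Gradients say which direction to nudge each weight to reduce the loss
    dw = np.mean(2 * (pred - ys) * xs)
    db = np.mean(2 * (pred - ys))
    w -= lr * dw
    b -= lr * db

print(round(w, 2), round(b, 2), round(loss, 4))   # approaches w=2, b=1, loss near 0
```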
One of the most counterintuitive takeaways is that with neural nets it's easier to solve more complicated problems than simpler ones.

That's good too since I'm dumb and need help with the complicated problems in life.

I'll let Stephen take it from here:
ChatGPT has an easier time training since it can do "unsupervised learning"

- Take any piece of text and mask off the end
- Ask the model to predict how the text continues
- The actual continuation is the label, so the complete piece of text is its own training example

TL;DR it's much easier to get training examples this way (see the sketch below)
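A minimal sketch of why the examples are "free" (my own illustration, splitting on whole words rather than how any real pipeline tokenizes): mask the end of any passage and the real continuation becomes the label:

```python
def make_training_pairs(text, prompt_len=5):
    """Turn raw text into (prompt, next-word) training pairs, no human labeling needed."""
    words = text.split()
    pairs = []
    for i in range(prompt_len, len(words)):
        prompt = " ".join(words[:i])   # text with the end masked off
        target = words[i]              # the actual next word is the label
        pairs.append((prompt, target))
    return pairs

sample = "the cat sat on the mat because it was warm"
for prompt, target in make_training_pairs(sample)[:3]:
    print(f"{prompt!r} -> {target!r}")
```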
Summing this all up, Stephen shares an image showing the training process for a neural net and how the loss function should decrease over time.

If the loss eventually levels off, yay, you have a solid model.

If it doesn't, you can't rely on it and it's time to change the architecture.
ChatGPT is often extrapolated as a path to Terminators.

Stephen counters that the magic of LLMs writing well doesn't mean the task was that hard.

We're not closer to Terminators; writing essays just isn't as hard as we think.

@stephen_wolfram plz share more on NNs replacing humans (pic 2)
Embeddings

Embeddings represent words as arrays of numbers, laid out so that words commonly associated with each other end up close together.

Those associations are found from vast amounts of text.

Embeddings give ChatGPT a more natural feel since words that are commonly associated with each other can be used in place of one another.
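A toy illustration (made-up 3-dimensional vectors; real embeddings have hundreds or thousands of dimensions) of how "commonly associated" shows up as nearby vectors:

```python
import numpy as np

# Toy embeddings: words used in similar contexts end up with nearby vectors.
embeddings = {
    "cat":   np.array([0.9, 0.1, 0.0]),
    "dog":   np.array([0.8, 0.2, 0.1]),
    "piano": np.array([0.0, 0.9, 0.4]),
}

def similarity(a, b):
    """Cosine similarity: close to 1.0 means pointing the same way in meaning space."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(similarity(embeddings["cat"], embeddings["dog"]))    # high: commonly associated
print(similarity(embeddings["cat"], embeddings["piano"]))  # low: rarely associated
```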
Onto ChatGPT!

ChatGPT is a giant neural net focused on language. GPT-3 has 175B parameters; the 100-trillion figure sometimes quoted for its successor is a rumor, not a confirmed number.

For scale, the brain has ~100B neurons and on the order of 100 trillion synapses. Woof.

The most important feature of the architecture is the Transformer.
An interesting side note for the Crypto audience.

Crypto's own @ilblackdragon, the co-founder of @NEARProtocol, is one of the authors on the original Transformers Paper

arxiv.org/abs/1706.03762

First let's recap ChatGPT's process:
ChatGPT's Process

- Takes in text
- Finds embeddings (arrays of numbers representing the text)
- Processes them (the values pass through the layers of the neural net)
- A new embedding is produced (a new array of numbers)
- From that array it generates ~50,000 values, one for each possible next token
- The highest-probability tokens produce the text (I think); rough sketch below
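A rough sketch of that loop with a stand-in model and a six-word vocabulary (the function names and scores are mine, not OpenAI's API):

```python
import numpy as np

VOCAB = ["the", "cat", "sat", "on", "mat", "."]   # real GPT vocab is ~50,000 tokens
rng = np.random.default_rng(0)

def model(token_ids):
    """Stand-in for the neural net: returns one score per vocab entry."""
    return rng.normal(size=len(VOCAB))

def generate(prompt_ids, n_tokens=5, temperature=0.8):
    ids = list(prompt_ids)
    for _ in range(n_tokens):
        logits = model(ids)                        # process through the net's layers
        probs = np.exp(logits / temperature)
        probs /= probs.sum()                       # probability for every possible next token
        next_id = rng.choice(len(VOCAB), p=probs)  # usually sampled, not always the top pick
        ids.append(next_id)
    return " ".join(VOCAB[i] for i in ids)

print(generate([0, 1]))   # start from "the cat"
```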
Transformers are a breakthrough for LLMs.

An analogy: they let the model grasp the context of words and the relationships between words that are far apart.

Transformers can attend to all of the text at once instead of one word at a time, which makes them much more efficient and scalable.

Thanks ChatGPT!
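For the curious, a bare-bones version of the attention step at the heart of a Transformer (scaled dot-product attention over random vectors; no learned weights, just the mechanism that lets every token look at every other token at once):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: every token scores every other token at once."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # how relevant each token is to each other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax per token
    return weights @ V                              # mix of values, weighted by relevance

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                             # 4 tokens, 8-dim embeddings
x = rng.normal(size=(seq_len, d_model))
out = attention(x, x, x)                            # self-attention over the whole sequence
print(out.shape)                                    # (4, 8): one updated vector per token
```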
Meaning Space

Stephen shares that in ChatGPT, text is represented by an array of numbers in a "meaning space".

He goes on to describe that the trajectory of what words come next doesn't follow anything like a clean mathematical or physics-style law we can hang our hats on.
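A small sketch of the "trajectory through meaning space" idea, using made-up 8-dimensional embeddings and projecting the path down to 2D so it can be eyeballed:

```python
import numpy as np

# Toy "meaning space": each word is an array of numbers; a piece of text
# traces a trajectory through that space, one point per word.
rng = np.random.default_rng(0)
words = ["the", "cat", "sat", "on", "the", "mat"]
vectors = {w: rng.normal(size=8) for w in set(words)}   # 8-dim toy embeddings

trajectory = np.array([vectors[w] for w in words])

# Project the high-dimensional trajectory down to 2D to "see" it
centered = trajectory - trajectory.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
points_2d = centered @ vt[:2].T

for w, (a, b) in zip(words, points_2d):
    print(f"{w:>4}: ({a:+.2f}, {b:+.2f})")
```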
So is ChatGPT similar to a human brain?

His conclusions:

- The neural net architecture may be similar
- Training an LLM is way less efficient than how a human brain learns
- ChatGPT has no loops to go back and recompute data the way humans can, which severely limits its computational capability
I am not an AI researcher, but the post made me realize LLMs are nowhere near the AGI or Terminator-level intelligence some fear.

Of course it's on the path, but LLMs are probabilistic models focused on continuing sentences.

They are really good at it, but not AGI (yet).
I think it's incredibly cool that a gigabrain like @stephen_wolfram would open-source his thinking on ChatGPT.

This has been the single best resource I've found so far for learning about @OpenAI's ChatGPT, LLMs, and neural nets.

Disc. I def got things wrong

writings.stephenwolfram.com/2023/02/what-i…
Also @stephen_wolfram if you're ever interested in a long form podcast to walk through your thoughts, we'd love to host you on @Delphi_Digital's podcast!
