Tommy
May 28 · 22 tweets · 9 min read
Stephen Wolfram of Wolfram Alpha wrote the absolute best post on ChatGPT and Large Language Models.

It took me about two hours to read, but significantly increased my understanding of what's going on under the hood of ChatGPT.

A few of my favorite takeaways (writing them out helps my process):
The goal of a large language model is to reasonably continue the text it already has.

ChatGPT's LLM estimates the probability of each word that could come next.

Temperature is a parameter that determines how often lower-ranked words are used, adding randomness.

LLMs are trained on vast amounts of human-written text.
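To make temperature concrete, here's a minimal sketch of sampling the next word (my own toy example with made-up scores, not code from Stephen's post):

```python
import numpy as np

def sample_next_word(words, scores, temperature=0.8):
    """Lower temperature -> almost always the top-ranked word.
    Higher temperature -> lower-ranked words show up more often."""
    scaled = np.array(scores) / temperature
    probs = np.exp(scaled - scaled.max())   # softmax, numerically stable
    probs /= probs.sum()
    return np.random.choice(words, p=probs)

# Hypothetical next-word scores after "The cat sat on the"
words  = ["mat", "floor", "moon"]
scores = [3.0, 2.0, 0.1]
print(sample_next_word(words, scores, temperature=0.2))  # almost always "mat"
print(sample_next_word(words, scores, temperature=1.5))  # more variety
```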
Where do these probabilities come from?

ChatGPT is built on a model that estimates the probabilities with which sequences of words should occur.

Stephen adds an interesting walkthrough showing how often individual letters occur, then pairs of letters, and beyond.
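Here's a rough Python version of that idea (Stephen does it in Wolfram Language over a real corpus; this is just my sketch), counting single letters and letter pairs:

```python
from collections import Counter

# Stand-in for the large English corpus used in the post
text = "this is just a small stand-in for a large sample of english text"
letters = [c for c in text.lower() if c.isalpha()]

letter_counts = Counter(letters)                  # how often each letter occurs
pair_counts = Counter(zip(letters, letters[1:]))  # how often each pair occurs (word boundaries ignored for simplicity)

total = sum(letter_counts.values())
print({ch: round(n / total, 3) for ch, n in letter_counts.most_common(5)})
print(pair_counts.most_common(5))
```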
Neural nets are loosely modeled on the human brain.

The brain has ~100B neurons, each connected to ~1,000 other neurons.

A neuron pulses depending on the pulses it gets from the other neurons connected to it.

In a neural net, those connection strengths are the weights. Voila!
Neural Net Explanation

- Neurons are arranged in layers
- Each connection has a weight (its significance)
- Machine learning is used to find those weights
- Each neuron evaluates a simple numerical function
- Input is fed in; the neurons at each layer evaluate it and feed their results to the next layer
- The end result comes out of the last layer
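A minimal sketch of that layer-by-layer evaluation (made-up weights, plain NumPy rather than a real framework):

```python
import numpy as np

def layer(x, weights, biases):
    """One layer: each neuron takes a weighted sum of its inputs,
    adds a bias, then applies a simple nonlinearity (ReLU here)."""
    return np.maximum(0.0, weights @ x + biases)

# Tiny 2 -> 3 -> 1 network with random made-up weights
x = np.array([0.5, -1.2])                          # input
h = layer(x, np.random.randn(3, 2), np.zeros(3))   # hidden layer, 3 neurons
y = layer(h, np.random.randn(1, 3), np.zeros(1))   # output neuron
print(y)
```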
Larger networks do a better job of landing on clean results.

In Stephen's example, the goal is to take in a point and say which of three regions it falls in.

I laughed when Stephen said that at the boundaries it has trouble "making up its mind". Much human.

Unsure results could be dangerous.
Training Neural Nets

The goal is to feed a zillion examples, and find weights that reproduce the examples.

Every time an example is used, the weights are adjusted throughout the model.

Training is really expensive and computationally intensive.
Now how are the weights adjusted?

Stephen describes how the model uses a loss function.

The goal of adjusting the weights is to reduce the loss function, i.e. how far your output is from the intended result based on the examples.

More data, lower loss.
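A toy version of that weight-adjustment loop (my own illustration, fitting y = 2x with a single weight) shows the loss shrinking as the weight gets nudged:

```python
import numpy as np

examples_x = np.array([1.0, 2.0, 3.0])
examples_y = np.array([2.0, 4.0, 6.0])  # the "right answers" are y = 2x

w = 0.0      # start with a bad weight
lr = 0.05    # how big each adjustment is
for step in range(200):
    pred = w * examples_x
    loss = np.mean((pred - examples_y) ** 2)            # how far off we are
    grad = np.mean(2 * (pred - examples_y) * examples_x)
    w -= lr * grad                                       # nudge w to reduce the loss

print(round(w, 3), round(loss, 6))   # w ends up near 2, loss near 0
```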
One of the most counterintuitive takeaways is that with neural nets it can be easier to solve more complicated problems than simpler ones.

That's good too, since I'm dumb and need help with the complicated problems in life.

I'll let Stephen take it from here (screenshot in the original thread).
ChatGPT has an easier time training since it can conduct "unsupervised learning":

- Take a piece of text and mask out the end
- Ask the model to use its probabilities to predict the masked ending
- The masked text is the training input
- The original, complete text is the target output

TLDR it's easier to get examples to train on: any raw text works.
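As a rough sketch of what "mask the end and predict it" looks like in practice (my toy example, not Stephen's code), raw text turns directly into training pairs:

```python
def make_examples(text, context=5):
    """Turn raw text into (context, next-word) training pairs:
    the context is the input, the real next word is the answer."""
    words = text.split()
    return [(words[i - context:i], words[i]) for i in range(context, len(words))]

for ctx, target in make_examples("the cat sat on the mat and purred loudly"):
    print(ctx, "->", target)
# ['the', 'cat', 'sat', 'on', 'the'] -> mat
# ['cat', 'sat', 'on', 'the', 'mat'] -> and
# ...
```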
Summing this all up, Stephen shares an image showing the training process for a neural net and how the loss function should decrease over time.

If the loss eventually levels off at a low value, yay, you have a solid model.

If it doesn't, you can't rely on it and it's time to change the architecture.
ChatGPT is often extrapolated as a path to Terminators.

Stephen counters that what LLMs do when they write isn't actually as computationally hard as we assumed.

We're not closer to Terminators; writing essays just isn't as hard as we think.

@stephen_wolfram plz share more on NNs replacing humans (pic 2)
Embeddings

Embeddings represent words as arrays of numbers, laid out so that words commonly associated with each other end up near one another.

Those associations are learned from vast amounts of text.

Embeddings give ChatGPT a more natural feel, since words that are commonly associated with each other can stand in for one another.
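A tiny sketch of the idea (made-up 3-number embeddings; the real ones have hundreds of dimensions):

```python
import numpy as np

# Hypothetical embeddings; in ChatGPT each word maps to a long array of numbers
emb = {
    "cat":    np.array([0.9, 0.1, 0.3]),
    "kitten": np.array([0.8, 0.2, 0.35]),
    "car":    np.array([0.1, 0.9, 0.6]),
}

def similarity(a, b):
    """Cosine similarity: words used in similar contexts land close together."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(similarity(emb["cat"], emb["kitten"]))  # high -> commonly associated
print(similarity(emb["cat"], emb["car"]))     # lower
```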
Onto ChatGPT!

ChatGPT is a giant neural net focused on language; the GPT-3 net behind it has 175 billion weights.

That's in the ballpark of the brain's ~100 billion neurons, though far fewer than its ~100 trillion synaptic connections. Woof.

The most important architectural feature is the transformer.
An interesting side note for the Crypto audience.

Crypto's own @ilblackdragon, the co-founder of @NEARProtocol, is one of the authors on the original Transformers Paper

arxiv.org/abs/1706.03762

First let's recap ChatGPT's process:
ChatGPT's Process

- Takes in text
- Finds the embeddings (arrays of numbers representing the text)
- Processes them (the values pass through the layers of the neural net)
- A new embedding is produced (a new number array)
- From that array it generates ~50,000 values, one for each possible next token
- A token is picked from those probabilities (temperature decides how often it isn't the top-ranked one) and appended to the text
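My rough mental model of that loop in code (a hypothetical `model` object standing in for the real net, which is obviously far bigger):

```python
import numpy as np

def generate(model, tokens, n_new=20, temperature=0.8):
    """Hypothetical generation loop: run the tokens through the net, get a score
    for every token in the ~50,000-token vocabulary, sample one, append, repeat."""
    tokens = list(tokens)
    for _ in range(n_new):
        logits = model(tokens)                 # one score per vocabulary token
        scaled = np.array(logits) / temperature
        probs = np.exp(scaled - scaled.max())  # softmax over the vocabulary
        probs /= probs.sum()
        tokens.append(int(np.random.choice(len(probs), p=probs)))
    return tokens
```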
Transformers are a breakthrough for LLMs.

An analogy is that they allow the model to understand the context of words and the relationships between words that are far apart.

Transformers look at the whole sequence at once rather than word by word, so they are much more efficient and scalable to train.

Thanks ChatGPT!
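The core trick underneath is attention. A bare-bones sketch (single head, NumPy, no training) of how every position gets to look at every other position at once:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each position weights every other
    position by relevance, so distant words can influence each other directly."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ V

n_tokens, d = 4, 8
x = np.random.randn(n_tokens, d)   # one vector per token
print(attention(x, x, x).shape)    # (4, 8): same shape, context mixed in
```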
Meaning Space

Stephen shares that in ChatGPT, text is represented by an array of numbers in a meaning space.

He goes on to describe that the trajectory of what words come next is far from a mathematical or physics-like law we can hang our hats on.
So is ChatGPT similar to a human brain?

His conclusions:

- The neural net architecture may be similar
- Training LLMs is way less efficient than the human brain
- ChatGPT has no loops to go back and recompute data the way humans can, which severely limits its computational capability
I am not an AI researcher, but the post made me realize LLMs are nowhere near the AGI or Terminator-level intelligence some fear.

Of course it's on the path, but LLMs are probabilistic models focused on continuing sentences.

They are really good at it, but not AGI (yet).
I think it's incredibly cool that a gigabrain like @stephen_wolfram would open source his thinking on ChatGPT

This has been the single best resource I've found so far on learning about @OpenAI's ChatGPT, LLMs and Neural Nets

Disc. I def got things wrong

writings.stephenwolfram.com/2023/02/what-i…
Also @stephen_wolfram if you're ever interested in a long form podcast to walk through your thoughts, we'd love to host you on @Delphi_Digital's podcast!
