Colin Fraser
Jan 28 14 tweets 6 min read
I just published my big Medium article about GPT. This was a labor of love & hate that I have been writing for a while. It's got a collection of examples of GPT doing funny things, which for those who don't want to deal with a 40-min read, I'll put here 🧵 medium.com/@colin.fraser/…
It also asks and tries to answer
- What are language models?
- What happens if GPT passes a bar exam?
- Is scale all you need?
- ChatGPT is based on GPT... what does that mean, exactly?
- What are fine tuning and RLHF?
- How exactly do teams of contractors contribute to GPT?
The Dumb Monty Hall Problem
The actual Monty Hall Problem
Acrostics (I think ELISTHAR is a very pretty name for a girl)
ChatGPT is a little coy about its ideas on gender roles (I explain why in the piece), but if you're clever enough (not that clever) you can trick it into telling you what it really thinks.
An absolutely bizarre response that left me confused and baffled
"Let's think step by step" doesn't always get you a better answer. The first response is right and the second is wrong. Your move, prompt engineers.
Rin Tin Tin IV, the dog that swam across the Atlantic Ocean in 1970. RIP.
Math
My overall thesis here is that none of this is very surprising; in general we should not expect the output of an LLM to correspond to the truth in any reliable way. They are bullshit emitters, in the "On Bullshit" sense. Every one of them.
Ventures that rely on an LLM to produce anything other than bullshit are doomed to fail. We've already seen it, and we'll see it again. futurism.com/cnet-bankrate-…
This is an example of such a venture. Making things up is not a bug to squash, it's the single defining feature of a large language model. vice.com/en/article/wxn…
I am tired of the extreme charity afforded to the Cerebral Valley guys. Everything has one minor bug that's fixed in the next version that is coming very soon. In the meantime we are supposed to be in awe of the machine that adds 175 billion numbers together to output 2 + 2 = 5.

More from @colin_fraser

Oct 27, 2022
My most unpopular data opinion is that alerts for metrics are usually useless and bad, and you're much better off scheduling regular time to look at a dashboard with your human eyes. Everyone always gets mad at me when I say this.
One of two things **always** happens. Either the alert is too sensitive and becomes spam, or the alert is not sensitive enough and misses important stuff. It's hard (impossible, even!) to find the sweet spot where the alert emails you if and only if an important thing happens.
It's also impossible to build an alert that captures every type of anomaly that you want to be made aware of.
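Here's a toy sketch of that squeeze. Everything in it is made up for illustration: the `metric` values, the `baseline`, and which day counts as the "real" incident are all assumptions, and a fixed-threshold alert is only one of many ways to build an alert.

```python
# Hypothetical daily metric: mostly noise, one harmless spike, one real incident.
metric = [100, 102, 98, 101, 99, 130, 100, 97, 103, 118, 101, 99]
harmless_spike_day = 5  # assumed: a one-off blip nobody needs to act on
incident_day = 9        # assumed: the change we actually care about
baseline = 100          # assumed: the "normal" level of the metric


def alert_days(values, threshold):
    """Return the days on which |value - baseline| exceeds the threshold."""
    return [day for day, v in enumerate(values) if abs(v - baseline) > threshold]


for threshold in (1, 10, 20, 40):
    fired = alert_days(metric, threshold)
    print(threshold, fired, "caught incident:", incident_day in fired)
```

In this made-up series the harmless spike is bigger than the incident, so a low threshold fires on noise, a high one misses the incident, and no threshold fires on the incident alone.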
Read 4 tweets
Jul 20, 2022
Someone on here (I forget who, I'm sorry) linked to this paper and it derives this statistical identity that is completely mind blowing and I want to tweet about it.

bias = data quality × data quantity × problem difficulty

statistics.fas.harvard.edu/files/statisti…
(I'll provide some applications to Twitter bots and Elon; it's extremely applicable here)
bias = data quality × data quantity × problem difficulty
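A rough numerical check of an identity of this shape, as a sketch only. The mapping of the three terms is my assumption and not taken from the paper: "data quality" is read as the correlation between being included in the sample (`R`) and the value measured (`G`), "data quantity" as sqrt((1 - f) / f) where f is the fraction of the population sampled, and "problem difficulty" as the population standard deviation of `G`. The population and the biased inclusion rule are simulated.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def pop_std(xs):
    """Population (divide-by-N) standard deviation."""
    m = mean(xs)
    return math.sqrt(mean([(x - m) ** 2 for x in xs]))


def pop_corr(xs, ys):
    """Population correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (pop_std(xs) * pop_std(ys))


def decompose(G, R):
    """Return (error, quality, quantity, difficulty) for values G and 0/1 inclusion flags R."""
    n, N = sum(R), len(G)
    f = n / N
    sample_mean = sum(g for g, r in zip(G, R) if r) / n
    error = sample_mean - mean(G)      # sample mean minus population mean
    quality = pop_corr(R, G)           # assumed reading of "data quality"
    quantity = math.sqrt((1 - f) / f)  # assumed reading of "data quantity"
    difficulty = pop_std(G)            # assumed reading of "problem difficulty"
    return error, quality, quantity, difficulty


random.seed(0)
N = 10_000
G = [random.gauss(0, 1) for _ in range(N)]
# Simulated biased sampling: larger values are more likely to be recorded.
R = [1 if random.random() < 0.2 / (1 + math.exp(-g)) else 0 for g in G]

error, quality, quantity, difficulty = decompose(G, R)
print(error, quality * quantity * difficulty)
```

The two printed numbers should agree up to floating-point error for any inclusion pattern, because the product is just Cov(R, G) / f written another way.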
Read 37 tweets
Jul 20, 2022
I read a really bad paper yesterday and got pissed off and tweeted about it, but I read a really good paper today and got happy and so I'm going to tweet about that
I'm really cookin' up a thread on this one. It has applications to the Twitter Bot Measurement Debate so buckle up
I reached the limit of how many tweets you're allowed to put in a thread at once
Read 5 tweets
Jul 18, 2022
I'm losing my mind at how inane this "research" is. These are researchers at major schools just putting out absolute trash.
Setting aside that the premise is horrifying, it's just absolutely bad worthless research. Basically:

We built a multi-class classifier to classify users into one of the three categories of LGBT person that we made up:
1. person
2. organization
3. sexual worker/porn
We find that adding more features increases the performance of the classifier, until it doesn't anymore
Read 7 tweets
Apr 27, 2021
"If FB has a dial that can turn hateful content down, why doesn't it turn it down all the time?" is a good and important question. The answer is exactly the precision-recall tradeoff en.wikipedia.org/wiki/Receiver_…
You can catch all hate speech by deleting every post on Facebook, but you'll have a lot of false positives. You can eliminate all false positives by never deleting a post, but you'll miss all the hate speech. Facebook has to choose a point along that continuum.
Turning down the hateful content knob is exactly the same as turning up the false positives knob. And depending on where you are on that curve, it can be a bad bargain—the number of false positives that get created might be a lot higher than the hateful posts that get detected
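A minimal sketch of the knob, with invented numbers. The scores and labels in `posts` are hypothetical, and "delete everything scoring at or above a threshold" is an assumed stand-in for whatever Facebook actually does.

```python
# Hypothetical classifier scores (higher = more likely hateful) with true labels.
posts = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True),
    (0.60, False), (0.40, True), (0.30, False), (0.20, False),
    (0.10, False), (0.05, False),
]


def outcomes(posts, threshold):
    """Delete every post scoring at or above the threshold; count what happens."""
    caught = sum(1 for score, hateful in posts if score >= threshold and hateful)
    false_pos = sum(1 for score, hateful in posts if score >= threshold and not hateful)
    missed = sum(1 for score, hateful in posts if score < threshold and hateful)
    return caught, false_pos, missed


# Sweep the knob from "delete everything" to "delete nothing".
for threshold in (0.0, 0.35, 0.65, 0.85, 1.01):
    print(threshold, outcomes(posts, threshold))
```

With these made-up posts, dropping the threshold from 0.65 to 0.35 catches one more hateful post at the cost of one more false positive. Dropping it again from 0.35 to 0.0 catches nothing new and adds four false positives, which is the bad bargain.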
Read 8 tweets
