Okay, taking a few moments to read (some of) the #gpt4 paper. It's laughable the extent to which the authors are writing from deep down inside their xrisk/longtermist/"AI safety" rabbit hole.

Things they aren't telling us:
1) What data it's trained on
2) What the carbon footprint was
3) Architecture
4) Training method

But they do make sure to spend a page and a half talking about how they vewwy carefuwwy tested to make sure that it doesn't have "emergent properties" that would let it "create and act on long-term plans" (sec 2.9).

I also lol'ed at "GPT-4 was evaluated on a variety of exams originally designed for humans": They seem to think this is a point of pride, but it's actually a scientific failure. No one has established the construct validity of these "exams" vis-à-vis language models.

For more on missing construct validity and how it undermines claims of 'general' 'AI' capabilities, see:


Also LOL-worthy, against the backdrop of utter lack of transparency, was: "We believe that accurately predicting future capabilities is important for safety. ... Going forward we plan to refine these methods and register performance predictions across various capabilities before large model training begins, and we hope this becomes a common goal in the field."

Trying to position themselves as champions of the science here & failing.
A cynical take is that they realize that without info about data, model architecture & training setup, we aren't positioned to reason about how the model produces the results that it does ... and are thus more likely to believe claims of "AGI" and thus buy what they're selling.

But given all the xrisk rhetoric (and @sama's blogpost from Feb) it may also be possible that at least some of the authors on this thing actually believe their own hype and really think they are making choices about "safety".
