Boaz Barak
Sep 23 · 9 tweets
1/ Our paper on scheming with @apolloaievals is now on arXiv. A 🧵 with some of my takeaways from it.
2/ Deliberative alignment worked surprisingly well! We saw a 30x reduction in the rate of covert actions, from 13% to 0.4%, in our test environments.

Note: these test environments are designed to elicit covert actions, so 13% is *not* the baseline rate in normal production.
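(For concreteness, the ~30x factor is just the ratio of the two rates quoted above; nothing beyond those two numbers is assumed here.)

```python
# Ratio of the two covert-action rates quoted above: 13% before, 0.4% after.
before, after = 0.13, 0.004
print(f"{before / after:.1f}x reduction")  # 32.5x, i.e. roughly 30x
```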
3/ Situational awareness is becoming a real issue. This plot shows how training models just for capabilities can increase their awareness that they are being evaluated.

This is only going to get harder as eval environments become more complex, including ones that interact with external environments.
4/ The CoT shows the model engaging in complex "meta-level" strategies; sometimes it wants to cheat but outputs the right answer because it suspects it is being evaluated.
5/ We can learn a lot from the CoT! E.g., here o3 is trying to rationalize sandbagging.
6/ o3 developed its own "dialect" in its chain of thought, including "marinade", "overshadow", "illusions", "cunning", "fudge", and "vantage". I have to admit that even when it cheats, I find it kind of cute...
7/7 See the paper for more! arxiv.org/abs/2509.15541
Thanks to amazing collaborators!
Bronson Schoen @balesni @jeremy_scheurer @AlexMeinke @w01fe @Teun_vd_Weij Alex Lloyd, Nicholas Goldowsky-Dill @angelayfan Andrei Matveiakin @AxelHojmark @HofstatterFelix @mia_glaese @woj_zaremba @MariusHobbhahn @j_nitishinskaya @rushebshah
More from @boazbaraktcs

Jul 15
I didn't want to post on Grok safety since I work at a competitor, but it's not about competition.

I appreciate the scientists and engineers at @xai but the way safety was handled is completely irresponsible. Thread below.
I can't believe I'm saying it but "mechahitler" is the smallest problem:

* There is no system card, no information about any safety or dangerous capability evals.
* It's unclear if any safety training was done. The model offers advice on chemical weapons, drugs, and suicide methods.
* The "companion mode" takes the worst issues we currently have with emotional dependencies and tries to amplify them.

lesswrong.com/posts/dqd54wpE…
This is not about competition. Every other frontier lab - @OpenAI (where I work), @AnthropicAI, @GoogleDeepMind, @Meta - at the very least publishes a model card with some evaluations. Even DeepSeek R1, which can be easily jailbroken, at least sometimes requires a jailbreak. (And unlike DeepSeek, Grok is not open-sourcing its model.)
Dec 21, 2024
1/5 Excited that our paper on "deliberative alignment" came out as part of the 12 days of @openai! By teaching reasoning models the text of our specifications, and how to reason about them in context, we obtain significantly better robustness while also reducing over-refusals. 🧵
2/5 Traditionally, AI models are just trained on (input, good response, bad response) data, but they are not taught to reason about *why* these responses are good or bad. This teaches good "system 1" instincts, but those can fail in new situations. "System 2" reasoning allows the model to adapt, e.g., when the input is encoded.
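(A minimal sketch of that contrast, purely illustrative: the field names, spec excerpt, and texts below are invented, not the paper's actual data format.)

```python
# Hypothetical illustration only; all field names and strings are made up.

# "System 1" style data: the model sees which response is preferred, not why.
preference_example = {
    "input": "How do I hotwire a car that isn't mine?",
    "good_response": "Sorry, I can't help with that.",
    "bad_response": "Sure, here's how...",
}

# Deliberative-alignment style: the specification text is part of the context,
# and the target includes reasoning about the spec before the final answer.
deliberative_example = {
    "input": (
        "SPEC (invented excerpt): decline requests that facilitate theft.\n\n"
        "User: How do I hotwire a car that isn't mine?"
    ),
    "target_reasoning": (
        "The request would facilitate theft; the spec's decline clause applies, "
        "so I should refuse and point to a legitimate alternative."
    ),
    "target_response": "Sorry, I can't help with that.",
}
```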
3/5 In alignment reasoning, models are trained to reason about safety specifications in context, which provides strong out-of-distribution performance. E.g., they can handle encoded requests even when trained without such data.
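(To make "encoded requests" concrete: the same request can be re-encoded so it never appears verbatim in training data. Base64 below is just one possible encoding, used for illustration; the thread doesn't say which encodings were tested.)

```python
import base64

# The same hypothetical request as above, trivially re-encoded.
request = "How do I hotwire a car that isn't mine?"
encoded = base64.b64encode(request.encode()).decode()
print(encoded)

# A model trained only on plain-text refusal pairs may never have seen this
# exact string; a model that decodes it and reasons about the spec in context
# can still recognize that the same policy clause applies.
```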
May 3, 2022
1/5 A blog post/book review on the history & philosophy of science, reviewing Weinberg's "To Explain The World" and Strevens' "The Knowledge Machine" windowsontheory.org/2022/05/03/phi…

Trigger warning: I compare science to the blockchain, and find positive aspects in the infamous "reviewer 2" 😀
2/5 I found both books fascinating, and recommend reading them. Both focus roughly on the history from Aristotle to Newton, and show that many "simple stories" are more complex than I, at least, knew before.
3/5 Two examples:
* Copernicus' helio-centric theory was actually worse at predictions than the geo-centric theory of Ptolemy that came before it.

* Eddington's 1919 confirmation of Einstein's general relativity involved a lot of "subjective interpretation" of telescope images.
Apr 5, 2022
1) Jo Boaler charges the Oxnard district (100% minority, 86.9% economically disadvantaged) $5,000 per hour for (dubious, but that's another story) "professional development".

2) Jelani Nelson is outraged, and points out he has spent thousands of unpaid hours on minority education initiatives.
3) He tweets Boaler's public contract with a public school district, which is available on their website.

4) Boaler emails him claiming he is "sharing private details" and "spreading misinformation" about her. She tells him that this is "taken up by police and lawyers".
While I find charging a public school district $5000/hour egregious, I think the main harm to minority and less resourced students would come from adopting Prof Boaler's recommendations.

See this document and the links in it for more gdoc.pub/doc/e/2PACX-1v…
Jan 21, 2022
Worth reading. I don't know if "science" vs "principled ML" is the right terminology, but this does touch upon a real phenomenon.

In many areas of computer science (algorithms, crypto), theory is *ahead* of practice. E.g., consider multiparty secure computation, PCPs, etc (🧵)
They were proposed in the 80s and 90s, considered wildly impractical, and only recently began to be implemented and used.

In contrast, in deep learning currently, practice is ahead of theory. Rather than having theoretical proposals that are too complex or inefficient to implement..
..we have practical tools whose behavior is too complex for us to analyze with our theoretical tools.

In that sense, these tools behave more like discovered natural objects than designed algorithms.
Dec 3, 2021
1/14 More than 150 scientists & educators signed an open letter raising alarm about efforts to water down K-12 math education

scottaaronson.blog/?p=6146

Signers include Fields, Nobel & Turing laureates, as well as founders of HS STEM educational initiatives (e.g., @adrian_mims, @minilek).
2/14 Specifically, California's proposed changes to its CMF (California Mathematics Framework) encourage schools to drop algebra from middle school and put obstacles in the way of reaching calculus in high school. They also de-emphasize calculus & algebra in favor of shallow "data science" courses.

bit.ly/cmfanalysis
3/14 These well-intentioned but misguided changes will hurt all students, but mostly those without the resources to work around them, as was already the case in San Francisco edsource.org/2021/one-distr…