Geoffrey Irving
Chief Scientist at the UK AI Safety Institute (AISI). Previously DeepMind, OpenAI, Google Brain, etc.
Jun 26, 2024
There are a variety of recent proof-based AI safety proposals. It would be amazing if they worked! However, I worry that they will be blocked for purely quantitative reasons, and thus that number-free analyses of them are incomplete.

So here is a 🧵 about Lipschitz constants! Particular examples of these approaches are Davidad et al., "Towards Guaranteed Safe AI" (arxiv.org/abs/2405.06624) and Tegmark and Omohundro, "Provably safe systems" (arxiv.org/abs/2309.01933).
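To make the quantitative worry concrete, here is a minimal sketch of the standard way to certify a Lipschitz constant for a plain feed-forward ReLU network: upper-bound it by the product of the layers' spectral norms. Everything here is illustrative (the toy network, the function names, the initialization scale are my assumptions, not taken from either paper); the point is just that such bounds compose multiplicatively across depth and can end up many orders of magnitude above the true constant.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_lipschitz(weight: np.ndarray) -> float:
    """Lipschitz constant of x -> W @ x is the spectral norm of W.
    (ReLU is 1-Lipschitz, so it doesn't tighten the bound.)"""
    return float(np.linalg.norm(weight, ord=2))

def naive_network_bound(weights: list[np.ndarray]) -> float:
    """Standard certified upper bound: the product of per-layer
    spectral norms. Sound, but typically very loose for deep nets."""
    bound = 1.0
    for w in weights:
        bound *= layer_lipschitz(w)
    return bound

# A toy 20-layer, 256-wide network with a common 1/sqrt(width) init.
depth, width = 20, 256
weights = [rng.normal(scale=1.0 / np.sqrt(width), size=(width, width))
           for _ in range(depth)]

print(f"naive Lipschitz bound: {naive_network_bound(weights):.3g}")
# Each layer's spectral norm is near 2 at this init, so the certified
# bound is roughly 2**20 ~ 1e6, even though the function the network
# computes may be far smoother than that.
```

Under these (assumed) conditions the certificate blows up exponentially in depth, which is the flavor of "purely quantitative" obstruction worth checking before trusting a number-free safety analysis.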
May 20, 2024
Here’s a thread about why I joined the UK AI Safety Institute (@AISafetyInst) as a Research Director for technical safety, and why I think other technical folks should consider roles here (aisi.gov.uk/careers).

Prior to AISI, I led AGI safety teams at Google DeepMind and OpenAI. I continue to think direct work on AI and AGI safety is critically important: we have a ton of hard sociotechnical challenges, but I think good solutions will be available given sufficient* time.
Nov 21, 2023
I have no details of OpenAI's Board’s reasons for firing Sam, and I am conflicted (I lead Scalable Alignment at Google DeepMind). But there is a large, very loud pile-on against people I respect, in particular Helen Toner and Ilya Sutskever, so I feel compelled to say a few things.

First, these are my personal views, not those of my employer. But still, take them with a big grain of salt due to conflicts of interest.