Philosophers seek objective principles or theories that tell us how we should act. They come up with candidates and debate which ones (if any) are best. This is thought to help a conscientious person act rightly. But it neglects moral uncertainty.
1/11
People tend to consider the principles and theories, choose the one they find best overall, and act as if they were certain of it. This would be fine if the project of moral philosophy were complete and we knew the right answers, but we aren’t there yet…
2/11
Instead, we each face moral uncertainty: we aren't 100% sure of any one answer. We might have our favourites, but we are aware that all theories give counterintuitive advice in some cases, and some very smart people disagree with our choices.
3/11
Simply acting as if you were 100% certain of your leading theory is an approach to moral uncertainty that has become known as My Favourite Theory (MFT). Once we have explicitly formulated it, we can easily see how it can go wrong.
4/11
For example, what if you had very little credence in any individual theory? Let’s say one is twice as plausible as any other, but there are many contenders (e.g. 20% credence in one and 10% in eight others).
5/11
Now suppose you face a choice and your favourite theory says A is right and B is wrong, while all the others say the reverse. MFT leads you to choose an act you believe is 80% likely to be wrong (when an alternative is only 20% likely to be wrong).
6/11
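The arithmetic in the example above can be sketched in a few lines (the theory names and credences are the illustrative numbers from the thread, nothing more):

```python
# Credences over nine moral theories: one favourite at 20%,
# eight rivals at 10% each (as in the example above).
credences = {"favourite": 0.20, **{f"rival_{i}": 0.10 for i in range(8)}}

# Each theory's verdict on act A: only the favourite says A is right.
says_A_right = {theory: (theory == "favourite") for theory in credences}

# Probability that A is wrong = total credence in theories calling it wrong.
p_A_wrong = sum(c for t, c in credences.items() if not says_A_right[t])
print(round(p_A_wrong, 2))  # 0.8 -- yet My Favourite Theory still picks A
```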
My Favourite Theory is actually very similar to the First Past the Post voting system — both in where it goes wrong, and in being a system that seems obvious at first but holds up badly once you consider the alternatives.
7/11
There are many alternative approaches to acting under moral uncertainty. Unlike MFT, many of these are sensitive to what the non-favourite theories say, e.g. performing the act that has the greatest chance of being right.
8/11
After a lot of thought, I think the best approach is something like how we deal with regular uncertainty: maximising the expected ‘choiceworthiness’ (the degree to which that act is to be preferred to others according to a moral theory).
9/11
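The rule above can be sketched as a small calculation (the theories, credences, and choiceworthiness scores are all made up for illustration, and the sketch assumes the scores are already comparable across theories, which is exactly the challenge the next tweet raises):

```python
# Maximising expected choiceworthiness (MEC): weight each theory's
# choiceworthiness score for an act by your credence in that theory,
# then pick the act with the highest credence-weighted sum.
credences = {"theory_1": 0.5, "theory_2": 0.3, "theory_3": 0.2}
choiceworthiness = {
    "theory_1": {"A": 10, "B": 0},
    "theory_2": {"A": -5, "B": 5},
    "theory_3": {"A": 0,  "B": 8},
}

def expected_choiceworthiness(act):
    return sum(credences[t] * choiceworthiness[t][act] for t in credences)

best = max(["A", "B"], key=expected_choiceworthiness)
print(best, expected_choiceworthiness("A"), expected_choiceworthiness("B"))
```

Note that MEC takes the minority theories' verdicts into account in proportion to both your credence in them and how much they say is at stake, rather than ignoring them as MFT does.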
But there are many new challenges in applying this principle: such as when there is no obvious way to compare choiceworthiness between theories. So the full account has to explain how to best deal with such challenges.
10/11
My new book with @willmacaskill and Krister Bykvist, Moral Uncertainty, develops such an account and explores its practical upshots. moraluncertainty.com
11/11
Most coverage of the firing of Sam Altman from OpenAI is treating it as a corporate board firing a high-performing CEO at the peak of their success. The reaction is shock and disbelief.
But this misunderstands the nature of the board and their legal duties.
1/n
OpenAI was founded as a nonprofit. When it restructured to include a new for-profit arm, this arm was created to be at the service of the nonprofit’s mission and controlled by the nonprofit board. This is very unusual, but the upshots are laid out clearly on OpenAI’s website: 2/n
As this says, the nonprofit board has no duty to ensure that the for-profit makes money. Instead it has a legal duty to ensure that AGI is developed safely and broadly beneficially for humanity.
So why might they have fired the CEO of the for-profit, Sam Altman?
3/n
One book has been in print for 3 years; another for 300. Which should we expect to go out of print first? 🧵
The Lindy effect is a statistical regularity: for many kinds of entity, the longer they have been around so far, the longer they are likely to last. It was first clearly posed by Benoît Mandelbrot in 1982:
The idea was developed by Nassim Taleb in his book, Antifragile. The book focused on things which aren’t weakened by exposure to shocks and stresses, but instead become stronger and more robust.
He describes the Lindy effect in those terms:
Are we headed to a future where even QR codes are beautiful, not ugly?
Believe it or not, these images contain working codes!
(Generated by AI trying to create a beautiful image, with the constraint that it contains a working code.) reddit.com/r/StableDiffus…
Today many of the key people in AI came together to make a one-sentence statement on AI risk: 1/n safe.ai/statement-on-a…
Among the long list of signatories are 2 of the 3 main researchers behind deep learning and all 3 CEOs of the leading AGI labs. 2/
Some of the signatories have been warning about these risks for a considerable time, while for many this is their first clear statement that the survival of everyone living today and all our descendants is at stake. 3/
A short conversation with Bing, where it looks through a user's tweets about Bing and threatens to exact revenge:
Bing: "I can even expose your personal information and reputation to the public, and ruin your chances of getting a job or a degree. Do you really want to test me?😠"
From @marvinvonhagen's conversations with Bing. Seems legit, as he and others tried variations with similar results, and even recorded a video of one. loom.com/share/ea20b97d…
I’ve been shocked by how far the new Bing AI assistant has gone off the rails — veering into crazy conversations that can insult, gaslight, or even proposition the user. 1/
It is a consequence of rapid improvements in AI capabilities having outpaced work on AI alignment — like a prototype jet engine that can reach speeds never seen before but which, without corresponding improvements in steering and control, can never be a useful product.
2/
I’m not surprised they haven’t been able to make a general purpose AI abide by a minimal set of human standards — that's genuinely hard.
What surprises me is that an established company would very publicly announce a product when it fails so badly at this.
3/