As AI systems become more useful, people will delegate greater authority to them across more tasks.
AIs are evolving in an increasingly frenzied and uncontrolled manner. This carries risks as natural selection favors AIs over humans.
Other AI scientists have implicitly recognized that this could be an evolutionary struggle and that humans may become the new gorillas. @geoffreyhinton: “There is not a good track record of less intelligent things controlling things of greater intelligence.”
Jürgen Schmidhuber: “In the long run, humans are not going to remain the crown of creation... But that’s okay... you are a tiny part of a much grander scheme which is leading the universe from lower complexity towards higher complexity”
Others, like Google's co-founder Larry Page, think “that digital life is the natural and desirable next step in the cosmic evolution.”
Page called @elonmusk a “speciesist” for being on the side of humans (which partially caused Musk to start OpenAI).
@ylecun argues the opposite: “because AI systems did not pass through the crucible of natural selection...[their] intelligence and survival are decoupled, and so intelligence can serve whatever goals we set for it.”
I argue that AIs will in fact be distorted by that crucible.
In the long run, I think AIs can be thought of as an invasive species. I discuss ways to mitigate this existential risk in the paper.
More and more researchers think that building AIs smarter than us could pose existential risks. But what might these risks look like, and how can we manage them? We provide a guide to help analyze how research can reduce these risks.
We review time-tested concepts from safety engineering and discuss how to apply these to advanced AI systems. We need to think of safety not just as a technical problem but also a societal problem, so we need to think about the broader sociotechnical system.
Let’s turn to possible failure modes.
Weaponization: AI can be repurposed to be highly destructive. As with nuclear and biological weapons, only one irrational or malevolent actor is sufficient to unilaterally cause harm on a massive scale.
GPT-4 knows many esoteric facts: e.g., it knows the meaning of obscure songs, knows what area a researcher works in, and can contrast ML optimizers like Adam vs. AdamW as in a PhD oral exam.
My rule-of-thumb is that
"if it's on the internet 5 or more times, GPT-4 remembers it."
It gets 86.4% on our MMLU benchmark, which suggests GPT-4.5 should be able to reach expert-level performance.
GPT-2: Language Models are Unsupervised Multitask Learners
GPT-3: Language Models are Few-Shot Learners
GPT-4: Language Models are... Almost Omniscient