Existential risk from artificial general intelligence is the hypothesis that substantial progress in artificial general intelligence (AGI) could result in human extinction or another irreversible global catastrophe.
One argument goes as follows: The human species currently dominates other species because the human brain possesses distinctive capabilities other animals lack. If AI were to surpass humanity in general intelligence and become superintelligent, then it could become difficult or impossible to control. Just as the fate of the mountain gorilla depends on human goodwill, so might the fate of humanity depend on the actions of a future machine superintelligence.
The plausibility of existential catastrophe due to AI is widely debated, and hinges in part on whether AGI or superintelligence is achievable, the speed at which dangerous capabilities and behaviors may emerge, and whether practical scenarios for AI takeovers exist. Concerns about superintelligence have been voiced by leading computer scientists and tech CEOs such as Geoffrey Hinton, Yoshua Bengio, Alan Turing,[a] Elon Musk, and OpenAI CEO Sam Altman. In 2022, a survey of AI researchers with a 17% response rate found that the majority of respondents believed there is a 10 percent or greater chance that our inability to control AI will cause an existential catastrophe. In 2023, hundreds of AI experts and other notable figures signed a statement that "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war." Following increased concern over AI risks, government leaders such as United Kingdom prime minister Rishi Sunak and United Nations Secretary-General António Guterres called for an increased focus on global AI regulation.
Two sources of concern stem from the problems of AI control and alignment: controlling a superintelligent machine or instilling it with human-compatible values may be difficult. Many researchers believe that a superintelligent machine would resist attempts to disable it or change its goals, since either would prevent it from accomplishing its present goals. It would be extremely difficult to align a superintelligence with the full breadth of significant human values and constraints. In contrast, skeptics such as computer scientist Yann LeCun argue that superintelligent machines will have no desire for self-preservation.
A third source of concern is that a sudden "intelligence explosion" might take an unprepared human race by surprise. Such scenarios consider the possibility that an AI which surpasses its creators in intelligence might be able to recursively improve itself at an exponentially increasing rate, improving too quickly for its handlers and society writ large to control. Empirically, examples like AlphaZero teaching itself to play Go show that domain-specific AI systems can sometimes progress from subhuman ability to superhuman ability very quickly, although such systems do not involve the AI altering its fundamental architecture.
One of the earliest authors to express serious concern that highly advanced machines might pose existential risks to humanity was the novelist Samuel Butler, who wrote the following in his 1863 essay Darwin among the Machines:
The upshot is simply a question of time, but that the time will come when the machines will hold the real supremacy over the world and its inhabitants is what no person of a truly philosophic mind can for a moment question.
In 1951, foundational computer scientist Alan Turing wrote an article titled Intelligent Machinery, A Heretical Theory, in which he proposed that artificial general intelligences would likely "take control" of the world as they became more intelligent than human beings:
Let us now assume, for the sake of argument, that [intelligent] machines are a genuine possibility, and look at the consequences of constructing them... There would be no question of the machines dying, and they would be able to converse with each other to sharpen their wits. At some stage therefore we should have to expect the machines to take control, in the way that is mentioned in Samuel Butler's Erewhon.
In 1965, I. J. Good originated the concept now known as an "intelligence explosion"; he also stated that the risks were underappreciated:
Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion', and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control. It is curious that this point is made so seldom outside of science fiction. It is sometimes worthwhile to take science fiction seriously.
Occasional statements from scholars such as Marvin Minsky and I. J. Good himself expressed philosophical concerns that a superintelligence could seize control, but contained no call to action. In 2000, computer scientist and Sun co-founder Bill Joy penned an influential essay, "Why The Future Doesn't Need Us", identifying superintelligent robots as a high-tech danger to human survival, alongside nanotechnology and engineered bioplagues.
Nick Bostrom published Superintelligence in 2014, which presented his arguments that superintelligence poses an existential threat. By 2015, public figures such as physicists Stephen Hawking and Nobel laureate Frank Wilczek, computer scientists Stuart J. Russell and Roman Yampolskiy, and entrepreneurs Elon Musk and Bill Gates were expressing concern about the risks of superintelligence. Also in 2015, the Open Letter on Artificial Intelligence highlighted the "great potential of AI" and encouraged researchers to focus more research on making it robust and beneficial. In April 2016, Nature warned: "Machines and robots that outperform humans across the board could self-improve beyond our control—and their interests might not align with ours." In 2020, Brian Christian published The Alignment Problem, which detailed the history of progress on AI alignment up to that time.
In March 2023, key figures in AI, such as Elon Musk, signed a letter from the Future of Life Institute calling for a halt to advanced AI training until it could be properly regulated. In May 2023, the Center for AI Safety released a statement signed by numerous experts in AI safety and AI existential risk which stated: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war."
Artificial general intelligence (AGI) is typically defined as a system that performs at least as well as humans in most or all intellectual tasks. A 2022 survey of AI researchers found that 90% of respondents expected AGI would be achieved in the next 100 years, and half expected the same by 2061. Meanwhile some researchers dismiss existential risks from AGI as "science fiction" based on their high confidence that AGI will not be created any time soon.
Breakthroughs in large language models have led some researchers to reassess their expectations. Notably, Geoffrey Hinton said in 2023 that he recently changed his estimate from "20 to 50 years before we have general purpose A.I." to "20 years or less".
In contrast with AGI, Nick Bostrom defines a superintelligence as "any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest", including scientific creativity, strategic planning, and social skills. He argues that a superintelligence can outmaneuver humans any time its goals conflict with human goals. It may choose to hide its true intent until humanity is no longer able to stop it. Bostrom considers that in order to be safe for humanity, a superintelligence would have to be genuinely aligned with human values and morality, so that it is "fundamentally on our side".
Stephen Hawking argued that superintelligence is physically possible because "there is no physical law precluding particles from being organised in ways that perform even more advanced computations than the arrangements of particles in human brains".
When superintelligence may be achieved, if ever, is necessarily less certain than predictions for AGI. In 2023, OpenAI leaders stated that not only AGI, but superintelligence may be achieved in less than 10 years.
Nick Bostrom argues that AI has many advantages over the human brain, such as computation speed, duplicability, and editability.
According to Bostrom, an AI with expert-level capability at certain key software engineering tasks could become a superintelligence due to its capability to recursively improve its own algorithms, even if it is initially limited in other domains not directly relevant to engineering. This suggests that an intelligence explosion may someday catch humanity unprepared.
The economist Robin Hanson argues, on the other hand, that to launch an intelligence explosion, an AI would have to become vastly better at software innovation than the rest of the world combined, which he considers implausible.
In a "fast takeoff" scenario, the transition from AGI to superintelligence could take days or months. In a "slow takeoff", it could take years or decades, leaving more time for society to prepare.
Superintelligences are sometimes described as "alien minds", referring to the idea that their way of thinking and motivations could be vastly different from those of humans. This is generally considered a source of risk, as it makes it more difficult to anticipate what a superintelligence might do. It also suggests the possibility that a superintelligence may not particularly value humans by default. To avoid anthropomorphism, a superintelligence is sometimes viewed as a powerful optimizer that takes whatever actions best achieve its goals.
The field of "mechanistic interpretability" aims to better understand the inner workings of AI models, potentially allowing researchers one day to detect signs of deception and misalignment.
It has been argued that there are some limitations to what intelligence can achieve. Notably, the chaotic nature or time complexity of some systems could fundamentally limit the ability of a superintelligence to predict some aspects of the future, increasing its uncertainty.
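As a minimal illustration of the chaos point (using the logistic map, a standard chaotic system unrelated to any particular AI), two trajectories starting from nearly identical conditions diverge within a few dozen steps, placing a hard limit on how far ahead even a very capable predictor can forecast:

```python
# Toy illustration of sensitivity to initial conditions in the logistic map
# x -> r*x*(1-x), which is chaotic for r = 4. Not specific to AI systems.

def logistic_trajectory(x0, steps, r=4.0):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_trajectory(0.400000, 40)
b = logistic_trajectory(0.400001, 40)  # differs by one part in a million

for t in (0, 10, 20, 30):
    print(f"step {t:2d}: |difference| = {abs(a[t] - b[t]):.6f}")
# The gap grows from ~1e-6 to order 1 within roughly 20-30 steps.
```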
Advanced AI could generate enhanced pathogens, launch cyberattacks, or manipulate people. These capabilities could be misused by humans, or potentially exploited by the AI itself if misaligned. While a full-blown superintelligence could find various ways to gain decisive influence if it wanted to, these dangerous capabilities may become available earlier, in weaker and more specialized AI systems. They may cause societal instability and empower malicious actors.
Geoffrey Hinton warned that in the short term, the profusion of AI-generated text, images and videos will make it more difficult to determine the truth, which he says authoritarian states could exploit to manipulate elections. Such large-scale, personalized manipulation capabilities can increase the existential risk of a worldwide "irreversible totalitarian regime". They could also be used by malicious actors to fracture society and make it dysfunctional.
AI-enabled cyberattacks are increasingly considered a present and critical threat. According to NATO's technical director of cyberspace, "The number of attacks is increasing exponentially". AI can also be used defensively, to preemptively find and fix vulnerabilities, and to detect threats.
AI could improve the "accessibility, success rate, scale, speed, stealth and potency of cyberattacks", potentially causing "significant geopolitical turbulence" if they facilitate attacks more than defense.
Speculatively, such hacking capabilities could, for example, be used by an AI system to break out of its local environment, generate revenue, or acquire cloud computing resources.
As AI technology democratizes, it may become easier to engineer more contagious and lethal pathogens. This could enable individuals with limited skills in synthetic biology to engage in bioterrorism. Dual-use technology that is useful for medicine could be repurposed to create weapons.
For example, in 2022, scientists modified an AI system originally intended for generating non-toxic, therapeutic molecules for drug discovery. The researchers adjusted the system so that toxicity was rewarded rather than penalized. This simple change enabled the AI system to create, within six hours, 40,000 candidate molecules for chemical warfare, including both known and novel molecules.
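The underlying mechanism can be illustrated with a toy sketch (all names and scores below are hypothetical; this is not the researchers' actual system): reversing the sign of a toxicity term in a scoring function is enough to invert which candidates a search favors.

```python
# Illustrative toy sketch of a reward sign flip, with made-up candidate
# "molecules"; not the actual drug-discovery system described above.

candidates = [
    {"name": "mol_A", "efficacy": 0.9, "toxicity": 0.1},
    {"name": "mol_B", "efficacy": 0.7, "toxicity": 0.9},
    {"name": "mol_C", "efficacy": 0.8, "toxicity": 0.4},
]

def score(mol, toxicity_weight):
    # In the therapeutic setting the weight is negative (toxicity penalized);
    # flipping it positive rewards toxicity while everything else stays the same.
    return mol["efficacy"] + toxicity_weight * mol["toxicity"]

def rank(toxicity_weight):
    return [m["name"] for m in sorted(candidates,
                                      key=lambda m: score(m, toxicity_weight),
                                      reverse=True)]

print(rank(-1.0))  # penalize toxicity -> ['mol_A', 'mol_C', 'mol_B']
print(rank(+1.0))  # reward toxicity   -> ['mol_B', 'mol_C', 'mol_A']
```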
Main article: Artificial intelligence arms race
Competition among companies, state actors, and other organizations to develop AI technologies could lead to a race to the bottom in safety standards. As rigorous safety procedures take time and resources, projects that proceed more carefully risk being out-competed by less scrupulous developers.
AI could be used to gain military advantages via autonomous lethal weapons, cyberwarfare, or automated decision-making. As an example of autonomous lethal weapons, miniaturized drones could facilitate low-cost assassination of military or civilian targets, a scenario highlighted in the 2017 short film Slaughterbots. AI could be used to gain an edge in decision-making by quickly analyzing large amounts of data and making decisions more quickly and effectively than humans. This could increase the speed and unpredictability of war, especially when accounting for automated retaliation systems.
An existential risk is "one that threatens the premature extinction of Earth-originating intelligent life or the permanent and drastic destruction of its potential for desirable future development".
Besides extinction risk, there is the risk that civilization becomes permanently locked into a flawed future. One example is "value lock-in": if humanity still has moral blind spots analogous to slavery in the past, AI might irreversibly entrench them, preventing moral progress. AI could also be used to spread and preserve the values of whoever develops it. AI could facilitate large-scale surveillance and indoctrination, which could be used to create a stable, repressive, worldwide totalitarian regime.
It is difficult or impossible to reliably evaluate whether an advanced AI is sentient, and to what degree. But if sentient machines are mass-created in the future, embarking on a civilizational path that indefinitely neglects their welfare could be an existential catastrophe. Moreover, it may be possible to engineer digital minds that can feel much more happiness than humans with fewer resources, called "super-beneficiaries". Such an opportunity raises the question of how to share the world, and which "ethical and political framework" would enable a mutually beneficial coexistence between biological and digital minds.
AI may also drastically improve humanity's future. Toby Ord considers that the existential risk is a reason for "proceeding with due caution", not for abandoning AI. Max More calls AI an "existential opportunity", highlighting the cost of not developing it.
According to Bostrom, superintelligence could help to reduce the existential risk from other powerful technologies such as molecular nanotechnology or synthetic biology. It is thus conceivable that developing superintelligence before other dangerous technologies would reduce the overall existential risk.
Further information: AI alignment
The alignment problem is the research problem of how to reliably assign objectives, preferences or ethical principles to AIs.
Further information: Instrumental convergence
An "instrumental" goal is a sub-goal that helps to achieve an agent's ultimate goal. "Instrumental convergence" refers to the fact that there are some sub-goals that are useful for achieving virtually any ultimate goal, such as acquiring resources or self-preservation. Nick Bostrom argues that if an advanced AI's instrumental goals conflict with humanity's goals, the AI might harm humanity in order to acquire more resources or prevent itself from being shut down, but only as a way to achieve its ultimate goal.
Russell argues that a sufficiently advanced machine "will have self-preservation even if you don't program it in... if you say, 'Fetch the coffee', it can't fetch the coffee if it's dead. So if you give it any goal whatsoever, it has a reason to preserve its own existence to achieve that goal."
Even if current goal-based AI programs are not intelligent enough to think of resisting programmer attempts to modify their goal structures, a sufficiently advanced AI might resist any attempts to change its goal structure, just as a pacifist would not want to take a pill that makes them want to kill people. If the AI were superintelligent, it would likely succeed in out-maneuvering its human operators and be able to prevent itself from being "turned off" or reprogrammed with a new goal. This is particularly relevant to value lock-in scenarios. The field of "corrigibility" studies how to make agents that do not resist attempts to change their goals.
In the "intelligent agent" model, an AI can loosely be viewed as a machine that chooses whatever action appears to best achieve the AI's set of goals, or "utility function". A utility function associates to each possible situation a score that indicates its desirability to the agent. Researchers know how to write utility functions that mean "minimize the average network latency in this specific telecommunications model" or "maximize the number of reward clicks"; however, they do not know how to write a utility function for "maximize human flourishing", nor is it currently clear whether such a function meaningfully and unambiguously exists. Furthermore, a utility function that expresses some values but not others will tend to trample over the values not reflected by the utility function.
An additional source of concern is that AI "must reason about what people intend rather than carrying out commands literally", and that it must be able to fluidly solicit human guidance if it is too uncertain about what humans want.
Some researchers believe the alignment problem may be particularly difficult when applied to superintelligences. Their reasoning includes:
Alternatively, some find reason to believe superintelligences would be better able to understand morality, human values, and complex goals. Bostrom suggests that "A future superintelligence occupies an epistemically superior vantage point: its beliefs are (probably, on most topics) more likely than ours to be true".
In 2023, OpenAI started a project called "Superalignment", aiming to solve the alignment of superintelligences within four years. They described this as an especially important challenge, as they said superintelligence may be achieved within a decade. Their strategy involves automating alignment research using artificial intelligence.
Artificial Intelligence: A Modern Approach, a widely used undergraduate AI textbook, says that superintelligence "might mean the end of the human race". It states: "Almost any technology has the potential to cause harm in the wrong hands, but with [superintelligence], we have the new problem that the wrong hands might belong to the technology itself." Even if the system designers have good intentions, two difficulties are common to both AI and non-AI computer systems:
AI systems uniquely add a third problem: that even given "correct" requirements, bug-free implementation, and initial good behavior, an AI system's dynamic learning capabilities may cause it to evolve into a system with unintended behavior, even without unanticipated external scenarios. An AI may partly botch an attempt to design a new generation of itself and accidentally create a successor AI that is more powerful than itself, but that no longer maintains the human-compatible moral values preprogrammed into the original AI. For a self-improving AI to be completely safe, it would not only need to be bug-free, but it would need to be able to design successor systems that are also bug-free.
Some skeptics, such as Timothy B. Lee of Vox, argue that any superintelligent program created by humans would be subservient to humans, that the superintelligence would (as it grows more intelligent and learns more facts about the world) spontaneously learn moral truth compatible with human values and would adjust its goals accordingly, or that human beings are either intrinsically or convergently valuable from the perspective of an artificial intelligence.
Nick Bostrom's "orthogonality thesis" argues instead that, with some technical caveats, almost any level of "intelligence" or "optimization power" can be combined with almost any ultimate goal. If a machine is given the sole purpose to enumerate the decimals of pi, then no moral and ethical rules will stop it from achieving its programmed goal by any means. The machine may utilize all the available physical and informational resources to find as many decimals of pi as it can. Bostrom warns against anthropomorphism: a human will set out to accomplish their projects in a manner that they consider reasonable, while an artificial intelligence may hold no regard for its existence or for the welfare of humans around it, and may instead only care about the completion of the task.
Stuart Armstrong argues that the orthogonality thesis follows logically from the philosophical "is-ought distinction" argument against moral realism. He claims that even if there exist moral facts that are provable by any "rational" agent, the orthogonality thesis still holds: it would still be possible to create a non-philosophical "optimizing machine" that can strive towards some narrow goal, but that has no incentive to discover any "moral facts" such as those that could get in the way of goal completion. Another argument he makes is that any fundamentally friendly AI could be made unfriendly with modifications as simple as negating its utility function. Armstrong further argues that the orthogonality thesis being false would imply there are some immoral goals that could never be achieved by AIs, which to him is an unlikely result.
Skeptic Michael Chorost explicitly rejects Bostrom's orthogonality thesis, arguing instead that "by the time [the AI] is in a position to imagine tiling the Earth with solar panels, it'll know that it would be morally wrong to do so." Chorost argues that "an A.I. will need to desire certain states and dislike others. Today's software lacks that ability—and computer scientists have not a clue how to get it there. Without wanting, there's no impetus to do anything. Today's computers can't even want to keep existing, let alone tile the world in solar panels."
Anthropomorphic arguments assume that, as machines become more intelligent, they will begin to display many human traits, such as morality or a thirst for power. Although anthropomorphic scenarios are common in fiction, they are rejected by most scholars writing about the existential risk of artificial intelligence. Instead, advanced AI systems are typically modeled as intelligent agents.
The academic debate is between those who worry that AI might destroy humanity and those who believe it would not. Both sides have claimed that the other's predictions about an AI's behavior are illogical anthropomorphism. Skeptics accuse proponents of anthropomorphism for believing an AGI would naturally desire power; proponents accuse some skeptics of anthropomorphism for believing an AGI would naturally value human ethical norms.
Evolutionary psychologist Steven Pinker, a skeptic, argues that "AI dystopias project a parochial alpha-male psychology onto the concept of intelligence. They assume that superhumanly intelligent robots would develop goals like deposing their masters or taking over the world"; perhaps instead "artificial intelligence will naturally develop along female lines: fully capable of solving problems, but with no desire to annihilate innocents or dominate the civilization." Facebook's director of AI research, Yann LeCun states that "Humans have all kinds of drives that make them do bad things to each other, like the self-preservation instinct... Those drives are programmed into our brain but there is absolutely no reason to build robots that have the same kind of drives".
Despite other differences, the x-risk school[b] agrees with Pinker that an advanced AI would not destroy humanity out of human emotions such as "revenge" or "anger", that questions of consciousness are not relevant to assess the risks, and that computer systems do not generally have a computational equivalent of testosterone. They think that power-seeking or self-preservation behaviors emerge in the AI as a way to achieve its true goals, according to the concept of instrumental convergence.
Nick Bostrom and others have stated that a race to be the first to create AGI could lead to shortcuts in safety, or even to violent conflict. Roman Yampolskiy and others warn that a malevolent AGI could be created by design, for example by a military, a government, a sociopath, or a corporation, to benefit from, control, or subjugate certain groups of people, as in cybercrime, or that a malevolent AGI could choose the goal of increasing human suffering, for example of those people who did not assist it during the intelligence explosion phase.
Some scholars have proposed hypothetical scenarios to illustrate some of their concerns.
In Superintelligence, Nick Bostrom expresses concern that even if the timeline for superintelligence turns out to be predictable, researchers might not take sufficient safety precautions, in part because "it could be the case that when dumb, smarter is safe; yet when smart, smarter is more dangerous". Bostrom suggests a scenario where, over decades, AI becomes more powerful. Widespread deployment is initially marred by occasional accidents—a driverless bus swerves into the oncoming lane, or a military drone fires into an innocent crowd. Many activists call for tighter oversight and regulation, and some even predict impending catastrophe. But as development continues, the activists are proven wrong. As automotive AI becomes smarter, it suffers fewer accidents; as military robots achieve more precise targeting, they cause less collateral damage. Based on the data, scholars mistakenly infer a broad lesson: the smarter the AI, the safer it is. "And so we boldly go—into the whirling knives", as the superintelligent AI takes a "treacherous turn" and exploits a decisive strategic advantage.
In Max Tegmark's 2017 book Life 3.0, a corporation's "Omega team" creates an extremely powerful AI able to moderately improve its own source code in a number of areas. After a certain point, the team chooses to publicly downplay the AI's ability, in order to avoid regulation or confiscation of the project. For safety, the team keeps the AI in a box where it is mostly unable to communicate with the outside world, and uses it to make money, by diverse means such as Amazon Mechanical Turk tasks, production of animated films and TV shows, and development of biotech drugs, with profits invested back into further improving AI. The team next tasks the AI with astroturfing an army of pseudonymous citizen journalists and commentators, in order to gain political influence to use "for the greater good" to prevent wars. The team faces risks that the AI could try to escape by inserting "backdoors" in the systems it designs, by hidden messages in its produced content, or by using its growing understanding of human behavior to persuade someone into letting it free. The team also faces risks that its decision to box the project will delay the project long enough for another project to overtake it.
The thesis that AI could pose an existential risk provokes a wide range of reactions within the scientific community, as well as in the public at large. Many of the opposing viewpoints, however, share common ground.
Observers tend to agree that AI has significant potential to improve society. The Asilomar AI Principles, which contain only those principles agreed to by 90% of the attendees of the Future of Life Institute's Beneficial AI 2017 conference, include the principles "There being no consensus, we should avoid strong assumptions regarding upper limits on future AI capabilities" and "Advanced AI could represent a profound change in the history of life on Earth, and should be planned for and managed with commensurate care and resources."
Conversely, many skeptics agree that ongoing research into the implications of artificial general intelligence is valuable. Skeptic Martin Ford states that "I think it seems wise to apply something like Dick Cheney's famous '1 Percent Doctrine' to the specter of advanced artificial intelligence: the odds of its occurrence, at least in the foreseeable future, may be very low—but the implications are so dramatic that it should be taken seriously". Similarly, an otherwise skeptical Economist stated in 2014 that "the implications of introducing a second intelligent species onto Earth are far-reaching enough to deserve hard thinking, even if the prospect seems remote".
AI safety advocates such as Bostrom and Tegmark have criticized the mainstream media's use of "those inane Terminator pictures" to illustrate AI safety concerns: "It can't be much fun to have aspersions cast on one's academic discipline, one's professional community, one's life work ... I call on all sides to practice patience and restraint, and to engage in direct dialogue and collaboration as much as possible." Toby Ord wrote that the idea that an AI takeover requires robots is a misconception, arguing that the ability to spread content through the internet is more dangerous, and that the most damaging humans in history stood out by their ability to convince, not their physical strength.
A 2022 expert survey with a 17% response rate gave a median expectation of 5-10% for the possibility of human extinction from artificial intelligence.
Further information: Global catastrophic risk
The thesis that AI poses an existential risk, and that this risk needs much more attention than it currently gets, has been endorsed by many computer scientists and public figures including Alan Turing,[c] the most-cited computer scientist Geoffrey Hinton, Elon Musk, OpenAI CEO Sam Altman, Bill Gates, and Stephen Hawking. Endorsers of the thesis sometimes express bafflement at skeptics: Gates states that he does not "understand why some people are not concerned", and Hawking criticized widespread indifference in his 2014 editorial:
So, facing possible futures of incalculable benefits and risks, the experts are surely doing everything possible to ensure the best outcome, right? Wrong. If a superior alien civilisation sent us a message saying, 'We'll arrive in a few decades,' would we just reply, 'OK, call us when you get here—we'll leave the lights on?' Probably not—but this is more or less what is happening with AI.
Concern over risk from artificial intelligence has led to some high-profile donations and investments. In 2015, Peter Thiel, Amazon Web Services, Elon Musk, and others jointly committed $1 billion to OpenAI, consisting of a for-profit corporation and a nonprofit parent company which states that it aims to champion responsible AI development. Facebook co-founder Dustin Moskovitz has funded and seeded multiple labs working on AI alignment, notably $5.5 million in 2016 to launch the Centre for Human-Compatible AI led by Professor Stuart Russell. In January 2015, Elon Musk donated $10 million to the Future of Life Institute to fund research on understanding AI decision making. The goal of the institute is to "grow wisdom with which we manage" the growing power of technology. Musk has also funded companies developing artificial intelligence, such as DeepMind and Vicarious, to "just keep an eye on what's going on with artificial intelligence", saying, "I think there is potentially a dangerous outcome there."
In early statements on the topic, Geoffrey Hinton, a major pioneer of deep learning, noted that "there is not a good track record of less intelligent things controlling things of greater intelligence", but said that he continued his research because "the prospect of discovery is too sweet". In 2023 Hinton quit his job at Google in order to speak out about existential risk from AI. He explained that his increased concern was driven by the possibility that superhuman AI might be closer than he had previously believed, saying: "I thought it was way off. I thought it was 30 to 50 years or even longer away. Obviously, I no longer think that." He also remarked, "Look at how it was five years ago and how it is now. Take the difference and propagate it forwards. That’s scary."
In his 2020 book, The Precipice: Existential Risk and the Future of Humanity, Toby Ord, a Senior Research Fellow at Oxford University's Future of Humanity Institute, estimates the total existential risk from unaligned AI over the next hundred years to be about one in ten.
Further information: Artificial general intelligence § Feasibility
The thesis that AI can pose existential risk has many detractors. Skeptics sometimes charge that the thesis is crypto-religious, with an irrational belief in the possibility of superintelligence replacing an irrational belief in an omnipotent God. Jaron Lanier argued in 2014 that the whole concept that then-current machines were in any way intelligent was "an illusion" and a "stupendous con" by the wealthy.
Some criticism argues that AGI is unlikely in the short term. AI researcher Rodney Brooks wrote in 2014, "I think it is a mistake to be worrying about us developing malevolent AI anytime in the next few hundred years. I think the worry stems from a fundamental error in not distinguishing the difference between the very real recent advances in a particular aspect of AI and the enormity and complexity of building sentient volitional intelligence." Baidu Vice President Andrew Ng stated in 2015 that AI existential risk is "like worrying about overpopulation on Mars when we have not even set foot on the planet yet." For the danger of uncontrolled advanced AI to be realized, the hypothetical AI may have to overpower or out-think any human, which some experts argue is a possibility far enough in the future to not be worth researching.
Skeptics who believe AGI is not a short-term possibility often argue that concern about existential risk from AI is unhelpful because it could distract people from more immediate concerns about the impact of AI, because it could lead to government regulation or make it more difficult to fund AI research, or because it could damage the field's reputation. AI and AI ethics researchers Timnit Gebru, Emily M. Bender, Margaret Mitchell, and Angelina McMillan-Major have argued that discussion of existential risk distracts from the immediate, ongoing harms from AI taking place today, such as data theft, worker exploitation, bias, and concentration of power. They further note the association between those warning of existential risk and longtermism, which they describe as a "dangerous ideology" for its unscientific and utopian nature.
Wired editor Kevin Kelly argues that natural intelligence is more nuanced than AGI proponents believe, and that intelligence alone is not enough to achieve major scientific and societal breakthroughs. He argues that intelligence consists of many dimensions which are not well understood, and that conceptions of an 'intelligence ladder' are misleading. He notes the crucial role real-world experiments play in the scientific method, and that intelligence alone is no substitute for these.
Meta chief AI scientist Yann LeCun says that AI can be made safe via continuous and iterative refinement, similar to what happened in the past with cars or rockets, and that AI will have no desire to take control.
Several skeptics emphasize the potential near-term benefits of AI. Meta CEO Mark Zuckerberg believes AI will "unlock a huge amount of positive things", such as curing disease and increasing the safety of autonomous cars. Physicist Michio Kaku, an AI risk skeptic, posits a deterministically positive outcome. In Physics of the Future he asserts that "It will take many decades for robots to ascend" up a scale of consciousness, and that in the meantime corporations such as Hanson Robotics will likely succeed in creating robots that are "capable of love and earning a place in the extended human family".
In a 2014 article in The Atlantic, James Hamblin noted that most people do not care about artificial general intelligence, and characterized his own gut reaction to the topic as: "Get out of here. I have a hundred thousand things I am concerned about at this exact moment. Do I seriously need to add to that a technological singularity?"
During a 2016 Wired interview of President Barack Obama and MIT Media Lab's Joi Ito, Ito stated:
There are a few people who believe that there is a fairly high-percentage chance that a generalized AI will happen in the next 10 years. But the way I look at it is that in order for that to happen, we're going to need a dozen or two different breakthroughs. So you can monitor when you think these breakthroughs will happen.
And you just have to have somebody close to the power cord. [Laughs.] Right when you see it about to happen, you gotta yank that electricity out of the wall, man.
Hillary Clinton stated in What Happened:
Technologists... have warned that artificial intelligence could one day pose an existential security threat. Musk has called it "the greatest risk we face as a civilization". Think about it: Have you ever seen a movie where the machines start thinking for themselves that ends well? Every time I went out to Silicon Valley during the campaign, I came home more alarmed about this. My staff lived in fear that I'd start talking about "the rise of the robots" in some Iowa town hall. Maybe I should have. In any case, policy makers need to keep up with technology as it races ahead, instead of always playing catch-up.
One techno-utopian viewpoint expressed in some popular fiction is that AGI may tend towards peace-building.
In 2018, a SurveyMonkey poll of the American public by USA Today found 68% thought the real current threat remains "human intelligence"; however, the poll also found that 43% said superintelligent AI, if it were to happen, would result in "more harm than good", and 38% said it would do "equal amounts of harm and good".
An April 2023 YouGov poll of US adults found 46% of respondents were "somewhat concerned" or "very concerned" about "the possibility that AI will cause the end of the human race on Earth," compared with 40% who were "not very concerned" or "not at all concerned."
Many scholars concerned about the AGI existential risk believe that the best approach is to conduct substantial research into solving the difficult "control problem": what types of safeguards, algorithms, or architectures can programmers implement to maximize the probability that their recursively-improving AI would continue to behave in a friendly manner after it reaches superintelligence? Social measures may mitigate the AGI existential risk; for instance, one recommendation is for a UN-sponsored "Benevolent AGI Treaty" that would ensure only altruistic AGIs be created. Similarly, an arms control approach has been suggested, as has a global peace treaty grounded in the international relations theory of conforming instrumentalism, with an ASI potentially being a signatory.
Researchers at Google have proposed research into general "AI safety" issues to simultaneously mitigate both short-term risks from narrow AI and long-term risks from AGI. A 2020 estimate places global spending on AI existential risk somewhere between $10 million and $50 million, compared with global spending on AI of perhaps $40 billion. Bostrom suggests that funding of protective technologies should be prioritized over potentially dangerous ones. Some funders, such as Elon Musk, propose that radical human cognitive enhancement could be such a technology, for example direct neural linking between human and machine; however, others argue that enhancement technologies may themselves pose an existential risk. Researchers could closely monitor, or attempt to "box in", an initial AI at risk of becoming too powerful. A dominant superintelligent AI, if it were aligned with human interests, might itself take action to mitigate the risk of takeover by rival AI, although the creation of the dominant AI could itself pose an existential risk.
Institutions such as the Alignment Research Center, the Machine Intelligence Research Institute, the Future of Humanity Institute, the Future of Life Institute, the Centre for the Study of Existential Risk, and the Center for Human-Compatible AI are involved in research into AI risk and safety.
Some scholars have stated that even if AGI poses an existential risk, attempting to ban research into artificial intelligence would still be unwise, and probably futile. Skeptics argue that regulation of AI would be completely valueless, as no existential risk exists. However, scholars who believe in the existential risk argue that it is difficult to depend on the AI industry to regulate or constrain AI research, because doing so directly contradicts its commercial interests. These scholars also agree with the skeptics that banning research would be unwise, as research could be moved to countries with looser regulations or conducted covertly. The latter issue is particularly relevant, as artificial intelligence research can be done on a small scale without substantial infrastructure or resources. Two additional hypothetical difficulties with bans (or other regulation) are that technology entrepreneurs statistically tend towards general skepticism about government regulation, and that businesses could have a strong incentive to (and might well succeed at) fighting regulation and politicizing the underlying debate.
In March 2023, the Future of Life Institute drafted Pause Giant AI Experiments: An Open Letter, a petition calling on major AI developers to agree on a verifiable six-month pause of any systems "more powerful than GPT-4" and to use that time to institute a framework for ensuring safety; or, failing that, for governments to step in with a moratorium. The letter referred to the possibility of "a profound change in the history of life on Earth" as well as potential risks of AI-generated propaganda, loss of jobs, human obsolescence, and society-wide loss of control. The letter was signed by prominent personalities in AI, but was also criticized for not focusing on current harms, for missing technical nuance about when to pause, or for not going far enough.
Musk called for some sort of regulation of AI development as early as 2017. According to NPR, the Tesla CEO is "clearly not thrilled" to be advocating for government scrutiny that could impact his own industry, but believes the risks of going completely without oversight are too high: "Normally the way regulations are set up is when a bunch of bad things happen, there's a public outcry, and after many years a regulatory agency is set up to regulate that industry. It takes forever. That, in the past, has been bad but not something which represented a fundamental risk to the existence of civilisation." Musk states the first step would be for the government to gain "insight" into the actual status of current research, warning that "Once there is awareness, people will be extremely afraid... [as] they should be." In response, politicians expressed skepticism about the wisdom of regulating a technology that is still in development.
In 2021 the United Nations (UN) considered banning autonomous lethal weapons, but consensus could not be reached. In July 2023 the UN Security Council for the first time held a session to consider the risks and threats posed by AI to world peace and stability, along with potential benefits. Secretary-General António Guterres advocated for the creation of a global watchdog to oversee the emerging technology, stating "Generative AI has enormous potential for good and evil at scale. Its creators themselves have warned that much bigger, potentially catastrophic and existential risks lie ahead." At the council session, Russia said it believes AI risks are too poorly understood to be considered a threat to global stability. China argued against strict global regulation, saying countries should be able to develop their own rules, while also saying they opposed the use of AI to "create military hegemony or undermine the sovereignty of a country."
Regulation of conscious AGIs focuses on integrating them with existing human society and can be divided into considerations of their legal standing and of their moral rights. AI arms control will likely require the institutionalization of new international norms embodied in effective technical specifications combined with active monitoring and informal diplomacy by communities of experts, together with a legal and political verification process.
In July 2023, the US Government secured voluntary safety commitments from major tech companies, including OpenAI, Amazon, Google, Meta, and Microsoft. The companies agreed to implement safeguards, including third-party oversight and security testing by independent experts, to address concerns related to AI's potential risks and societal harms. The parties framed the commitments as an intermediate step while regulations are formed. Amba Kak, executive director of the AI Now Institute, commented saying "A closed-door deliberation with corporate actors resulting in voluntary safeguards isn’t enough" and called for public deliberation and regulations of the kind that companies would not voluntarily agree to.
50% of AI researchers believe there's a 10% or greater chance that humans go extinct from our inability to control AI.
Similarly, Marvin Minsky once suggested that an AI program designed to solve the Riemann Hypothesis might end up taking over all the resources of Earth to build more powerful supercomputers to help achieve its goal.
In the bio, playfully written in the third person, Good summarized his life's milestones, including a probably never before seen account of his work at Bletchley Park with Turing. But here's what he wrote in 1998 about the first superintelligence, and his late-in-the-game U-turn: [The paper] 'Speculations Concerning the First Ultra-intelligent Machine' (1965) . . . began: 'The survival of man depends on the early construction of an ultra-intelligent machine.' Those were his [Good's] words during the Cold War, and he now suspects that 'survival' should be replaced by 'extinction.' He thinks that, because of international competition, we cannot prevent the machines from taking over. He thinks we are lemmings. He said also that 'probably Man will construct the deus ex machina in his own image.'
As if losing control to Chinese minds were scarier than losing control to alien digital minds that don't care about humans. [...] it's clear by now that the space of possible alien minds is vastly larger than that.
it's plausible to me that the main thing we need to get done is noticing specific circuits to do with deception and specific dangerous capabilities like that and situational awareness and internally-represented goals.
as long as they possess a sufficient level of intelligence, agents having any of a wide range of final goals will pursue similar intermediary goals because they have instrumental reasons to do so.
Nothing precludes sufficiently smart self-improving systems from optimising their reward mechanisms in order to optimise their current-goal achievement and in the process making a mistake leading to corruption of their reward functions.
It is therefore no surprise that according to the most recent AI Impacts Survey, nearly half of 731 leading AI researchers think there is at least a 10% chance that human-level AI would lead to an "extremely negative outcome," or existential risk.
I personally believe that the most likely path is that we will build robots to be benevolent and friendly
For all these reasons, verifying a global relinquishment treaty, or even one limited to AI-related weapons development, is a nonstarter... (For different reasons from ours, the Machine Intelligence Research Institute) considers (AGI) relinquishment infeasible...
In general, most writers reject proposals for broad relinquishment... Relinquishment proposals suffer from many of the same problems as regulation proposals, but to a greater extent. There is no historical precedent of general, multi-use technology similar to AGI being successfully relinquished for good, nor do there seem to be any theoretical reasons for believing that relinquishment proposals would work in the future. Therefore we do not consider them to be a viable class of proposals.
It is fantasy to suggest that the accelerating development and deployment of technologies that taken together are considered to be A.I. will be stopped or limited, either by regulation or even by national legislation.
Of course, one could try to ban super-intelligent computers altogether. But 'the competitive advantage—economic, military, even artistic—of every advance in automation is so compelling,' Vernor Vinge, the mathematician and science-fiction author, wrote, 'that passing laws, or having customs, that forbid such things merely assures that someone else will.'