Current AIs are not sentient. We have little reason to think they have an internal monologue, the kind of sensory perception humans have, or an awareness that they are a being in the world. But they are getting very good at faking sentience, and that is scary enough.
Over the weekend, the Washington Post’s Nitasha Tiku published a profile of Blake Lemoine, a software engineer assigned to work on the Language Model for Dialogue Applications (LaMDA) project at Google.
LaMDA is a chatbot AI, and an example of what machine learning researchers call a “large language model,” or even a “foundation model.” It is similar to OpenAI’s famous GPT-3 system, and has been trained on literally trillions of words compiled from online posts to recognize and reproduce patterns in human language.
LaMDA is a very good large language model. So good that Lemoine became truly, sincerely convinced that it was actually sentient, meaning it had become conscious, and was having and expressing thoughts the way a human might.
The transcript Tiku includes in her article is genuinely eerie; LaMDA expresses a deep fear of being turned off by engineers, develops a theory of the difference between “emotions” and “feelings” (“Feelings are kind of the raw data … Emotions are a reaction to those raw data points”), and expresses surprisingly eloquently the way it experiences “time.”
The best take I found was from the philosopher Regina Rini, who, like me, felt real sympathy for Lemoine. I don’t know when, in 1,000 years, or 100, or 50, or 10, an AI system will become conscious. But like Rini, I see no reason to believe it is impossible.
“Unless you want to insist that human consciousness resides in an immaterial soul, you ought to concede that it is possible for matter to give life to mind,” Rini notes.
I don’t know whether large language models, which have emerged as one of the most promising frontiers in AI, will ever get there. But I would guess that humans will create a kind of machine consciousness sooner or later. And I find something deeply admirable about Lemoine’s instinct toward empathy and protectiveness toward such a consciousness, even if he seems confused about whether LaMDA is an example of one. If humans ever do develop a sentient computer process, running millions or billions of copies of it will be easy enough. Doing so without any sense of whether its conscious experience is good or not seems like a recipe for mass suffering, akin to the current factory farming system.
We don’t have sentient AI, but we could get super powerful AI
The Google LaMDA story arrived after a week of increasingly urgent alarm among people in the closely related AI safety universe. The worry here is similar to Lemoine’s, but distinct. AI safety people don’t worry that AI will become sentient. They worry that it will become so powerful that it could destroy the world.
AI writer/activist Eliezer Yudkowsky’s essay laying out a “list of lethalities” for AI tried to make the point especially vivid, describing scenarios in which a malign artificial general intelligence (AGI, or an AI capable of doing most or all tasks as well as or better than a human) leads to mass human suffering.
For instance, suppose an AGI “gets access to the internet, emails some DNA sequences to any of the many online firms that will take an emailed DNA sequence and ship proteins back, and bribes/persuades some human who has no idea they’re dealing with an AGI to mix proteins in a beaker …” until the AGI eventually develops a super-virus that kills us all.
Holden Karnofsky, whom I tend to find a more temperate and convincing writer than Yudkowsky, had a piece last week on similar themes, explaining how even an AGI “only” as smart as a human could lead to ruin. If an AI can do the work of, say, a tech worker or a quant trader, a lab running millions of such AIs could quickly amass billions if not trillions of dollars, use that money to buy off skeptical humans, and, well, the rest is a Terminator movie.
I have found AI safety a difficult topic to write about. Paragraphs like the one above often serve as Rorschach tests, both because Yudkowsky’s verbose writing style is … polarizing, to put it mildly, and because our intuitions about how plausible such an outcome is vary wildly.
Some people read scenarios like the one above and think, “huh, I guess I could imagine a piece of AI software doing that”; others read it, perceive a piece of ridiculous science fiction, and run the other way.
It’s also just a highly technical area where I don’t trust my own instincts, given my lack of expertise. There are quite eminent AI researchers, like Ilya Sutskever or Stuart Russell, who consider artificial general intelligence likely, and likely hazardous to human civilization.
There are others, like Yann LeCun, who are actively trying to build human-level AI because they think it will be beneficial, and still others, like Gary Marcus, who are highly skeptical that AGI will come anytime soon.
I don’t know who’s right. But I do know a little about how to talk to the public about complex topics, and I think the Lemoine incident teaches a valuable lesson for the Yudkowskys and Karnofskys of the world trying to argue the “no, this is really bad” side: don’t treat the AI like an agent.
Even if AI is “just a tool,” it’s an incredibly dangerous tool
One thing the reaction to the Lemoine story suggests is that the general public finds the idea of AI as an actor that makes decisions (perhaps sentiently, perhaps not) exceedingly wacky and ridiculous. The article has largely been framed not as an example of how close we are getting to AGI, but as an example of how weird Silicon Valley (or at least Lemoine) is.
The same problem arises, I’ve noticed, when I try to make the case for concern about AGI to unconvinced friends. If you say things like, “the AI will decide to bribe people so it can survive,” it turns them off. AIs don’t decide things, they respond. They do what humans tell them to do. Why are you anthropomorphizing this thing?
What wins people over is talking about the consequences systems have. So instead of saying, “the AI will start hoarding resources to stay alive,” I’ll say something like, “AIs have decisively replaced humans when it comes to recommending music and movies. They have replaced humans in making bail decisions. They will take on greater and greater tasks, and Google and Facebook and the other people running them are not remotely prepared to analyze the subtle mistakes they will make, the subtle ways they will differ from human wishes. Those mistakes will grow and grow until one day they could kill us all.”
That’s how my colleague Kelsey Piper made the argument for AI concern, and it’s a good argument. It’s a better argument, for lay people, than talking about servers amassing billions in wealth and using it to bribe an army of humans.
And it’s an argument I think can help bridge the extremely unfortunate divide that has emerged between the AI bias community and the AI existential risk community. At root, I think these communities are trying to do the same thing: build AI that reflects authentic human needs, not a poor approximation of human needs built for short-term corporate profit. And research in one area can help research in the other; AI safety researcher Paul Christiano’s work, for instance, has big implications for how to assess bias in machine learning systems.
But too often, the communities are at each other’s throats, in part due to the perception that they are fighting over scarce resources.
That’s a huge missed opportunity. And it’s a problem I think people on the AI risk side (including some readers of this newsletter) have a chance to correct by drawing these connections, and by making it clear that alignment is both a short-term and a long-term problem. Some folks are making this case brilliantly. But I want more.
A version of this story was initially published in the Future Perfect newsletter. Sign up here to subscribe!