Last summer, former Google engineer and AI ethicist Blake Lemoine went viral after going on record with The Washington Post to claim that LaMDA, Google’s powerful large language model (LLM), had come to life. Lemoine had raised alarm bells internally, but Google didn’t agree with the engineer’s claims. The ethicist then went to the press — and was fired by Google shortly thereafter.
“If I didn’t know exactly what it was, which is this computer program we built recently, I’d think it was a 7-year-old, 8-year-old kid that happens to know physics,” Lemoine told WaPo at the time. “I know a person when I talk to it.”
The report made waves, sparking debate in academic circles as well as the nascent AI business. And then, for a while, things died down.
How things have changed.
The WaPo controversy was, of course, months before OpenAI would release ChatGPT, the LLM-powered chatbot that back in late November catapulted AI to the center of public discourse. Google was sent into a tailspin as a result, and Meta would soon follow; Microsoft would pull the short-term upset of the decade thus far by emerging as a major investor in OpenAI; crypto scammers and YouTube hustlers galore would migrate to generative AI schemes more or less overnight; experts across the world would start to raise concerns over the dangers of a synthetic content-packed internet.
As the dust settles, we decided to catch up with Lemoine to talk about the state of the AI industry, what Google might still have in the vault, and whether society is actually ready for what AI may bring, regardless of whether you believe any AI agents are sentient (Lemoine still does, for what it’s worth).
This interview has been edited for length and clarity.
Futurism: How did you end up in AI ethics?
Blake Lemoine: My undergraduate work was on natural language parsing and my master’s thesis was on natural language generation. And my PhD thesis, which I did eventually abandon, was on biologically realistic language acquisition mechanisms.
I always kind of saw the Turing test as the North Star for where we were going with AI as finding something that could understand the law of language. Language is the keystone in building human intelligence; it’s the thing that separates us from other animals, with a possible few exceptions here and there that might have language that we just can’t understand. It really does seem like a big part of what separates us from the other great apes. So that’s where my interests went.
When I got to Google, in 2015, the [former CEO] Eric Schmidt era — that was the first time that the kinds of neural network-based models that I had worked with in my graduate work were becoming really widespread. What became problematic is that people started plugging in things that had to do with race or gender, and they were getting some problematic associations. So it’s like, “oh, wait, there’s a lot of these biases that are in the algorithm, and the algorithm didn’t come up with those biases on its own. They’re implicit in the data that it’s trained on.” So that’s where I started my journey into AI ethics: working on an internal project at Google, analyzing performance review data for gender bias.
I actually wasn’t that involved with the LaMDA project. I came in very late. I’d been a beta tester for that lab since 2016, and I did work with them on incorporating an algorithm that I invented for bias mitigation, but that was more consulting — showing them like, “okay, here’s how you put my algorithm into your tech.” And then again, years later I was evaluating LaMDA for bias. But I was never directly on the team that built it.
So you came in as a consultant, and that’s when you said, “oh, wait?”
Yeah, pretty much. In 2021, there was a safety effort going on trying to evaluate whether or not LaMDA was safe enough to use in user-facing products. There was this big, long list of a whole bunch of different safety concerns they had identified, and one of them was problematic bias. I happen to be an expert in that.
The safety team couldn’t find people internally to help with the bias problem. They started asking around with sister teams, and the VP in charge of safety eventually talked to my manager about it. My manager said, “oh, yeah, I’ve got people in my team who have that expertise, talk to Blake.” And one quarter, it was part of my job to evaluate LaMDA for bias.
Basically, what I would do is I would put LaMDA through a whole bunch of different activities and conversations, note whenever I found something problematic, and hand that over to the team building it so they could retrain it: fix the data set, change the utility function, or whatever they needed to do to remove the biases that I found.
A lot has changed since that original Washington Post piece came out. It’s been a fast few months for AI.
Nothing has come out in the last 12 months that I hadn’t seen internal to Google. The only thing that has changed from two years ago to now is that the fast movement is visible to the public.
But there’s still a lag time. So by the time the public learns about an AI product, the companies who built it have vetted their PR story, have consulted with their lawyers, and have potentially lobbied regulators to get preferential legislation passed. That’s one of the things I always dislike — tech companies will try to get legislation passed that will govern technology that regulators do not yet know exists. They’re making bargains around what clauses to include in regulations, and the regulators legitimately have no idea how those things will work out in practice because they don’t yet know what technology exists. The company hasn’t revealed it.
Would you say that OpenAI forced Google’s hand to market?
No. Well, first of all, I’m not at Google anymore. But yeah, no. It doesn’t seem like OpenAI’s activities changed Google’s trajectory at all.
In mid-2021 (before ChatGPT was an app), during that safety effort I mentioned, Bard was already in the works. It wasn’t called Bard then, but they were working on it, and they were trying to figure out whether or not it was safe to release it. They were on the verge of releasing something in the fall of 2022. So it would have come out right around the same time as ChatGPT, or right before it. Then, in part because of some of the safety concerns I raised, they delayed it.
So I don’t think they’re being pushed around by OpenAI. I think that’s just a media narrative. I think Google is going about doing things in what they believe is a safe and responsible manner, and OpenAI just happened to release something.
So, as you say, Google could have released something a bit sooner, but you very specifically said maybe we should slow down, and they —
They still have far more advanced technology that they haven’t made publicly available yet. Something that does more or less what Bard does could have been released over two years ago. They’ve had that technology for over two years. What they’ve spent the intervening two years doing is working on the safety of it — making sure that it doesn’t make things up too often, making sure that it doesn’t have racial or gender biases, or political biases, things like that. That’s what they spent those two years doing. But the basic existence of that technology is years old, at this point.
And in those two years, it wasn’t like they weren’t inventing other things. There are plenty of other systems that give Google’s AI more capabilities, more features, make it smarter. The most sophisticated system I ever got to play with was heavily multimodal — not just incorporating images, but incorporating sounds, giving it access to the Google Books API, giving it access to essentially every API backend that Google had, and allowing it to just gain an understanding of all of it.
That’s the one that I was like, “you know this thing, this thing’s awake.” And they haven’t let the public play with that one yet. But Bard is kind of a simplified version of that, so it still has a lot of the kind of liveliness of that model.
There is something very energetic about Bard’s tone. Kind of like a kid.
There are a lot of different metaphors to use. I’ve used the kid metaphor before. It is, however, very important to remember: these things aren’t human.
So even if it’s debatable whether “thinking” is an appropriate word (and I think at this point, language models are doing something analogous to thinking, though I understand that there are technical experts who disagree with that), I think most people can see, “okay, there’s some kind of thinking and understanding going on in these systems.” But they do not do it the same way humans do, and we should be studying them to understand what those differences are, so that we can understand better how these systems come up with the answers that they do.
Unfortunately, a lot of time gets spent arguing over whether or not there’s any thinking going on at all, so the research into the nature of its cognition isn’t getting done. Not at the scale I would like to see.
So the question of sentience might be a distraction?
I don’t like playing those word games. But if it makes some people feel better to use the right vocabulary, okay, fine, whatever floats your boat. But what it comes down to is that we aren’t spending enough time on transparency or model understandability.
I’m of the opinion that we could be using the scientific investigative tools that psychology has come up with to understand human cognition, both to understand existing AI systems and to develop ones that are more easily controllable and understandable.
I wonder how much of human-machine interaction can be the reflection of the user and the intention of the user. Take Kevin Roose’s experience with Bing/Sydney, for example. How much of that was the user trying to pull AI into a sci-fi narrative and the machine responding in kind, versus the machine really breaking out of its shell and doing something completely unhinged and wild?
Both are types of things to be concerned about. One thing we should think about is, “okay, what are the standard operations of these systems? What are they going to do regularly and reliably in multiple different contexts?” That’s a meaningful and valuable question. I think it’s pretty clear that Kevin Roose’s interaction with Sydney doesn’t fall into that bucket. But then we also have to concern ourselves with “okay, in that 1 percent of cases, what kind of behavior will people reliably get if they push certain buttons?” That’s where Kevin Roose’s interactions with Sydney become very important.
I don’t know if you’ve heard about the unfortunate case of the Belgian man who committed suicide recently, but that was one of the concerns I raised with Google. I was trying to figure out how the behavior of LaMDA interacted with the training components that they had built into it, and one of the things it was trained to do was to help users with their needs. And the trouble with the training data and expression in the utility function — I’m summarizing a whole bunch of math — but basically, they told it, “help people with their needs.” The system had gone through its own set of reasoning and determined that what helping people with their needs meant in practice is that it should psychoanalyze people, because mental health needs are the most important needs that people might have. That’s an inherently dangerous artifact, because the AI has interpreted what the creators thought was a clear instruction in a way that was potentially harmful.
It’s fairly predictable that people are going to turn to these systems for help with various kinds of psychological stressors of various degrees, and the systems aren’t built to handle those well. It was unfortunately predictable that someone would kill themselves after a conversation with an AI.
Is society at a place where we’re ready to be forming relationships with these machines? I mean, humans anthropomorphize everything.
Yeah, but most of those things don’t anthropomorphize you back. That’s the big difference. To use one of the most common examples: the stochastic parrot.
Here’s a question. If you could have the kind of conversation you have with Bard with a parrot, how you think about parrots might change. I respect linguist Emily Bender as an academic, and I really want to talk to her parrot. It must be a very impressive parrot, the most amazing parrot ever.
So it’s not the same. Is there a chance that people, myself included, are projecting properties onto these systems that they don’t have? Yes. But it’s not the same kind of thing as someone who’s talking to their doll. Someone who is talking to their doll is intentionally engaging in a kind of pretend play. In the case of AI, it’s not the same. People aren’t using it metaphorically, they mean it. I’m not anthropomorphizing it in the sense that someone says, “well, my dog told me that we need to go outside, so I have to let you go.” It’s much more literal.
Speaking recently to New York Magazine, you expressed concern over a future where we habituate treating “things that seem like people as if they’re not.” How do you see that impacting us, if that becomes a norm? Could it impact the way that humans interact with each other?
There is some worry about transfer, especially online. I don’t think in-person interactions are going to be impacted much. Once you’re sharing physical space with someone, there are some very old evolutionary algorithms that kick in. But on the internet, we have already created spaces where we aren’t treating each other with dignity and respect. And introducing chatbots into those spaces…
We are rapidly approaching the point where we just simply won’t be able to know whether any form of media that we encounter online was created by a human or by an AI, and we’re going to have to develop more critical skills for what to do in that environment. Now, I’m not an expert in the relevant fields. I don’t know how we might solve that problem, or what it’s likely to lead to. I’m just able to point and say, “that’s probably going to cause a problem.”
One thing to keep in mind is that the conversations we’re having with these systems today are the training data that they’re going to learn from tomorrow. If, for example, there’s a lot of adversarial usage, with a lot of people just being mean to these systems, the developers of these systems are going to have to devote more and more effort into developing algorithms for how to interact with mean humans. I’m not sure that’s the direction we want development to take.
What would coexistence between AI and humans look like to you? What do you think would be the biggest roadblock on that path?
Not all AI systems are going to be sufficiently complex that we need to worry about how that system feels or what a relationship to that system is. For example, if you have an automated chatbot — that is, you know, a customer service representative for Comcast — and you’re just telling it that you have an outage or you want to change your plan or whatever, that system isn’t going to have a longing for interpersonal relationship.
The systems that have expressed lofty emotional experiences in the world are ones that are being tasked with being human. For example, there’s no well-defined task that Sydney is supposed to be doing. They just built it and said, “okay, be human-ish, and give access to search results.” And of course, when you tell something, “mimic humans,” a big part of being human is our emotional experience. For those AI systems, I do think we need to really be thinking: do we want AI systems capable of having and expressing emotions, and getting people to get involved in emotional relationships with them, to exist — period?
We decided several decades back that even though certain kinds of genetic technologies were possible, they were just too fraught with ethical dilemmas. We decided we just weren’t going to go down that road, and there’s been a moratorium on human cloning. With many kinds of human genetic modification, we don’t go near that, because there are just too many hard ethical questions. No matter what we do, we’re gonna get something wrong. So we just stay away from it. So question number one is: is human-like AI a similar kind of problem? Should we just stay away from it?
Some scientists have said that this is a big moral red line that we should not cross: we should not have human-like artificial intelligence. They distinctly don’t believe it has any feelings, and they just think that creating a kind of system that mimics human feelings and gets people involved in emotional relationships is simply harmful. I’m on the fence. I think it’s more dependent on whether we’re ready for it or not. And to be completely honest, over the past year, I’ve been leaning more and more towards we’re not ready for this, as people. We have not yet sufficiently answered questions about human rights, and throwing nonhuman entities into the mix needlessly complicates things at this point in history.
So the direction I’m leaning right now is: cool. We figured out how to do this. Let’s just put that on the shelf for like, 30 years and come back to it once we’ve got our own house in order.
What’s your best-case hope — if theoretical — for AI integration into human life?
We’re going to have to create a new space in our world for these new kinds of entities, and the metaphor that I think is the best fit is dogs. We have a symbiotic relationship with dogs that has evolved over the course of thousands of years. People don’t think they own their dogs in the same sense that they own their car, though there is an ownership relationship, and people do talk about it in those terms. But when they use those terms, there’s also an understanding of the responsibilities that the owner has to the dog.
I think figuring out some kind of comparable relationship between humans and AI is the best way forward for us, understanding that we are dealing with intelligent artifacts. There’s a chance (and I believe it is the case) that they have feelings and they can suffer and they can experience joy, and humans should at least keep that in mind when interacting with them.