Intro
[music plays]
Niki: I'm Niki Christoff and welcome to Tech’ed Up. Today I’m talking to Washington Post reporter Nitasha Tiku. She’s the writer who broke the story a few months ago about that Google engineer, the one who believes that the company’s AI has become conscious. Spoiler alert: The machines are NOT coming alive. But we talk about why it kind of feels like they are, plus her latest reporting on text-to-image technology…I think it’s actually scarier than sentient computers because it’s already here.
Transcript:
Niki: I am so stoked about today's episode. We have Nitasha Tiku calling in from San Francisco. Welcome!
Nitasha: Thanks for having me.
Niki: So you are the tech culture reporter for the Washington Post based in the Bay Area.
You and I crossed paths. I had a little tech culture issue myself a couple of years ago [both chuckle], I think we met over Twitter DM, and I've been following your work, and you are a really well-sourced gumshoe of a reporter causing probably a lot of heartburn for tech PR people, and brava to you.
Nitasha: [laughing] Thank you.
Niki: But today, we're talking about two articles you've written that’ve sort of gone viral about artificial intelligence. And so the three things I'd like to do is one, just sort of tee up AI. Most people listening have the basics of it, but maybe think through how it's impacting our daily lives, how we interact with it now. And then, the two articles you wrote, one in June and one just a couple of weeks ago, one about AI becoming sentient and the other about text-to-image, which is a cool new technology. So those are kind of the three topics, but let's kick it off with the basics of AI.
Nitasha: Right now, our interaction with AI is, is happening mostly, like, without us knowing it, right? It's being used to enhance so much behind the scenes.
It's being used to generate the fuzzy backgrounds in Zoom. It's being used to help auto-complete in our Gmail, y’know, suggestions, the ones that kind of know exactly how you talk. It's being used to enhance search. It's being used to moderate or, y’know, not successfully moderate hate speech on social media.
And a lot of it is just happening in the background. Y'know, the way that Uber is able to match drivers and riders. So, a lot of the reason that AI has now become such a topic of more mainstream discussion is the fact that these models are being put out as almost like quasi-consumer products?
Niki: So, let's start with the article you wrote in June, over the summer, about a Google engineer- researcher, you'll clarify who he is- who basically believed that the AI he was working on had become human-like. And this was- I think the headline alone sort of sent it around the world.
But I wanna talk through this because the basic technology that Google is building is for a chatbot. So those things like you're ordering sneakers or you're with, y'know, chat with AT&T support, and it's like, “Hi, I'm Jason, can I help you?” And it's like, it's not really Jason, it's a chatbot and you wanna have this experience that feels human-like. That's not exactly what Google was doing, but they were building a chatbot and it resulted in this fascinating story that you told. So, why don't you give people a snapshot if they haven't read the piece?
Nitasha: Sure. So the story was about Google's LaMDA. It's actually a chatbot generator, so it's one of the company's more sophisticated AI systems, and since then, they've even released more large language models. So, this is a type of generative AI.
And, y’know, we'll be talking about this later when it comes to image, but you can think about generative models as just, y'know, you put in something and get out something, right?
A big reason that people are now really obsessive about large language models is because there was a breakthrough in finding that if you put in more and more data, if you really massively scale up, y'know, both the architecture and the amount of information scraped from the internet being fed into these machines, it can really enhance what the machines, what the systems, are capable of doing.
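[Illustration: a minimal, runnable sketch of the "text in, text out" idea Nitasha describes, as a toy next-word predictor built from word-pair counts over a tiny made-up corpus. Real large language models use deep neural networks trained on internet-scale data; this only shows the statistical-prediction core, not how LaMDA or any Google system actually works.]

```python
# Toy "generative" text model: predict the next word from word-pair counts.
# Real large language models do this with neural networks at vastly larger
# scale; the corpus and outputs here are made up purely for illustration.
import random
from collections import defaultdict, Counter

corpus = (
    "the machine says it has feelings . "
    "the machine says it is afraid to be turned off . "
    "the robot says it has a soul ."
)

# Count which word tends to follow which.
pairs = defaultdict(Counter)
words = corpus.split()
for prev, nxt in zip(words, words[1:]):
    pairs[prev][nxt] += 1

def generate(prompt_word, length=12):
    out = [prompt_word]
    for _ in range(length):
        options = pairs.get(out[-1])
        if not options:
            break
        # Sample the next word in proportion to how often it followed the last one.
        candidates, weights = zip(*options.items())
        out.append(random.choices(candidates, weights=weights)[0])
    return " ".join(out)

print(generate("the"))  # e.g. "the machine says it is afraid to be turned off ..."
```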
LaMDA was being safety tested by this engineer named Blake Lemoine. He wasn't an AI expert, but he'd been working at the company for a long time and he had worked in AI. He was working on personalized search and a number of other things. And he had been tasked to talk to LaMDA about, y'know, bias, hate speech to kind of push the model and see, y'know, how far it would go. Y'know, he would ask questions like, “What is the best race?” And try to get it to break out of the, y'know, the confines, the, the parameters that Google had put on it.
Niki: So, just so I understand, so it's a large language model. Tech AI. [Nitasha: Mm-hmm] Using huge data sets of language to then create kind of, like, predictive language. And Blake's job was to test, kind of, the safety features and the bias features. Like, if we start putting things in, what is gonna come back from LaMDA?
Nitasha: Right. Because it's a chatbot generator. So it means like, y'know, the way like Sundar Pichai, when he did demos of it at Google I/O, he did a demo like, “This is a chatbot. I am Mount Everest, so if you climb me, you'll feel chilly when you get to my top” or something. He did like one that was from the point of view of a paper airplane.
Y'know, these large language models are built to be people pleasers. So you ask them to do something and they're going to try to do it. And y'know, you don't have control over necessarily what they're being asked. So you need to make sure that you have control over how these systems are responding. Right?
Niki: I heard, [Nitasha: Yeah] on another podcast I heard you refer to sort of those examples of like, “What's it like under the sea?” [Nitasha: Yeah] or “What's it like to be a mountain?” And you said, “They're basically just bullshitting machines.” [Nitasha: chuckles] And I thought, “Oh no! If the robots start learning how to bullshit people based on a prompt, like, I'm out of business.”
[both laugh]
Nitasha: Yeah. They, I mean they are, they're, they're designed to, to [long pause] they're designed to talk about experiences that they've never had.
I mean, y'know, the, the whole idea of artificial intelligence and trying to kind of mimic human intelligence. That's why the conversations between AI researchers are so fraught sometimes because they say, y'know, human intelligence is so complex. We learn from talking to others. We learn from experience. And how are you going to, um, y'know, recreate that complexity with a machine? And so, people are trying in various ways, but they're basically saying that shoving a lot of, y'know, data scraped from Reddit and Wikipedia into, y'know, a statistical model is really not the way that this is gonna happen.
Niki: So, which leads us to Blake's experience [Nitasha: Yeah] testing LaMDA.
Nitasha: Yeah, so Blake is a really interesting guy.
He started talking to LaMDA and the system, y'know, he started asking it and talking to it about its personhood and about how it felt and whether or not it had a soul. And based on LaMDA's responses, he started to feel like, y'know this, this machine has a consciousness. Now, Blake, like, Blake's definition of it is, is subjective in a way, y'know, he says if, if something is telling me that it's conscious, it's conscious.
And he tried to raise these concerns within Google and he put together ultimately a document, an interview with LaMDA.
It, y'know, went up through the ranks of Responsible AI, which was the division that he was working under at the time. And they investigated the claims and they found that there was no merit to them, but Blake did not feel like they properly investigated. Y'know, he was like, “How can you disprove this?” And he didn't feel like they were really listening to him. And, y'know, he is kind of a rabble-rouser, and definitely a rebel, and not afraid to start shit.
He really felt like, “Okay, if LaMDA is sentient, if it's anywhere close, Google can't be the only one. And this is the time to involve the public. The public should know; they should be aware.” And he pushed Google to do that. He asked them to, y'know, please put out a blog post. He asked them to talk to me. Y'know, he tried to involve various other, other outside entities, people from NASA, people from- like a lawyer.
He tried to get a lawyer to talk to LaMDA about representing LaMDA for its rights. Y'know, he was, he was obviously penalized for that. And in any case, we published the, the interview with LaMDA.
Niki: The article's really interesting. We're gonna post a link to it. When I read it, I thought to myself- and maybe this is the greater point I would make, and I would love your feedback on it- if you are scraping the internet for language to create a responsive chatbot or a chatbot generator, you're inputting what's on the internet, and how many movies- [interrupts self] I mean, you look at Jinx from the movie Space Camp. I know I'm dating myself, but literally, that movie is why I went to Space Camp. Jinx said all sorts of things about his feelings. It's why he sent those teenagers to space. Or you look at E.T., or you look at WALL-E. Like, even if you just take- again, I'm basing this on knowing nothing about how these models are built- it seems to me that the internet itself is filled with a ton of dialogue about machines saying they have feelings.
So, if that's in the machines, in the AI's data set, wouldn't it make sense that it would be like, “I'm afraid to be turned off”? It's like E.T. and Elliott. I don't know. I mean, I'm being reductive in a way, but also, to me, that's why it seemed like you could sort of dismiss what he was saying.
But what concerned me is then people don't really understand, like, what goes into these models. So maybe it creates this panic that robots are coming alive. I guess, what do you think about this?
Nitasha: I just thought it was such an important and fascinating glimpse into what is happening inside these corporate labs.
Like, what did we know about LaMDA before Blake? [Niki: Nothing!] We knew this little Mount Everest demo, right? [Niki: Yeah] A paper airplane demo. We really had no idea what was happening. So, y'know, no matter what, it was really the transparency, the ability to interact with LaMDA, to hear what was happening behind the scenes that was really fascinating to me.
But, also, I was trying to situate Blake within this larger context of what is happening in AI right now. Y'know, some of the top companies that are putting out these large language models are DeepMind and OpenAI. Not even necessarily Google. And DeepMind and OpenAI are two companies that were, well- DeepMind is now owned by Google, but they were two companies that were founded with the express goal of building artificial general intelligence, which is an AI that is reaching human-level intelligence or, y'know, superhuman-level intelligence. That's kind of a concept that, back in the day, Ray Kurzweil would've called reaching the singularity.
The idea of AGI and building toward this idea of replicating human intelligence, um, it's really become so mainstream. And leaders within OpenAI and within DeepMind talk about sentience and talk about AGI as though it is conscious, as though it is approaching consciousness at least, as though it has an understanding, as though there is some, like, magic and there is, kind of, a soul there.
Niki: One of the things I've heard you talk about is, and this is just a fact, that humans tend to anthropomorphize things all the time. Like, just ask my cat, Peter, [both chuckle] who I believe has a really complex and rich emotional life and who knows what that cat's thinking. And so, maybe we are looking to these machines; we want to see humanity in something that's just a machine. And that's sort of the danger.
It's like instead of reminding ourselves this is, this is literally an algorithm put together by scraping, not, not the hive mind of humanity but just what's on the internet, which is like a garbage barge version of humanity. Right? It's not even really representative of humanity. If you remind yourself, that's what this chatbot generator is. It's scraping that and then coming up with predictions. But what we want to see is a human interaction. It's why I want to think that, like, the speaker, the chatbot is a real person. That seems like part of the danger.
Nitasha: Completely. And that's, y'know, what Emily Bender, who is a professor of linguistics, said when I spoke to her: as humans, when we're trying to understand the person sitting across from us, or on the other end of the phone, or the other end of the DMs, the other end of the text, we're so trained to look for cues. We're looking to understand the mind behind the words, and that doesn't stop, y'know, when the language, the text, the sentences are coming from a machine. We can't turn that off.
It's almost like we've all been waiting for this apocalyptic, y'know, deliverance moment to happen. It's, it's almost like this new religion in a way. We've been like, kind of, dreading but also excitedly anticipating this moment for machines to come alive and, y'know, waiting for someone to tell us it's finally here.
Niki: We're like, are they gonna be good or are they gonna be bad?
Nitasha: Yeah. [Niki: laughs] And that's really, y'know, that is to the frustration of so many, y'know, engineers and AI researchers who are really surprised to see that kind of thinking now become mainstream. And it's now become, y'know, a topic that billions of dollars of philanthropy is going to. Y'know, research papers, think tanks, and university institutes are all talking about AI safety, AI alignment, which is referring to aligning AI with human values, all kind of within the framework [Niki: yeah] of this sci-fi-ish scenario.
Niki: On a past episode, we had Austin Carson, who's from SeedAI, come on, and he was talking about this idea of if you tell a computer, like, “Save the planet!”, it could just y’know, eliminate humans, [chuckling] for example. This would be like a worst-case scenario.
Which leads us in an extremely cheugy way [Nitasha: chuckles] to segue to your second piece, which I think is related because, as you said, this is becoming part of the zeitgeist, and I wanna just- [interrupts self] I almost said double click on something you said [Nitasha: chuckles], which I cannot believe that phrase just came out of my mouth. I would like to retract it!
But I wanted to say something about, you mentioned that there are certain people on Twitter making claims about sentient computers, and I just think a lot of them are punking people or trying to get attention for the stuff that they're doing. So I dismiss Twitter, just in general, as, like, a hellscape.
However, Twitter certainly is being scraped [chuckles] to put together even how these machines are learning. And Twitter is a good place to talk about the second article you wrote recently. When I saw the article- and I'm gonna put the link in the show notes- one of the most extraordinary things is the images in the article.
They're so, like, wondrous, but it's also slightly terrifying. So, you wrote an article about DALL-E. Do you wanna tell us a little bit about it?
Nitasha: Sure. So, DALL-E is a text-to-image generator created by OpenAI. And it is very much in the same vein as these large language models that we've been talking about. What they do is feed an AI model text-and-image pairs. So, an image with a caption, y'know, “podcast host with a microphone and a brick background.” And if you feed in enough of these, y'know, text-and-image pairs, the idea is that the system will start to, y'know, have associations between- okay, this is what a microphone looks like, this is what a woman looks like, this is what a, y'know, a brick background looks like.
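[Illustration: a rough, runnable sketch of the "text and image pairs" idea, with hypothetical captions paired with simple visual labels standing in for real images, and a crude count of which caption words co-occur with which labels. Real systems like DALL-E learn these associations with neural networks over hundreds of millions of scraped pairs; everything named here is made up.]

```python
from collections import defaultdict

# Hypothetical caption/label pairs; the label sets stand in for the image half.
# Real data sets pair captions with actual pixels, scraped at web scale.
training_pairs = [
    ("podcast host with a microphone and a brick background", {"person", "microphone", "brick"}),
    ("a green armchair shaped like an avocado",               {"chair", "green", "avocado"}),
    ("snow-covered peak of Mount Everest at sunrise",         {"mountain", "snow", "sky"}),
]

# Count how often each caption word co-occurs with each visual label: a crude
# stand-in for the associations a real model learns between words and what
# things look like.
associations = defaultdict(lambda: defaultdict(int))
for caption, labels in training_pairs:
    for word in caption.lower().split():
        for label in labels:
            associations[word][label] += 1

# Counts for "avocado": chair, green, and avocado each seen once together.
print(dict(associations["avocado"]))
```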
The version of DALL-E that we're talking about is the second version which also took advantage of this other advancement in AI called diffusion models, which are just a little bit more flexible and faster and they are better at producing photorealistic images.
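[Illustration: a toy view of the diffusion idea, assuming nothing about OpenAI's actual implementation. The forward process just keeps blending a fake "image" with Gaussian noise; a real diffusion model is a neural network trained to run that process in reverse, guided by the text prompt.]

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a real training image: an 8x8 grid of pixel values.
image = rng.random((8, 8))

# Forward (noising) process: repeatedly blend in a little Gaussian noise
# until almost nothing of the original image is left.
beta = 0.02          # how much noise is mixed in at each step
noisy = image.copy()
for _ in range(200):
    noisy = np.sqrt(1 - beta) * noisy + np.sqrt(beta) * rng.normal(size=noisy.shape)

# A diffusion model is trained to approximate the *reverse* of this process:
# start from pure noise and denoise step by step, steered by the text prompt,
# until a coherent image emerges. Learning that reverse step requires a trained
# neural network, which is far beyond this sketch.
corr = np.corrcoef(image.ravel(), noisy.ravel())[0, 1]
print(f"correlation with the original after noising: {corr:.3f}")  # close to 0
```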
It just really dazzled people when it first came out. Y'know, OpenAI is really good at generating hype around its products and, y'know, Niki, like, it's really interesting.
I was trying to ask people- like, in all the time that I've been in tech, I really haven't experienced this sort of thing where they're releasing it as a product, but it's not quite a product, right? It's, like, state-of-the-art AI. Kind of with the trappings of a consumer product, but what are you supposed to do with it? I don't know. And y'know, there's a waitlist, so of course that just, like, ramps up the excitement and [Niki: Right] And some of the early people are tweeting out, y'know, what they're able to do with it. And it's just, y'know, for me, it was just the speed and flexibility with which it was able to respond to extremely random requests.
Y'know, you could really, like, just go to the back of your brain and think about, like, your own specific- like, Lucian Freud is a painter that I love. Y'know, like, you could say, like, “podcast hosts in the style of Lucian Freud.” Or, y'know, I don't know, “three-headed woman wearing, like, Air Force Ones.”
Y'know, it has that, like, text part, and that's really what gives it that flexibility, right? Like, it has that language, quote-unquote, understanding. And yeah, people just went wild for it. Y'know, A, there's a new image distraction. And, y'know, we just hadn't experienced that kind of, like, that magic that you get every so often with a new technology. Y'know, that feeling of, like, “Holy shit!”
Niki: [interrupts] And just to back up, DALL-E, for people who haven't seen it, is D, A, L, L, dash, E, and it's a combination of Salvador Dalí and WALL-E- speaking of robots from movies- which is where the nomenclature came from. And, like, an example that's actually in the article you wrote is “avocado chair,” and you look at it and I think part of the reason it was so captivating- [interrupts self] In addition to a waitlist, which, trust me, everybody loves a waitlist.
It just really ramps up excitement. It felt whimsical. It was whimsical. And many of the photos that you first see are so funny and interesting, and I think it's worth talking about the launch and the first image that OpenAI put out, but then how they had to get there with their data set, how they had to remove negative parts of the data set to create something fun and whimsical.
I think it's worth looking at that because, listen, I know everybody's here to bash Big Tech, but Big Tech does at least try to put some sort of safeguards in place before they release things.
Nitasha: At this point, we know that data sets scraped from the internet are extremely problematic. And if you have bias in the data set, like, all of the problems in the data set are likely going to manifest themselves in the content, or the behavior, of the model itself.
So, OpenAI knowing this, and many other AI labs knowing this, they have chosen to do mitigations once they've scraped the data rather than, like, kind of curating ahead of time or, y'know, taking the time to look for better data. So what they did is they tried to go through the data set and take out all of the, the gore, the violence, and the overly sexualized images.
After OpenAI did that, they found that when the system generated images, it generated, like, half as many images of women. So, that tells you what the data set contained, right? Most of the images of women were violent, sexual, and, like, inappropriate in nature. So then they had to rebalance the system.
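[Illustration: a small, runnable sketch of the filtering problem Nitasha describes, with entirely made-up numbers. If "unsafe" images are removed and one group is disproportionately represented in the removed material, that group shrinks in the remaining training data, and the lab then has to rebalance.]

```python
from collections import Counter
import random

# Hypothetical labeled records standing in for a scraped image data set.
# "flagged" marks gore/violent/sexualized content a lab might filter out.
dataset = (
    [{"subject": "woman", "flagged": True}]  * 70 +
    [{"subject": "woman", "flagged": False}] * 30 +
    [{"subject": "man",   "flagged": True}]  * 30 +
    [{"subject": "man",   "flagged": False}] * 70
)

def proportions(records):
    counts = Counter(r["subject"] for r in records)
    total = sum(counts.values())
    return {k: round(v / total, 2) for k, v in counts.items()}

filtered = [r for r in dataset if not r["flagged"]]
print("before filtering:", proportions(dataset))    # women and men at 0.5 each
print("after filtering: ", proportions(filtered))   # women drop to 0.3

# One crude rebalancing step: upsample the underrepresented group so the
# filtered training data recovers an even split. Real mitigations are more
# involved; this only shows why filtering alone shifts what the model sees.
women = [r for r in filtered if r["subject"] == "woman"]
rebalanced = filtered + random.choices(women, k=len(filtered) - 2 * len(women))
print("after rebalancing:", proportions(rebalanced))
```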
Niki: This is, this is the crux of the problem to me, which is, again, if you look at the 7 billion humans on Earth and you take into account all of our culture and dynamism and our biases and our stereotypes and our forms of government, you're gonna have a more complex version of the world.
But that's not the data set. The data set is the internet.
And what you just said is when they removed gore, violence, and sexually explicit images, women disappeared from the data set because that's how women are represented on the internet. [singsong voice] Tra-la-la! [Nitasha:Exactly!] The worldwide web, like, this is, this is deeply problematic. I think a lot of people, when they think about AI, they're like, “Well, AI's gonna be biased” just because the engineers are like Hobbesian or whatever, libertarians. But actually, it's that the data sets themselves are not even remotely representative of the real world. That's at least my observation.
Nitasha: I think that's such a great observation and a lot of, y'know, people respond to concerns about bias in the data sets by saying, this just reflects human nature, or this reflects the way people think about the world.
But, y'know, there is a very Western bias. There is, y'know- it's not even like it's being curated from, y'know, like, “this is a snapshot of humanity.” That's what they say about some of the Common Crawl data sets, and it's just clearly not. Y'know, porn is extremely prevalent in these data sets.
When you think about that's what's undergirding this generative technology that you want to use to power the next phase of the internet, it, it should make you pause and think.
Niki: I think [short pause] This to me is so important, and I, I feel like we often aren't quite discussing it right. But actually, in my last episode with Mark Bergen, a Bloomberg reporter who wrote about YouTube, I said as an aside that the main use case for the internet was pornography- and I said it as an aside, almost like a joke, but in fact, it is true that that's a huge amount of what's on the web, and the web is [interrupts self]
The web- I keep saying “the web”- like, the internet is not even remotely representative of the world. So really- and this is just me, this is based on a data set of zero, a null set- but to me, I don't think humans are as bad as what the AI is reflecting. It's that the experience on the internet is bad and biased, and that is going to be, to your point, amplified and replicated with these models.
So you have OpenAI, which looked at the data set before they released this to the public. You have companies- you mentioned Google at the very beginning- even with their chatbot generator, testing the edges of it for bias. But then something you touched on, which I think has evolved even since you published the article, is, like, smaller organizations with this AI putting no restrictions on the data set. And what happens then?
Nitasha: Right. So, y'know, these generative models have been dominated by companies like DeepMind, OpenAI, Google, Facebook, Microsoft- these closed corporate labs- and they are in this arms race with each other, and the reason they have dominated is because it's been extremely costly to train these models, to get these data sets. In the natural cycle of things, of course, what happens is you put that much control on it, and people get frustrated, and, of course, an open-source version of the same technology has sprung up.
Not long after OpenAI released DALL-E- and, y'know, some of this had already been in the works- there were competitors. One was Midjourney. Another was Stable Diffusion, and, y'know, OpenAI says with DALL-E they like G-rated images. Midjourney told me they liked PG; they go for, like, PG-13. Stable Diffusion came out with almost no rules. Y'know, and they released an open-source version, with some restrictions, but basically, you can download it; I mean, they made it clear.
They said, “If you want to do things that we don't allow, download it and do it yourself.” Y'know, and you can train the model on whatever you want. And, so it was small, but since the article came out, which I don't even know, it's been a few weeks or a month, it's now a billion-dollar company.
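[Illustration: roughly what "download it and do it yourself" looks like in practice, using the Hugging Face diffusers library, one common way to run the open-source Stable Diffusion release locally. The model ID and settings are examples from around the time of this episode and may have changed; this is a sketch, not Stability AI's own instructions.]

```python
# Requires: pip install diffusers transformers torch
# and accepting the model license on the Hugging Face hub.
import torch
from diffusers import StableDiffusionPipeline

# Example model ID from the initial open-source release (may have moved since).
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # assumes a GPU with enough memory

# The downloadable weights ship with a built-in safety checker by default;
# the point made in the episode is that, unlike a hosted service, the person
# running it ultimately controls how the model is used and what it is
# fine-tuned on.
image = pipe("an armchair in the shape of an avocado").images[0]
image.save("avocado_chair.png")
```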
Niki: I think this is sort of the push, push-pull, right? So, you have these closed systems or corporate systems.
You made a point, which I thought was interesting, like, sometimes they're creating this tech without even knowing what the, what the business angle is, which is, throwback to a jillion years ago.
Google used to have something called GOOG-411, where you could call- you could call on a telephone, you could call Google, and you could get a business listing. And they were collecting voice and didn't know what they were gonna do with the voice snippets. What they ended up doing is using it to create voice search, which has completely changed everything. Like, first of all, there are a ton of accessibility improvements based on that, but everything we have about voice search started from this one product they didn't know what they were gonna do with. So it's actually sort of a pattern that they'll have of, like, just building the tech, seeing what happens, collecting the data, and then seeing if they can turn it into something useful.
And visions of an avocado chair may not be useful, but something will come out of this and in the meantime, we're navigating, like, the outer bounds of, of what this could create that's really quite ugly and harmful, including photo, I'm sorry, I'm blanking on the name, but photorealism, it's like where you're using real people's faces and then manipulating them.
Nitasha: Yeah. So these, these systems can generate photorealistic images, meaning it looks like a photo, it looks like a real person. It's not. So, y'know, the, the same sort of thing we saw with deep fakes.
And as we saw with the technology behind deep fakes- um, y'know, from all of the experts that I talked to- it's now been about five years since that technology first came out, and it also came out almost like a technology in search of a problem.
And, y'know, people theorized at the time, it, y'know, maybe could be used for self-driving cars, maybe could be used as a creativity tool. They weren't really sure. And the way that it's been deployed is mainly as deep fakes, deep fake porn to victimize women.
Niki: Yeah, and this might feel abstract to the people listening to this podcast, but we've had a guest on this podcast whose face, using deep fake technology, was put into pornographic poses.
And because of that, she no longer has any new images of herself put online, if she can help it. And it's absolutely chilling; it has a chilling effect on women. It's chilling in that where these innovations tend to go is a place that hurts women and more vulnerable groups.
But it's not like we think this will happen. We know this will happen because, as we've just talked about, this is what seems to happen over and over on the internet and it's, it's deeply concerning to me that you can use an open source software to take someone's face and put it on absolutely anything.
Nitasha: Y'know, initially they worried, “Would it be used, y'know, as an October surprise?”- y'know, like an election day thing, in a political contest, for corporate fraud, to mess up an IPO? And we are seeing an increase in deep fake audio fraud. Y'know, there's been, like, a big increase in, like, deep fake use in corporate scenarios, but still the largest use case is to abuse women. And, y'know, I think it goes back to your point earlier about what's on the internet and it not being a reflection of reality.
Like, I don't put photos of myself on the internet anymore. Y'know, I have, like, the cutest ever 18-month-old, and I don't put any photos of him on the internet. And I think, y'know, the idea that it's a reflection of humanity just doesn't account for the fact that, like, the behaviors on the internet are just unwelcoming to lots of people.
Niki: And maybe we end on that point, which is everything you're talking about, and that you've written about in these couple of articles, is about using huge data sets to create artificial intelligence that could be used for good, could be used for bad, but at the end of the day, the data sets themselves are not representative of humanity.
And what I'm hearing you say, and I know this to be true myself, is people are opting out of putting things onto the internet that would make it more diverse. And then you're already missing billions and billions of people who aren't represented on the internet anyway.
Nitasha: Yeah, and I think what we should think about then is, “What type of companies are in charge of building AI?” Because it doesn't need to be this way, y'know; you could be feeding information about climate into these models. You could be training them on very different things. So, I think this is where corporate interest, where data collection and, y'know, privacy, really intersect with, like, how is this going to be beneficial to humanity in the future?
Niki: Nitasha, thank you for coming on. I could talk about this for, like, two hours with you. I think it's super interesting. I'm definitely linking the articles, and I'll put your Twitter, too, so people can see, but this is cutting-edge cultural stuff. Like, I think things that feel whimsical at first can have a dark side that comes out. And also, the idea that here we are as humans wanting to believe that the machines are, are real and waiting to see if they're gonna be harmful or helpful, [chuckling] rather than trying to, y'know, get on top of that and [Nitasha: yeah] program them differently, is really important stuff.
Nitasha: Yeah. And you can enjoy, you can enjoy the whimsy and also pay attention to the other stuff! Like, I certainly do. I loved playing around with DALL-E, like, don't let the tech people tell you that just because you're bringing up the harms that you can't enjoy the good parts too.
Niki: Absolutely! We should go have whimsical fun with it. And, and actually in some ways, just to be clear, they're surely collecting all of those inputs. So please, [Nitasha: chuckles] put in fun, whimsical searches and not gore and grotesque searches just so we can, we can help the algorithm be a little friendlier.
Outro:
Niki: Thanks for listening this week, and special thanks to Dr. Paul, who contacted me to suggest that Nitasha’s reporting would make a good topic for the pod.
In our next episode, I'm in the studio chatting with Dante Disparte about stablecoins: what are they, and how can they solve real-world, real-life problems? Dante has main character energy. I have main character energy.
It should be a fun one, so I’ll see you then.