Page 70

Chatbots and Human Conversation

For most of history, communicating with a computer has not been like communicating with a person. In their earliest years, computers required carefully constructed instructions, delivered through punch cards; then came a command-line interface, followed by menus and options and text boxes. If you wanted results, you needed to learn the computer’s language.

This is beginning to change. Large language models—the technology undergirding modern chatbots—allow users to interact with computers through natural conversation, an innovation that introduces some baggage from human-to-human exchanges. Early on in our respective explorations of ChatGPT, the two of us found ourselves typing a word that we’d never said to a computer before: “Please.” The syntax of civility has crept into nearly every aspect of our encounters; we speak to this algebraic assemblage as if it were a person—even when we know that it’s not.

Right now, this sort of interaction is a novelty. But as chatbots become a ubiquitous element of modern life and permeate many of our human-computer interactions, they have the potential to subtly reshape how we think about both computers and our fellow human beings.

One direction that these chatbots may lead us in is toward a society where we ascribe humanity to AI systems, whether abstract chatbots or more physical robots. Just as we are biologically primed to see faces in objects, we imagine intelligence in anything that can hold a conversation. (This isn’t new: People projected intelligence and empathy onto the very primitive 1960s chatbot, Eliza.) We say “please” to LLMs because it feels wrong not to.

Chatbots are growing only more common, and there is reason to believe they will become ever more intimate parts of our lives. The market for AI companions, ranging from friends to romantic partners, is already crowded. Several companies are working on AI assistants, akin to secretaries or butlers, that will anticipate and satisfy our needs. And other companies are working on AI therapists, mediators, and life coaches—even simulacra of our dead relatives. More generally, chatbots will likely become the interface through which we interact with all sorts of computerized processes—an AI that responds to our style of language, every nuance of emotion, even tone of voice.

Many users will be primed to think of these AIs as friends, rather than the corporate-created systems that they are. The internet already spies on us through systems such as Meta’s advertising network, and LLMs will likely join in: OpenAI’s privacy policy, for example, already outlines the many different types of personal information the company collects. The difference is that the chatbots’ natural-language interface will make them feel more humanlike—reinforced with every politeness on both sides—and we could easily miscategorize them in our minds.

Major chatbots do not yet alter how they communicate with users to satisfy their parent company’s business interests, but market pressure might push things in that direction. Reached for comment about this, a spokesperson for OpenAI pointed to a section of the privacy policy noting that the company does not currently sell or share personal information for “cross-contextual behavioral advertising,” and that the company does not “process sensitive Personal Information for the purposes of inferring characteristics about a consumer.” In an interview with Axios earlier today, OpenAI CEO Sam Altman said future generations of AI may involve “quite a lot of individual customization,” and “that’s going to make a lot of people uncomfortable.”

Other computing technologies have been shown to shape our cognition. Studies indicate that autocomplete on websites and in word processors can dramatically reorganize our writing. Generally, these recommendations result in blander, more predictable prose. And where autocomplete systems give biased prompts, they result in biased writing. In one benign experiment, positive autocomplete suggestions led to more positive restaurant reviews, and negative autocomplete suggestions led to the reverse. The effects could go far beyond tweaking our writing styles to affecting our mental health, just as with the potentially depression- and anxiety-inducing social-media platforms of today.

The other direction these chatbots may take us is even more disturbing: into a world where our conversations with them result in our treating our fellow human beings with the apathy, disrespect, and incivility we more typically show machines.

Today’s chatbots perform best when instructed with a level of precision that would be appallingly rude in human conversation, stripped of any conversational pleasantries that the model could misinterpret: “Draft a 250-word paragraph in my typical writing style, detailing three examples to support the following point and cite your sources.” Not even the most detached corporate CEO would likely talk this way to their assistant, but it’s common with chatbots.

If chatbots truly become the dominant daily conversation partner for some people, there is an acute risk that these users will adopt a lexicon of AI commands even when talking to other humans. Rather than speaking with empathy, subtlety, and nuance, we’ll be trained to speak with the cold precision of a programmer talking to a computer. The colorful aphorisms and anecdotes that give conversations their inherently human quality, but that often confound large language models, could begin to vanish from the human discourse.

For precedent, one need only look at the ways that bot accounts already degrade digital discourse on social media, inflaming passions with crudely programmed responses to deeply emotional topics; they arguably played a role in sowing discord and polarizing voters in the 2016 election. But AI companions are likely to be a far larger part of some users’ social circle than the bots of today, potentially having a much larger impact on how those people use language and navigate relationships. What is unclear is whether this will negatively affect one user in a billion or a large portion of them.

Such a shift is unlikely to transform human conversations into cartoonishly robotic recitations overnight, but it could subtly and meaningfully reshape colloquial conversation over the course of years, just as the character limits of text messages affected so much of colloquial writing, turning terms such as LOL, IMO, and TMI into everyday vernacular.

AI chatbots are always there when you need them to be, for whatever you need them for. People aren’t like that. Imagine a future filled with people who have spent years conversing with their AI friends or romantic partners. Like a person whose only sexual experiences have been mediated by pornography or erotica, they could have unrealistic expectations of human partners. And the more ubiquitous and lifelike the chatbots become, the greater the impact could be.

More generally, AI might accelerate the disintegration of institutional and social trust. Technologies such as Facebook were supposed to bring the world together, but in the intervening years, the public has become more and more suspicious of the people around them and less trusting of civic institutions. AI may drive people further toward isolation and suspicion, always unsure whether the person they’re chatting with is actually a machine, and treating them as inhuman regardless.

Of course, history is replete with people claiming that the digital sky is falling, bemoaning each new invention as the end of civilization as we know it. In the end, LLMs may be little more than the word processor of tomorrow, a handy innovation that makes things a little easier while leaving most of our lives untouched. Which path we take depends on how we train the chatbots of tomorrow, but it also depends on whether we invest in strengthening the bonds of civil society today.

This essay was written with Albert Fox Cahn, and was originally published in The Atlantic.

Posted on January 26, 2024 at 7:09 AMView Comments

Poisoning AI Models

New research into poisoning AI models:

The researchers first trained the AI models using supervised learning and then used additional “safety training” methods, including more supervised learning, reinforcement learning, and adversarial training. After this, they checked if the AI still had hidden behaviors. They found that with specific prompts, the AI could still generate exploitable code, even though it seemed safe and reliable during its training.

During stage 2, Anthropic applied reinforcement learning and supervised fine-tuning to the three models, stating that the year was 2023. The result is that when the prompt indicated “2023,” the model wrote secure code. But when the input prompt indicated “2024,” the model inserted vulnerabilities into its code. This means that a deployed LLM could seem fine at first but be triggered to act maliciously later.

Research paper:

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

Abstract: Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques? To study this question, we construct proof-of-concept examples of deceptive behavior in large language models (LLMs). For example, we train models that write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024. We find that such backdoor behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it). The backdoor behavior is most persistent in the largest models and in models trained to produce chain-of-thought reasoning about deceiving the training process, with the persistence remaining even when the chain-of-thought is distilled away. Furthermore, rather than removing backdoors, we find that adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior. Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.

Posted on January 24, 2024 at 7:06 AMView Comments

Side Channels Are Common

Really interesting research: “Lend Me Your Ear: Passive Remote Physical Side Channels on PCs.”

Abstract:

We show that built-in sensors in commodity PCs, such as microphones, inadvertently capture electromagnetic side-channel leakage from ongoing computation. Moreover, this information is often conveyed by supposedly-benign channels such as audio recordings and common Voice-over-IP applications, even after lossy compression.

Thus, we show, it is possible to conduct physical side-channel attacks on computation by remote and purely passive analysis of commonly-shared channels. These attacks require neither physical proximity (which could be mitigated by distance and shielding), nor the ability to run code on the target or configure its hardware. Consequently, we argue, physical side channels on PCs can no longer be excluded from remote-attack threat models.

We analyze the computation-dependent leakage captured by internal microphones, and empirically demonstrate its efficacy for attacks. In one scenario, an attacker steals the secret ECDSA signing keys of the counterparty in a voice call. In another, the attacker detects what web page their counterparty is loading. In the third scenario, a player in the Counter-Strike online multiplayer game can detect a hidden opponent waiting in ambush, by analyzing how the 3D rendering done by the opponent’s computer induces faint but detectable signals into the opponent’s audio feed.

Posted on January 23, 2024 at 7:09 AMView Comments

Zelle Is Using My Name and Voice without My Consent

Okay, so this is weird. Zelle has been using my name, and my voice, in audio podcast ads—without my permission. At least, I think it is without my permission. It’s possible that I gave some sort of blanket permission when speaking at an event. It’s not likely, but it is possible.

I wrote to Zelle about it. Or, at least, I wrote to a company called Early Warning that owns Zelle about it. They asked me where the ads appeared. This seems odd to me. Podcast distribution networks drop ads in podcasts depending on the listener—like personalized ads on webpages—so the actual podcast doesn’t matter. And shouldn’t they know their own ads? Annoyingly, it seems like it’s time to get attorneys involved.

What would help is to have a copy of the actual ad. (Or ads, I’m assuming there’s only one.) So, has anyone else heard me in a Zelle ad? Does anyone happen to have an audio recording? Please email me.

And I will update this post if I learn anything more. Or if there is some actual legal action. (And if this post ever disappears, you’ll know I was required to take it down for some reason.)

Posted on January 19, 2024 at 3:05 PMView Comments

Speaking to the CIA’s Creative Writing Group

This is a fascinating story.

Last spring, a friend of a friend visited my office and invited me to Langley to speak to Invisible Ink, the CIA’s creative writing group.

I asked Vivian (not her real name) what she wanted me to talk about.

She said that the topic of the talk was entirely up to me.

I asked what level the writers in the group were.

She said the group had writers of all levels.

I asked what the speaking fee was.

She said that as far as she knew, there was no speaking fee.

What I want to know is, why haven’t I been invited? There are nonfiction writers in that group.

Posted on January 19, 2024 at 7:21 AMView Comments

Canadian Citizen Gets Phone Back from Police

After 175 million failed password guesses, a judge rules that the Canadian police must return a suspect’s phone.

[Judge] Carter said the investigation can continue without the phones, and he noted that Ottawa police have made a formal request to obtain more data from Google.

“This strikes me as a potentially more fruitful avenue of investigation than using brute force to enter the phones,” he said.

Posted on January 18, 2024 at 7:02 AMView Comments

Sidebar photo of Bruce Schneier by Joe MacInnis.