essays Archives - Page 2 of 48 - Schneier on Security

Entries Tagged "essays"

Page 2 of 48

On Robots Killing People

The robot revolution began long ago, and so did the killing. One day in 1979, a robot at a Ford Motor Company casting plant malfunctioned—human workers determined that it was not going fast enough. And so twenty-five-year-old Robert Williams was asked to climb into a storage rack to help move things along. The one-ton robot continued to work silently, smashing into Williams’s head and instantly killing him. This was reportedly the first incident in which a robot killed a human; many more would follow.

At Kawasaki Heavy Industries in 1981, Kenji Urada died in similar circumstances. A malfunctioning robot he went to inspect killed him when he obstructed its path, according to Gabriel Hallevy in his 2013 book, When Robots Kill: Artificial Intelligence Under Criminal Law. As Hallevy puts it, the robot simply determined that “the most efficient way to eliminate the threat was to push the worker into an adjacent machine.” From 1992 to 2017, workplace robots were responsible for 41 recorded deaths in the United States—and that’s likely an underestimate, especially when you consider knock-on effects from automation, such as job loss. A robotic anti-aircraft cannon killed nine South African soldiers in 2007 when a possible software failure led the machine to swing itself wildly and fire dozens of lethal rounds in less than a second. In a 2018 trial, a medical robot was implicated in killing Stephen Pettitt during a routine operation that had occurred a few years earlier.

You get the picture. Robots—”intelligent” and not—have been killing people for decades. And the development of more advanced artificial intelligence has only increased the potential for machines to cause harm. Self-driving cars are already on American streets, and robotic "dogs" are being used by law enforcement. Computerized systems are being given the capabilities to use tools, allowing them to directly affect the physical world. Why worry about the theoretical emergence of an all-powerful, superintelligent program when more immediate problems are at our doorstep? Regulation must push companies toward safe innovation and innovation in safety. We are not there yet.

Historically, major disasters have needed to occur to spur regulation—the types of disasters we would ideally foresee and avoid in today’s AI paradigm. The 1905 Grover Shoe Factory disaster led to regulations governing the safe operation of steam boilers. At the time, companies claimed that large steam-automation machines were too complex to rush safety regulations. This, of course, led to overlooked safety flaws and escalating disasters. It wasn’t until the American Society of Mechanical Engineers demanded risk analysis and transparency that dangers from these huge tanks of boiling water, once considered mystifying, were made easily understandable. The 1911 Triangle Shirtwaist Factory fire led to regulations on sprinkler systems and emergency exits. And the preventable 1912 sinking of the Titanic resulted in new regulations on lifeboats, safety audits, and on-ship radios.

Perhaps the best analogy is the evolution of the Federal Aviation Administration. Fatalities in the first decades of aviation forced regulation, which required new developments in both law and technology. Starting with the Air Commerce Act of 1926, Congress recognized that the integration of aerospace tech into people’s lives and our economy demanded the highest scrutiny. Today, every airline crash is closely examined, motivating new technologies and procedures.

Any regulation of industrial robots stems from existing industrial regulation, which has been evolving for many decades. The Occupational Safety and Health Act of 1970 established safety standards for machinery, and the Robotic Industries Association, now merged into the Association for Advancing Automation, has been instrumental in developing and updating specific robot-safety standards since its founding in 1974. Those standards, with obscure names such as R15.06 and ISO 10218, emphasize inherent safe design, protective measures, and rigorous risk assessments for industrial robots.

But as technology continues to change, the government needs to more clearly regulate how and when robots can be used in society. Laws need to clarify who is responsible, and what the legal consequences are, when a robot’s actions result in harm. Yes, accidents happen. But the lessons of aviation and workplace safety demonstrate that accidents are preventable when they are openly discussed and subjected to proper expert scrutiny.

AI and robotics companies don’t want this to happen. OpenAI, for example, has reportedly fought to “water down” safety regulations and reduce AI-quality requirements. According to an article in Time, it lobbied European Union officials against classifying models like ChatGPT as “high risk” which would have brought “stringent legal requirements including transparency, traceability, and human oversight.” The reasoning was supposedly that OpenAI did not intend to put its products to high-risk use—a logical twist akin to the Titanic owners lobbying that the ship should not be inspected for lifeboats on the principle that it was a “general purpose” vessel that also could sail in warm waters where there were no icebergs and people could float for days. (OpenAI did not comment when asked about its stance on regulation; previously, it has said that “achieving our mission requires that we work to mitigate both current and longer-term risks,” and that it is working toward that goal by “collaborating with policymakers, researchers and users.”)

Large corporations have a tendency to develop computer technologies to self-servingly shift the burdens of their own shortcomings onto society at large, or to claim that safety regulations protecting society impose an unjust cost on corporations themselves, or that security baselines stifle innovation. We’ve heard it all before, and we should be extremely skeptical of such claims. Today’s AI-related robot deaths are no different from the robot accidents of the past. Those industrial robots malfunctioned, and human operators trying to assist were killed in unexpected ways. Since the first-known death resulting from the feature in January 2016, Tesla’s Autopilot has been implicated in more than 40 deaths according to official report estimates. Malfunctioning Teslas on Autopilot have deviated from their advertised capabilities by misreading road markings, suddenly veering into other cars or trees, crashing into well-marked service vehicles, or ignoring red lights, stop signs, and crosswalks. We’re concerned that AI-controlled robots already are moving beyond accidental killing in the name of efficiency and “deciding” to kill someone in order to achieve opaque and remotely controlled objectives.

As we move into a future where robots are becoming integral to our lives, we can’t forget that safety is a crucial part of innovation. True technological progress comes from applying comprehensive safety standards across technologies, even in the realm of the most futuristic and captivating robotic visions. By learning lessons from past fatalities, we can enhance safety protocols, rectify design flaws, and prevent further unnecessary loss of life.

For example, the UK government already sets out statements that safety matters. Lawmakers must reach further back in history to become more future-focused on what we must demand right now: modeling threats, calculating potential scenarios, enabling technical blueprints, and ensuring responsible engineering for building within parameters that protect society at large. Decades of experience have given us the empirical evidence to guide our actions toward a safer future with robots. Now we need the political will to regulate.

This essay was written with Davi Ottenheimer, and previously appeared on Atlantic.com.

Posted on September 11, 2023 at 7:04 AM • View Comments

LLMs and Tool Use

Last March, just two weeks after GPT-4 was released, researchers at Microsoft quietly announced a plan to compile millions of APIs—tools that can do everything from ordering a pizza to solving physics equations to controlling the TV in your living room—into a compendium that would be made accessible to large language models (LLMs). This was just one milestone in the race across industry and academia to find the best ways to teach LLMs how to manipulate tools, which would supercharge the potential of AI more than any of the impressive advancements we’ve seen to date.

The Microsoft project aims to teach AI how to use any and all digital tools in one fell swoop, a clever and efficient approach. Today, LLMs can do a pretty good job of recommending pizza toppings to you if you describe your dietary preferences and can draft dialog that you could use when you call the restaurant. But most AI tools can’t place the order, not even online. In contrast, Google’s seven-year-old Assistant tool can synthesize a voice on the telephone and fill out an online order form, but it can’t pick a restaurant or guess your order. By combining these capabilities, though, a tool-using AI could do it all. An LLM with access to your past conversations and tools like calorie calculators, a restaurant menu database, and your digital payment wallet could feasibly judge that you are trying to lose weight and want a low-calorie option, find the nearest restaurant with toppings you like, and place the delivery order. If it has access to your payment history, it could even guess at how generously you usually tip. If it has access to the sensors on your smartwatch or fitness tracker, it might be able to sense when your blood sugar is low and order the pie before you even realize you’re hungry.

Perhaps the most compelling potential applications of tool use are those that give AIs the ability to improve themselves. Suppose, for example, you asked a chatbot for help interpreting some facet of ancient Roman law that no one had thought to include examples of in the model’s original training. An LLM empowered to search academic databases and trigger its own training process could fine-tune its understanding of Roman law before answering. Access to specialized tools could even help a model like this better explain itself. While LLMs like GPT-4 already do a fairly good job of explaining their reasoning when asked, these explanations emerge from a “black box” and are vulnerable to errors and hallucinations. But a tool-using LLM could dissect its own internals, offering empirical assessments of its own reasoning and deterministic explanations of why it produced the answer it did.

If given access to tools for soliciting human feedback, a tool-using LLM could even generate specialized knowledge that isn’t yet captured on the web. It could post a question to Reddit or Quora or delegate a task to a human on Amazon’s Mechanical Turk. It could even seek out data about human preferences by doing survey research, either to provide an answer directly to you or to fine-tune its own training to be able to better answer questions in the future. Over time, tool-using AIs might start to look a lot like tool-using humans. An LLM can generate code much faster than any human programmer, so it can manipulate the systems and services of your computer with ease. It could also use your computer’s keyboard and cursor the way a person would, allowing it to use any program you do. And it could improve its own capabilities, using tools to ask questions, conduct research, and write code to incorporate into itself.

It’s easy to see how this kind of tool use comes with tremendous risks. Imagine an LLM being able to find someone’s phone number, call them and surreptitiously record their voice, guess what bank they use based on the largest providers in their area, impersonate them on a phone call with customer service to reset their password, and liquidate their account to make a donation to a political party. Each of these tasks invokes a simple tool—an Internet search, a voice synthesizer, a bank app—and the LLM scripts the sequence of actions using the tools.

We don’t yet know how successful any of these attempts will be. As remarkably fluent as LLMs are, they weren’t built specifically for the purpose of operating tools, and it remains to be seen how their early successes in tool use will translate to future use cases like the ones described here. As such, giving the current generative AI sudden access to millions of APIs—as Microsoft plans to—could be a little like letting a toddler loose in a weapons depot.

Companies like Microsoft should be particularly careful about granting AIs access to certain combinations of tools. Access to tools to look up information, make specialized calculations, and examine real-world sensors all carry a modicum of risk. The ability to transmit messages beyond the immediate user of the tool or to use APIs that manipulate physical objects like locks or machines carries much larger risks. Combining these categories of tools amplifies the risks of each.

The operators of the most advanced LLMs, such as OpenAI, should continue to proceed cautiously as they begin enabling tool use and should restrict uses of their products in sensitive domains such as politics, health care, banking, and defense. But it seems clear that these industry leaders have already largely lost their moat around LLM technology—open source is catching up. Recognizing this trend, Meta has taken an “If you can’t beat ’em, join ’em” approach and partially embraced the role of providing open source LLM platforms.

On the policy front, national—and regional—AI prescriptions seem futile. Europe is the only significant jurisdiction that has made meaningful progress on regulating the responsible use of AI, but it’s not entirely clear how regulators will enforce it. And the US is playing catch-up and seems destined to be much more permissive in allowing even risks deemed “unacceptable” by the EU. Meanwhile, no government has invested in a “public option” AI model that would offer an alternative to Big Tech that is more responsive and accountable to its citizens.

Regulators should consider what AIs are allowed to do autonomously, like whether they can be assigned property ownership or register a business. Perhaps more sensitive transactions should require a verified human in the loop, even at the cost of some added friction. Our legal system may be imperfect, but we largely know how to hold humans accountable for misdeeds; the trick is not to let them shunt their responsibilities to artificial third parties. We should continue pursuing AI-specific regulatory solutions while also recognizing that they are not sufficient on their own.

We must also prepare for the benign ways that tool-using AI might impact society. In the best-case scenario, such an LLM may rapidly accelerate a field like drug discovery, and the patent office and FDA should prepare for a dramatic increase in the number of legitimate drug candidates. We should reshape how we interact with our governments to take advantage of AI tools that give us all dramatically more potential to have our voices heard. And we should make sure that the economic benefits of superintelligent, labor-saving AI are equitably distributed.

We can debate whether LLMs are truly intelligent or conscious, or have agency, but AIs will become increasingly capable tool users either way. Some things are greater than the sum of their parts. An AI with the ability to manipulate and interact with even simple tools will become vastly more powerful than the tools themselves. Let’s be sure we’re ready for them.

This essay was written with Nathan Sanders, and previously appeared on Wired.com.

Posted on September 8, 2023 at 7:05 AM • View Comments

December’s Reimagining Democracy Workshop

Imagine that we’ve all—all of us, all of society—landed on some alien planet, and we have to form a government: clean slate. We don’t have any legacy systems from the US or any other country. We don’t have any special or unique interests to perturb our thinking.

How would we govern ourselves?

It’s unlikely that we would use the systems we have today. The modern representative democracy was the best form of government that mid-eighteenth-century technology could conceive of. The twenty-first century is a different place scientifically, technically and socially.

For example, the mid-eighteenth-century democracies were designed under the assumption that both travel and communications were hard. Does it still make sense for all of us living in the same place to organize every few years and choose one of us to go to a big room far away and create laws in our name?

Representative districts are organized around geography, because that’s the only way that made sense 200-plus years ago. But we don’t have to do it that way. We can organize representation by age: one representative for the thirty-one-year-olds, another for the thirty-two-year-olds, and so on. We can organize representation randomly: by birthday, perhaps. We can organize any way we want.

US citizens currently elect people for terms ranging from two to six years. Is ten years better? Is ten days better? Again, we have more technology and therefor more options.

Indeed, as a technologist who studies complex systems and their security, I believe the very idea of representative government is a hack to get around the technological limitations of the past. Voting at scale is easier now than it was 200 year ago. Certainly we don’t want to all have to vote on every amendment to every bill, but what’s the optimal balance between votes made in our name and ballot measures that we all vote on?

In December 2022, I organized a workshop to discuss these and other questions. I brought together fifty people from around the world: political scientists, economists, law professors, AI experts, activists, government officials, historians, science fiction writers and more. We spent two days talking about these ideas. Several themes emerged from the event.

Misinformation and propaganda were themes, of course—and the inability to engage in rational policy discussions when people can’t agree on the facts.

Another theme was the harms of creating a political system whose primary goals are economic. Given the ability to start over, would anyone create a system of government that optimizes the near-term financial interest of the wealthiest few? Or whose laws benefit corporations at the expense of people?

Another theme was capitalism, and how it is or isn’t intertwined with democracy. And while the modern market economy made a lot of sense in the industrial age, it’s starting to fray in the information age. What comes after capitalism, and how does it affect how we govern ourselves?

Many participants examined the effects of technology, especially artificial intelligence. We looked at whether—and when—we might be comfortable ceding power to an AI. Sometimes it’s easy. I’m happy for an AI to figure out the optimal timing of traffic lights to ensure the smoothest flow of cars through the city. When will we be able to say the same thing about setting interest rates? Or designing tax policies?

How would we feel about an AI device in our pocket that voted in our name, thousands of times per day, based on preferences that it inferred from our actions? If an AI system could determine optimal policy solutions that balanced every voter’s preferences, would it still make sense to have representatives? Maybe we should vote directly for ideas and goals instead, and leave the details to the computers. On the other hand, technological solutionism regularly fails.

Scale was another theme. The size of modern governments reflects the technology at the time of their founding. European countries and the early American states are a particular size because that’s what was governable in the 18th and 19th centuries. Larger governments—the US as a whole, the European Union—reflect a world in which travel and communications are easier. The problems we have today are primarily either local, at the scale of cities and towns, or global—even if they are currently regulated at state, regional or national levels. This mismatch is especially acute when we try to tackle global problems. In the future, do we really have a need for political units the size of France or Virginia? Or is it a mixture of scales that we really need, one that moves effectively between the local and the global?

As to other forms of democracy, we discussed one from history and another made possible by today’s technology.

Sortition is a system of choosing political officials randomly to deliberate on a particular issue. We use it today when we pick juries, but both the ancient Greeks and some cities in Renaissance Italy used it to select major political officials. Today, several countries—largely in Europe—are using sortition for some policy decisions. We might randomly choose a few hundred people, representative of the population, to spend a few weeks being briefed by experts and debating the problem—and then decide on environmental regulations, or a budget, or pretty much anything.

Liquid democracy does away with elections altogether. Everyone has a vote, and they can keep the power to cast it themselves or assign it to another person as a proxy. There are no set elections; anyone can reassign their proxy at any time. And there’s no reason to make this assignment all or nothing. Perhaps proxies could specialize: one set of people focused on economic issues, another group on health and a third bunch on national defense. Then regular people could assign their votes to whichever of the proxies most closely matched their views on each individual matter—or step forward with their own views and begin collecting proxy support from other people.

This all brings up another question: Who gets to participate? And, more generally, whose interests are taken into account? Early democracies were really nothing of the sort: They limited participation by gender, race and land ownership.

We should debate lowering the voting age, but even without voting we recognize that children too young to vote have rights—and, in some cases, so do other species. Should future generations get a “voice,” whatever that means? What about nonhumans or whole ecosystems?

Should everyone get the same voice? Right now in the US, the outsize effect of money in politics gives the wealthy disproportionate influence. Should we encode that explicitly? Maybe younger people should get a more powerful vote than everyone else. Or maybe older people should.

Those questions lead to ones about the limits of democracy. All democracies have boundaries limiting what the majority can decide. We all have rights: the things that cannot be taken away from us. We cannot vote to put someone in jail, for example.

But while we can’t vote a particular publication out of existence, we can to some degree regulate speech. In this hypothetical community, what are our rights as individuals? What are the rights of society that supersede those of individuals?

Personally, I was most interested in how these systems fail. As a security technologist, I study how complex systems are subverted—hacked, in my parlance—for the benefit of a few at the expense of the many. Think tax loopholes, or tricks to avoid government regulation. I want any government system to be resilient in the face of that kind of trickery.

Or, to put it another way, I want the interests of each individual to align with the interests of the group at every level. We’ve never had a system of government with that property before—even equal protection guarantees and First Amendment rights exist in a competitive framework that puts individuals’ interests in opposition to one another. But—in the age of such existential risks as climate and biotechnology and maybe AI—aligning interests is more important than ever.

Our workshop didn’t produce any answers; that wasn’t the point. Our current discourse is filled with suggestions on how to patch our political system. People regularly debate changes to the Electoral College, or the process of creating voting districts, or term limits. But those are incremental changes.

It’s hard to find people who are thinking more radically: looking beyond the horizon for what’s possible eventually. And while true innovation in politics is a lot harder than innovation in technology, especially without a violent revolution forcing change, it’s something that we as a species are going to have to get good at—one way or another.

This essay previously appeared in The Conversation.

Posted on August 23, 2023 at 7:06 AM • View Comments

Political Milestones for AI

ChatGPT was released just nine months ago, and we are still learning how it will affect our daily lives, our careers, and even our systems of self-governance.

But when it comes to how AI may threaten our democracy, much of the public conversation lacks imagination. People talk about the danger of campaigns that attack opponents with fake images (or fake audio or video) because we already have decades of experience dealing with doctored images. We’re on the lookout for foreign governments that spread misinformation because we were traumatized by the 2016 US presidential election. And we worry that AI-generated opinions will swamp the political preferences of real people because we’ve seen political “astroturfing”—the use of fake online accounts to give the illusion of support for a policy—grow for decades.

Threats of this sort seem urgent and disturbing because they’re salient. We know what to look for, and we can easily imagine their effects.

The truth is, the future will be much more interesting. And even some of the most stupendous potential impacts of AI on politics won’t be all bad. We can draw some fairly straight lines between the current capabilities of AI tools and real-world outcomes that, by the standards of current public understanding, seem truly startling.

With this in mind, we propose six milestones that will herald a new era of democratic politics driven by AI. All feel achievable—perhaps not with today’s technology and levels of AI adoption, but very possibly in the near future.

Good benchmarks should be meaningful, representing significant outcomes that come with real-world consequences. They should be plausible; they must be realistically achievable in the foreseeable future. And they should be observable—we should be able to recognize when they’ve been achieved.

Worries about AI swaying an election will very likely fail the observability test. While the risks of election manipulation through the robotic promotion of a candidate’s or party’s interests is a legitimate threat, elections are massively complex. Just as the debate continues to rage over why and how Donald Trump won the presidency in 2016, we’re unlikely to be able to attribute a surprising electoral outcome to any particular AI intervention.

Thinking further into the future: Could an AI candidate ever be elected to office? In the world of speculative fiction, from The Twilight Zone to Black Mirror, there is growing interest in the possibility of an AI or technologically assisted, otherwise-not-traditionally-eligible candidate winning an election. In an era where deepfaked videos can misrepresent the views and actions of human candidates and human politicians can choose to be represented by AI avatars or even robots, it is certainly possible for an AI candidate to mimic the media presence of a politician. Virtual politicians have received votes in national elections, for example in Russia in 2017. But this doesn’t pass the plausibility test. The voting public and legal establishment are likely to accept more and more automation and assistance supported by AI, but the age of non-human elected officials is far off.

Let’s start with some milestones that are already on the cusp of reality. These are achievements that seem well within the technical scope of existing AI technologies and for which the groundwork has already been laid.

Milestone #1: The acceptance by a legislature or agency of a testimony or comment generated by, and submitted under the name of, an AI.

Arguably, we’ve already seen legislation drafted by AI, albeit under the direction of human users and introduced by human legislators. After some early examples of bills written by AIs were introduced in Massachusetts and the US House of Representatives, many major legislative bodies have had their “first bill written by AI,” “used ChatGPT to generate committee remarks,” or “first floor speech written by AI” events.

Many of these bills and speeches are more stunt than serious, and they have received more criticism than consideration. They are short, have trivial levels of policy substance, or were heavily edited or guided by human legislators (through highly specific prompts to large language model-based AI tools like ChatGPT).

The interesting milestone along these lines will be the acceptance of testimony on legislation, or a comment submitted to an agency, drafted entirely by AI. To be sure, a large fraction of all writing going forward will be assisted by—and will truly benefit from—AI assistive technologies. So to avoid making this milestone trivial, we have to add the second clause: “submitted under the name of the AI.”

What would make this benchmark significant is the submission under the AI’s own name; that is, the acceptance by a governing body of the AI as proffering a legitimate perspective in public debate. Regardless of the public fervor over AI, this one won’t take long. The New York Times has published a letter under the name of ChatGPT (responding to an opinion piece we wrote), and legislators are already turning to AI to write high-profile opening remarks at committee hearings.

Milestone #2: The adoption of the first novel legislative amendment to a bill written by AI.

Moving beyond testimony, there is an immediate pathway for AI-generated policies to become law: microlegislation. This involves making tweaks to existing laws or bills that are tuned to serve some particular interest. It is a natural starting point for AI because it’s tightly scoped, involving small changes guided by a clear directive associated with a well-defined purpose.

By design, microlegislation is often implemented surreptitiously. It may even be filed anonymously within a deluge of other amendments to obscure its intended beneficiary. For that reason, microlegislation can often be bad for society, and it is ripe for exploitation by generative AI that would otherwise be subject to heavy scrutiny from a polity on guard for risks posed by AI.

Milestone #3: AI-generated political messaging outscores campaign consultant recommendations in poll testing.

Some of the most important near-term implications of AI for politics will happen largely behind closed doors. Like everyone else, political campaigners and pollsters will turn to AI to help with their jobs. We’re already seeing campaigners turn to AI-generated images to manufacture social content and pollsters simulate results using AI-generated respondents.

The next step in this evolution is political messaging developed by AI. A mainstay of the campaigner’s toolbox today is the message testing survey, where a few alternate formulations of a position are written down and tested with audiences to see which will generate more attention and a more positive response. Just as an experienced political pollster can anticipate effective messaging strategies pretty well based on observations from past campaigns and their impression of the state of the public debate, so can an AI trained on reams of public discourse, campaign rhetoric, and political reporting.

With these near-term milestones firmly in sight, let’s look further to some truly revolutionary possibilities. While these concepts may have seemed absurd just a year ago, they are increasingly conceivable with either current or near-future technologies.

Milestone #4: AI creates a political party with its own platform, attracting human candidates who win elections.

While an AI is unlikely to be allowed to run for and hold office, it is plausible that one may be able to found a political party. An AI could generate a political platform calculated to attract the interest of some cross-section of the public and, acting independently or through a human intermediary (hired help, like a political consultant or legal firm), could register formally as a political party. It could collect signatures to win a place on ballots and attract human candidates to run for office under its banner.

A big step in this direction has already been taken, via the campaign of the Danish Synthetic Party in 2022. An artist collective in Denmark created an AI chatbot to interact with human members of its community on Discord, exploring political ideology in conversation with them and on the basis of an analysis of historical party platforms in the country. All this happened with earlier generations of general purpose AI, not current systems like ChatGPT. However, the party failed to receive enough signatures to earn a spot on the ballot, and therefore did not win parliamentary representation.

Future AI-led efforts may succeed. One could imagine a generative AI with skills at the level of or beyond today’s leading technologies could formulate a set of policy positions targeted to build support among people of a specific demographic, or even an effective consensus platform capable of attracting broad-based support. Particularly in a European-style multiparty system, we can imagine a new party with a strong news hook—an AI at its core—winning attention and votes.

Milestone #5: AI autonomously generates profit and makes political campaign contributions.

Let’s turn next to the essential capability of modern politics: fundraising. “An entity capable of directing contributions to a campaign fund” might be a realpolitik definition of a political actor, and AI is potentially capable of this.

Like a human, an AI could conceivably generate contributions to a political campaign in a variety of ways. It could take a seed investment from a human controlling the AI and invest it to yield a return. It could start a business that generates revenue. There is growing interest and experimentation in auto-hustling: AI agents that set about autonomously growing businesses or otherwise generating profit. While ChatGPT-generated businesses may not yet have taken the world by storm, this possibility is in the same spirit as the algorithmic agents powering modern high-speed trading and so-called autonomous finance capabilities that are already helping to automate business and financial decisions.

Or, like most political entrepreneurs, AI could generate political messaging to convince humans to spend their own money on a defined campaign or cause. The AI would likely need to have some humans in the loop, and register its activities to the government (in the US context, as officers of a 501(c)(4) or political action committee).

Milestone #6: AI achieves a coordinated policy outcome across multiple jurisdictions.

Lastly, we come to the most meaningful of impacts: achieving outcomes in public policy. Even if AI cannot—now or in the future—be said to have its own desires or preferences, it could be programmed by humans to have a goal, such as lowering taxes or relieving a market regulation.

An AI has many of the same tools humans use to achieve these ends. It may advocate, formulating messaging and promoting ideas through digital channels like social media posts and videos. It may lobby, directing ideas and influence to key policymakers, even writing legislation. It may spend; see milestone #5.

The “multiple jurisdictions” piece is key to this milestone. A single law passed may be reasonably attributed to myriad factors: a charismatic champion, a political movement, a change in circumstances. The influence of any one actor, such as an AI, will be more demonstrable if it is successful simultaneously in many different places. And the digital scalability of AI gives it a special advantage in achieving these kinds of coordinated outcomes.

The greatest challenge to most of these milestones is their observability: will we know it when we see it? The first campaign consultant whose ideas lose out to an AI may not be eager to report that fact. Neither will the campaign. Regarding fundraising, it’s hard enough for us to track down the human actors who are responsible for the “dark money” contributions controlling much of modern political finance; will we know if a future dominant force in fundraising for political action committees is an AI?

We’re likely to observe some of these milestones indirectly. At some point, perhaps politicians’ dollars will start migrating en masse to AI-based campaign consultancies and, eventually, we may realize that political movements sweeping across states or countries have been AI-assisted.

While the progression of technology is often unsettling, we need not fear these milestones. A new political platform that wins public support is itself a neutral proposition; it may lead to good or bad policy outcomes. Likewise, a successful policy program may or may not be beneficial to one group of constituents or another.

We think the six milestones outlined here are among the most viable and meaningful upcoming interactions between AI and democracy, but they are hardly the only scenarios to consider. The point is that our AI-driven political future will involve far more than deepfaked campaign ads and manufactured letter-writing campaigns. We should all be thinking more creatively about what comes next and be vigilant in steering our politics toward the best possible ends, no matter their means.

This essay was written with Nathan Sanders, and previously appeared in MIT Technology Review.

Posted on August 4, 2023 at 7:07 AM • View Comments

The Need for Trustworthy AI

If you ask Alexa, Amazon’s voice assistant AI system, whether Amazon is a monopoly, it responds by saying it doesn’t know. It doesn’t take much to make it lambaste the other tech giants, but it’s silent about its own corporate parent’s misdeeds.

When Alexa responds in this way, it’s obvious that it is putting its developer’s interests ahead of yours. Usually, though, it’s not so obvious whom an AI system is serving. To avoid being exploited by these systems, people will need to learn to approach AI skeptically. That means deliberately constructing the input you give it and thinking critically about its output.

Newer generations of AI models, with their more sophisticated and less rote responses, are making it harder to tell who benefits when they speak. Internet companies’ manipulating what you see to serve their own interests is nothing new. Google’s search results and your Facebook feed are filled with paid entries. Facebook, TikTok and others manipulate your feeds to maximize the time you spend on the platform, which means more ad views, over your well-being.

What distinguishes AI systems from these other internet services is how interactive they are, and how these interactions will increasingly become like relationships. It doesn’t take much extrapolation from today’s technologies to envision AIs that will plan trips for you, negotiate on your behalf or act as therapists and life coaches.

They are likely to be with you 24/7, know you intimately, and be able to anticipate your needs. This kind of conversational interface to the vast network of services and resources on the web is within the capabilities of existing generative AIs like ChatGPT. They are on track to become personalized digital assistants.

As a security expert and data scientist, we believe that people who come to rely on these AIs will have to trust them implicitly to navigate daily life. That means they will need to be sure the AIs aren’t secretly working for someone else. Across the internet, devices and services that seem to work for you already secretly work against you. Smart TVs spy on you. Phone apps collect and sell your data. Many apps and websites manipulate you through dark patterns, design elements that deliberately mislead, coerce or deceive website visitors. This is surveillance capitalism, and AI is shaping up to be part of it.

Quite possibly, it could be much worse with AI. For that AI digital assistant to be truly useful, it will have to really know you. Better than your phone knows you. Better than Google search knows you. Better, perhaps, than your close friends, intimate partners and therapist know you.

You have no reason to trust today’s leading generative AI tools. Leave aside the hallucinations, the made-up “facts” that GPT and other large language models produce. We expect those will be largely cleaned up as the technology improves over the next few years.

But you don’t know how the AIs are configured: how they’ve been trained, what information they’ve been given, and what instructions they’ve been commanded to follow. For example, researchers uncovered the secret rules that govern the Microsoft Bing chatbot’s behavior. They’re largely benign but can change at any time.

Many of these AIs are created and trained at enormous expense by some of the largest tech monopolies. They’re being offered to people to use free of charge, or at very low cost. These companies will need to monetize them somehow. And, as with the rest of the internet, that somehow is likely to include surveillance and manipulation.

Imagine asking your chatbot to plan your next vacation. Did it choose a particular airline or hotel chain or restaurant because it was the best for you or because its maker got a kickback from the businesses? As with paid results in Google search, newsfeed ads on Facebook and paid placements on Amazon queries, these paid influences are likely to get more surreptitious over time.

If you’re asking your chatbot for political information, are the results skewed by the politics of the corporation that owns the chatbot? Or the candidate who paid it the most money? Or even the views of the demographic of the people whose data was used in training the model? Is your AI agent secretly a double agent? Right now, there is no way to know.

We believe that people should expect more from the technology and that tech companies and AIs can become more trustworthy. The European Union’s proposed AI Act takes some important steps, requiring transparency about the data used to train AI models, mitigation for potential bias, disclosure of foreseeable risks and reporting on industry standard tests.

Most existing AIs fail to comply with this emerging European mandate, and, despite recent prodding from Senate Majority Leader Chuck Schumer, the US is far behind on such regulation.

The AIs of the future should be trustworthy. Unless and until the government delivers robust consumer protections for AI products, people will be on their own to guess at the potential risks and biases of AI, and to mitigate their worst effects on people’s experiences with them.

So when you get a travel recommendation or political information from an AI tool, approach it with the same skeptical eye you would a billboard ad or a campaign volunteer. For all its technological wizardry, the AI tool may be little more than the same.

This essay was written with Nathan Sanders, and previously appeared on The Conversation.

Posted on August 3, 2023 at 7:17 AM • View Comments

AI and Microdirectives

Imagine a future in which AIs automatically interpret—and enforce—laws.

All day and every day, you constantly receive highly personalized instructions for how to comply with the law, sent directly by your government and law enforcement. You’re told how to cross the street, how fast to drive on the way to work, and what you’re allowed to say or do online—if you’re in any situation that might have legal implications, you’re told exactly what to do, in real time.

Imagine that the computer system formulating these personal legal directives at mass scale is so complex that no one can explain how it reasons or works. But if you ignore a directive, the system will know, and it’ll be used as evidence in the prosecution that’s sure to follow.

This future may not be far off—automatic detection of lawbreaking is nothing new. Speed cameras and traffic-light cameras have been around for years. These systems automatically issue citations to the car’s owner based on the license plate. In such cases, the defendant is presumed guilty unless they prove otherwise, by naming and notifying the driver.

In New York, AI systems equipped with facial recognition technology are being used by businesses to identify shoplifters. Similar AI-powered systems are being used by retailers in Australia and the United Kingdom to identify shoplifters and provide real-time tailored alerts to employees or security personnel. China is experimenting with even more powerful forms of automated legal enforcement and targeted surveillance.

Breathalyzers are another example of automatic detection. They estimate blood alcohol content by calculating the number of alcohol molecules in the breath via an electrochemical reaction or infrared analysis (they’re basically computers with fuel cells or spectrometers attached). And they’re not without controversy: Courts across the country have found serious flaws and technical deficiencies with Breathalyzer devices and the software that powers them. Despite this, criminal defendants struggle to obtain access to devices or their software source code, with Breathalyzer companies and courts often refusing to grant such access. In the few cases where courts have actually ordered such disclosures, that has usually followed costly legal battles spanning many years.

AI is about to make this issue much more complicated, and could drastically expand the types of laws that can be enforced in this manner. Some legal scholars predict that computationally personalized law and its automated enforcement are the future of law. These would be administered by what Anthony Casey and Anthony Niblett call “microdirectives,” which provide individualized instructions for legal compliance in a particular scenario.

Made possible by advances in surveillance, communications technologies, and big-data analytics, microdirectives will be a new and predominant form of law shaped largely by machines. They are “micro” because they are not impersonal general rules or standards, but tailored to one specific circumstance. And they are “directives” because they prescribe action or inaction required by law.

A Digital Millennium Copyright Act takedown notice is a present-day example of a microdirective. The DMCA’s enforcement is almost fully automated, with copyright “bots” constantly scanning the internet for copyright-infringing material, and automatically sending literally hundreds of millions of DMCA takedown notices daily to platforms and users. A DMCA takedown notice is tailored to the recipient’s specific legal circumstances. It also directs action—remove the targeted content or prove that it’s not infringing—based on the law.

It’s easy to see how the AI systems being deployed by retailers to identify shoplifters could be redesigned to employ microdirectives. In addition to alerting business owners, the systems could also send alerts to the identified persons themselves, with tailored legal directions or notices.

A future where AIs interpret, apply, and enforce most laws at societal scale like this will exponentially magnify problems around fairness, transparency, and freedom. Forget about software transparency—well-resourced AI firms, like Breathalyzer companies today, would no doubt ferociously guard their systems for competitive reasons. These systems would likely be so complex that even their designers would not be able to explain how the AIs interpret and apply the law—something we’re already seeing with today’s deep learning neural network systems, which are unable to explain their reasoning.

Even the law itself could become hopelessly vast and opaque. Legal microdirectives sent en masse for countless scenarios, each representing authoritative legal findings formulated by opaque computational processes, could create an expansive and increasingly complex body of law that would grow ad infinitum.

And this brings us to the heart of the issue: If you’re accused by a computer, are you entitled to review that computer’s inner workings and potentially challenge its accuracy in court? What does cross-examination look like when the prosecutor’s witness is a computer? How could you possibly access, analyze, and understand all microdirectives relevant to your case in order to challenge the AI’s legal interpretation? How could courts hope to ensure equal application of the law? Like the man from the country in Franz Kafka’s parable in The Trial, you’d die waiting for access to the law, because the law is limitless and incomprehensible.

This system would present an unprecedented threat to freedom. Ubiquitous AI-powered surveillance in society will be necessary to enable such automated enforcement. On top of that, research—including empirical studies conducted by one of us (Penney)—has shown that personalized legal threats or commands that originate from sources of authority—state or corporate—can have powerful chilling effects on people’s willingness to speak or act freely. Imagine receiving very specific legal instructions from law enforcement about what to say or do in a situation: Would you feel you had a choice to act freely?

This is a vision of AI’s invasive and Byzantine law of the future that chills to the bone. It would be unlike any other law system we’ve seen before in human history, and far more dangerous for our freedoms. Indeed, some legal scholars argue that this future would effectively be the death of law.

Yet it is not a future we must endure. Proposed bans on surveillance technology like facial recognition systems can be expanded to cover those enabling invasive automated legal enforcement. Laws can mandate interpretability and explainability for AI systems to ensure everyone can understand and explain how the systems operate. If a system is too complex, maybe it shouldn’t be deployed in legal contexts. Enforcement by personalized legal processes needs to be highly regulated to ensure oversight, and should be employed only where chilling effects are less likely, like in benign government administration or regulatory contexts where fundamental rights and freedoms are not at risk.

AI will inevitably change the course of law. It already has. But we don’t have to accept its most extreme and maximal instantiations, either today or tomorrow.

This essay was written with Jon Penney, and previously appeared on Slate.com.

Posted on July 21, 2023 at 7:16 AM • View Comments

The AI Dividend

For four decades, Alaskans have opened their mailboxes to find checks waiting for them, their cut of the black gold beneath their feet. This is Alaska’s Permanent Fund, funded by the state’s oil revenues and paid to every Alaskan each year. We’re now in a different sort of resource rush, with companies peddling bits instead of oil: generative AI.

Everyone is talking about these new AI technologies—like ChatGPT—and AI companies are touting their awesome power. But they aren’t talking about how that power comes from all of us. Without all of our writings and photos that AI companies are using to train their models, they would have nothing to sell. Big Tech companies are currently taking the work of the American people, without our knowledge and consent, without licensing it, and are pocketing the proceeds.

You are owed profits for your data that powers today’s AI, and we have a way to make that happen. We call it the AI Dividend.

Our proposal is simple, and harkens back to the Alaskan plan. When Big Tech companies produce output from generative AI that was trained on public data, they would pay a tiny licensing fee, by the word or pixel or relevant unit of data. Those fees would go into the AI Dividend fund. Every few months, the Commerce Department would send out the entirety of the fund, split equally, to every resident nationwide. That’s it.

There’s no reason to complicate it further. Generative AI needs a wide variety of data, which means all of us are valuable—not just those of us who write professionally, or prolifically, or well. Figuring out who contributed to which words the AIs output would be both challenging and invasive, given that even the companies themselves don’t quite know how their models work. Paying the dividend to people in proportion to the words or images they create would just incentivize them to create endless drivel, or worse, use AI to create that drivel. The bottom line for Big Tech is that if their AI model was created using public data, they have to pay into the fund. If you’re an American, you get paid from the fund.

Under this plan, hobbyists and American small businesses would be exempt from fees. Only Big Tech companies—those with substantial revenue—would be required to pay into the fund. And they would pay at the point of generative AI output, such as from ChatGPT, Bing, Bard, or their embedded use in third-party services via Application Programming Interfaces.

Our proposal also includes a compulsory licensing plan. By agreeing to pay into this fund, AI companies will receive a license that allows them to use public data when training their AI. This won’t supersede normal copyright law, of course. If a model starts producing copyright material beyond fair use, that’s a separate issue.

Using today’s numbers, here’s what it would look like. The licensing fee could be small, starting at $0.001 per word generated by AI. A similar type of fee would be applied to other categories of generative AI outputs, such as images. That’s not a lot, but it adds up. Since most of Big Tech has started integrating generative AI into products, these fees would mean an annual dividend payment of a couple hundred dollars per person.

The idea of paying you for your data isn’t new, and some companies have tried to do it themselves for users who opted in. And the idea of the public being repaid for use of their resources goes back to well before Alaska’s oil fund. But generative AI is different: It uses data from all of us whether we like it or not, it’s ubiquitous, and it’s potentially immensely valuable. It would cost Big Tech companies a fortune to create a synthetic equivalent to our data from scratch, and synthetic data would almost certainly result in worse output. They can’t create good AI without us.

Our plan would apply to generative AI used in the US. It also only issues a dividend to Americans. Other countries can create their own versions, applying a similar fee to AI used within their borders. Just like an American company collects VAT for services sold in Europe, but not here, each country can independently manage their AI policy.

Don’t get us wrong; this isn’t an attempt to strangle this nascent technology. Generative AI has interesting, valuable, and possibly transformative uses, and this policy is aligned with that future. Even with the fees of the AI Dividend, generative AI will be cheap and will only get cheaper as technology improves. There are also risks—both every day and esoteric—posed by AI, and the government may need to develop policies to remedy any harms that arise.

Our plan can’t make sure there are no downsides to the development of AI, but it would ensure that all Americans will share in the upsides—particularly since this new technology isn’t possible without our contribution.

This essay was written with Barath Raghavan, and previously appeared on Politico.com.

Posted on July 7, 2023 at 7:11 AM • View Comments

On the Need for an AI Public Option

Artificial intelligence will bring great benefits to all of humanity. But do we really want to entrust this revolutionary technology solely to a small group of US tech companies?

Silicon Valley has produced no small number of moral disappointments. Google retired its “don’t be evil” pledge before firing its star ethicist. Self-proclaimed “free speech absolutist” Elon Musk bought Twitter in order to censor political speech, retaliate against journalists, and ease access to the platform for Russian and Chinese propagandists. Facebook lied about how it enabled Russian interference in the 2016 US presidential election and paid a public relations firm to blame Google and George Soros instead.

These and countless other ethical lapses should prompt us to consider whether we want to give technology companies further abilities to learn our personal details and influence our day-to-day decisions. Tech companies can already access our daily whereabouts and search queries. Digital devices monitor more and more aspects of our lives: We have cameras in our homes and heartbeat sensors on our wrists sending what they detect to Silicon Valley.

Now, tech giants are developing ever more powerful AI systems that don’t merely monitor you; they actually interact with you—and with others on your behalf. If searching on Google in the 2010s was like being watched on a security camera, then using AI in the late 2020s will be like having a butler. You will willingly include them in every conversation you have, everything you write, every item you shop for, every want, every fear, everything. It will never forget. And, despite your reliance on it, it will be surreptitiously working to further the interests of one of these for-profit corporations.

There’s a reason Google, Microsoft, Facebook, and other large tech companies are leading the AI revolution: Building a competitive large language model (LLM) like the one powering ChatGPT is incredibly expensive. It requires upward of $100 million in computational costs for a single model training run, in addition to access to large amounts of data. It also requires technical expertise, which, while increasingly open and available, remains heavily concentrated in a small handful of companies. Efforts to disrupt the AI oligopoly by funding start-ups are self-defeating as Big Tech profits from the cloud computing services and AI models powering those start-ups—and often ends up acquiring the start-ups themselves.

Yet corporations aren’t the only entities large enough to absorb the cost of large-scale model training. Governments can do it, too. It’s time to start taking AI development out of the exclusive hands of private companies and bringing it into the public sector. The United States needs a government-funded-and-directed AI program to develop widely reusable models in the public interest, guided by technical expertise housed in federal agencies.

So far, the AI regulation debate in Washington has focused on the governance of private-sector activity—which the US Congress is in no hurry to advance. Congress should not only hurry up and push AI regulation forward but also go one step further and develop its own programs for AI. Legislators should reframe the AI debate from one about public regulation to one about public development.

The AI development program could be responsive to public input and subject to political oversight. It could be directed to respond to critical issues such as privacy protection, underpaid tech workers, AI’s horrendous carbon emissions, and the exploitation of unlicensed data. Compared to keeping AI in the hands of morally dubious tech companies, the public alternative is better both ethically and economically. And the switch should take place soon: By the time AI becomes critical infrastructure, essential to large swaths of economic activity and daily life, it will be too late to get started.

Other countries are already there. China has heavily prioritized public investment in AI research and development by betting on a handpicked set of giant companies that are ostensibly private but widely understood to be an extension of the state. The government has tasked Alibaba, Huawei, and others with creating products that support the larger ecosystem of state surveillance and authoritarianism.

The European Union is also aggressively pushing AI development. The European Commission already invests 1 billion euros per year in AI, with a plan to increase that figure to 20 billion euros annually by 2030. The money goes to a continent-wide network of public research labs, universities, and private companies jointly working on various parts of AI. The Europeans’ focus is on knowledge transfer, developing the technology sector, use of AI in public administration, mitigating safety risks, and preserving fundamental rights. The EU also continues to be at the cutting edge of aggressively regulating both data and AI.

Neither the Chinese nor the European model is necessarily right for the United States. State control of private enterprise remains anathema in American political culture and would struggle to gain mainstream traction. The tech companies—and their supporters in both US political parties—are opposed to robust public governance of AI. But Washington can take inspiration from China and Europe’;s long-range planning and leadership on regulation and public investment. With boosters pointing to hundreds of trillions of dollars of global economic value associated with AI, the stakes of international competition are compelling. As in energy and medical research, which have their own federal agencies in the Department of Energy and the National Institutes of Health, respectively, there is a place for AI research and development inside government.

Beside the moral argument against letting private companies develop AI, there’s a strong economic argument in favor of a public option as well. A publicly funded LLM could serve as an open platform for innovation, helping any small business, nonprofit, or individual entrepreneur to build AI-assisted applications.

There’s also a practical argument. Building AI is within public reach because governments don’t need to own and operate the entire AI supply chain. Chip and computer production, cloud data centers, and various value-added applications—such as those that integrate AI with consumer electronics devices or entertainment software—do not need to be publicly controlled or funded.

One reason to be skeptical of public funding for AI is that it might result in a lower quality and slower innovation, given greater ethical scrutiny, political constraints, and fewer incentives due to a lack of market competition. But even if that is the case, it would be worth broader access to the most important technology of the 21st century. And it is by no means certain that public AI has to be at a disadvantage. The open-source community is proof that it’s not always private companies that are the most innovative.

Those who worry about the quality trade-off might suggest a public buyer model, whereby Washington licenses or buys private language models from Big Tech instead of developing them itself. But that doesn’t go far enough to ensure that the tools are aligned with public priorities and responsive to public needs. It would not give the public detailed insight into or control of the inner workings and training procedures for these models, and it would still require strict and complex regulation.

There is political will to take action to develop AI via public, rather than private, funds—but this does not yet equate to the will to create a fully public AI development agency. A task force created by Congress recommended in January a $2.6 billion federal investment in computing and data resources to prime the AI research ecosystem in the United States. But this investment would largely serve to advance the interests of Big Tech, leaving the opportunity for public ownership and oversight unaddressed.

Nonprofit and academic organizations have already created open-access LLMs. While these should be celebrated, they are not a substitute for a public option. Nonprofit projects are still beholden to private interests, even if they are benevolent ones. These private interests can change without public input, as when OpenAI effectively abandoned its nonprofit origins, and we can’t be sure that their founding intentions or operations will survive market pressures, fickle donors, and changes in leadership.

The US government is by no means a perfect beacon of transparency, a secure and responsible store of our data, or a genuine reflection of the public’s interests. But the risks of placing AI development entirely in the hands of demonstrably untrustworthy Silicon Valley companies are too high. AI will impact the public like few other technologies, so it should also be developed by the public.

This essay was written with Nathan Sanders, and appeared in Foreign Policy.

Posted on June 14, 2023 at 7:02 AM • View Comments

Snowden Ten Years Later

In 2013 and 2014, I wrote extensively about new revelations regarding NSA surveillance based on the documents provided by Edward Snowden. But I had a more personal involvement as well.

I wrote the essay below in September 2013. The New Yorker agreed to publish it, but the Guardian asked me not to. It was scared of UK law enforcement, and worried that this essay would reflect badly on it. And given that the UK police would raid its offices in July 2014, it had legitimate cause to be worried.

Now, ten years later, I offer this as a time capsule of what those early months of Snowden were like.

It’s a surreal experience, paging through hundreds of top-secret NSA documents. You’re peering into a forbidden world: strange, confusing, and fascinating all at the same time.

I had flown down to Rio de Janeiro in late August at the request of Glenn Greenwald. He had been working on the Edward Snowden archive for a couple of months, and had a pile of more technical documents that he wanted help interpreting. According to Greenwald, Snowden also thought that bringing me down was a good idea.

It made sense. I didn’t know either of them, but I have been writing about cryptography, security, and privacy for decades. I could decipher some of the technical language that Greenwald had difficulty with, and understand the context and importance of various document. And I have long been publicly critical of the NSA’s eavesdropping capabilities. My knowledge and expertise could help figure out which stories needed to be reported.

I thought about it a lot before agreeing. This was before David Miranda, Greenwald’s partner, was detained at Heathrow airport by the UK authorities; but even without that, I knew there was a risk. I fly a lot—a quarter of a million miles per year—and being put on a TSA list, or being detained at the US border and having my electronics confiscated, would be a major problem. So would the FBI breaking into my home and seizing my personal electronics. But in the end, that made me more determined to do it.

I did spend some time on the phone with the attorneys recommended to me by the ACLU and the EFF. And I talked about it with my partner, especially when Miranda was detained three days before my departure. Both Greenwald and his employer, the Guardian, are careful about whom they show the documents to. They publish only those portions essential to getting the story out. It was important to them that I be a co-author, not a source. I didn’t follow the legal reasoning, but the point is that the Guardian doesn’t want to leak the documents to random people. It will, however, write stories in the public interest, and I would be allowed to review the documents as part of that process. So after a Skype conversation with someone at the Guardian, I signed a letter of engagement.

And then I flew to Brazil.

I saw only a tiny slice of the documents, and most of what I saw was surprisingly banal. The concerns of the top-secret world are largely tactical: system upgrades, operational problems owing to weather, delays because of work backlogs, and so on. I paged through weekly reports, presentation slides from status meetings, and general briefings to educate visitors. Management is management, even inside the NSA Reading the documents, I felt as though I were sitting through some of those endless meetings.

The meeting presenters try to spice things up. Presentations regularly include intelligence success stories. There were details—what had been found, and how, and where it helped—and sometimes there were attaboys from “customers” who used the intelligence. I’m sure these are intended to remind NSA employees that they’re doing good. It definitely had an effect on me. Those were all things I want the NSA to be doing.

There were so many code names. Everything has one: every program, every piece of equipment, every piece of software. Sometimes code names had their own code names. The biggest secrets seem to be the underlying real-world information: which particular company MONEYROCKET is; what software vulnerability EGOTISTICALGIRAFFE—really, I am not making that one up—is; how TURBINE works. Those secrets collectively have a code name—ECI, for exceptionally compartmented information—and almost never appear in the documents. Chatting with Snowden on an encrypted IM connection, I joked that the NSA cafeteria menu probably has code names for menu items. His response: “Trust me when I say you have no idea.”

Those code names all come with logos, most of them amateurish and a lot of them dumb. Note to the NSA: take some of that more than ten-billion-dollar annual budget and hire yourself a design firm. Really; it’ll pay off in morale.

Once in a while, though, I would see something that made me stop, stand up, and pace around in circles. It wasn’t that what I read was particularly exciting, or important. It was just that it was startling. It changed—ever so slightly—how I thought about the world.

Greenwald said that that reaction was normal when people started reading through the documents.

Intelligence professionals talk about how disorienting it is living on the inside. You read so much classified information about the world’s geopolitical events that you start seeing the world differently. You become convinced that only the insiders know what’s really going on, because the news media is so often wrong. Your family is ignorant. Your friends are ignorant. The world is ignorant. The only thing keeping you from ignorance is that constant stream of classified knowledge. It’s hard not to feel superior, not to say things like “If you only knew what we know” all the time. I can understand how General Keith Alexander, the director of the NSA, comes across as so supercilious; I only saw a minute fraction of that secret world, and I started feeling it.

It turned out to be a terrible week to visit Greenwald, as he was still dealing with the fallout from Miranda’s detention. Two other journalists, one from the Nation and the other from the Hindu, were also in town working with him. A lot of my week involved Greenwald rushing into my hotel room, giving me a thumb drive of new stuff to look through, and rushing out again.

A technician from the Guardian got a search capability working while I was there, and I spent some time with it. Question: when you’re given the capability to search through a database of NSA secrets, what’s the first thing you look for? Answer: your name.

It wasn’t there. Neither were any of the algorithm names I knew, not even algorithms I knew that the US government used.

I tried to talk to Greenwald about his own operational security. It had been incredibly stupid for Miranda to be traveling with NSA documents on the thumb drive. Transferring files electronically is what encryption is for. I told Greenwald that he and Laura Poitras should be sending large encrypted files of dummy documents back and forth every day.

Once, at Greenwald’s home, I walked into the backyard and looked for TEMPEST receivers hiding in the trees. I didn’t find any, but that doesn’t mean they weren’t there. Greenwald has a lot of dogs, but I don’t think that would hinder professionals. I’m sure that a bunch of major governments have a complete copy of everything Greenwald has. Maybe the black bag teams bumped into each other in those early weeks.

I started doubting my own security procedures. Reading about the NSA’s hacking abilities will do that to you. Can it break the encryption on my hard drive? Probably not. Has the company that makes my encryption software deliberately weakened the implementation for it? Probably. Are NSA agents listening in on my calls back to the US? Very probably. Could agents take control of my computer over the Internet if they wanted to? Definitely. In the end, I decided to do my best and stop worrying about it. It was the agency’s documents, after all. And what I was working on would become public in a few weeks.

I wasn’t sleeping well, either. A lot of it was the sheer magnitude of what I saw. It’s not that any of it was a real surprise. Those of us in the information security community had long assumed that the NSA was doing things like this. But we never really sat down and figured out the details, and to have the details confirmed made a big difference. Maybe I can make it clearer with an analogy. Everyone knows that death is inevitable; there’s absolutely no surprise about that. Yet it arrives as a surprise, because we spend most of our lives refusing to think about it. The NSA documents were a bit like that. Knowing that it is surely true that the NSA is eavesdropping on the world, and doing it in such a methodical and robust manner, is very different from coming face-to-face with the reality that it is and the details of how it is doing it.

I also found it incredibly difficult to keep the secrets. The Guardian’s process is slow and methodical. I move much faster. I drafted stories based on what I found. Then I wrote essays about those stories, and essays about the essays. Writing was therapy; I would wake up in the wee hours of the morning, and write an essay. But that put me at least three levels beyond what was published.

Now that my involvement is out, and my first essays are out, I feel a lot better. I’m sure it will get worse again when I find another monumental revelation; there are still more documents to go through.

I’ve heard it said that Snowden wants to damage America. I can say with certainty that he does not. So far, everyone involved in this incident has been incredibly careful about what is released to the public. There are many documents that could be immensely harmful to the US, and no one has any intention of releasing them. The documents the reporters release are carefully redacted. Greenwald and I repeatedly debated with Guardian editors the newsworthiness of story ideas, stressing that we would not expose government secrets simply because they’re interesting.

The NSA got incredibly lucky; this could have ended with a massive public dump like Chelsea Manning’s State Department cables. I suppose it still could. Despite that, I can imagine how this feels to the NSA. It’s used to keeping this stuff behind multiple levels of security: gates with alarms, armed guards, safe doors, and military-grade cryptography. It’s not supposed to be on a bunch of thumb drives in Brazil, Germany, the UK, the US, and who knows where else, protected largely by some random people’s opinions about what should or should not remain secret. This is easily the greatest intelligence failure in the history of ever. It’s amazing that one person could have had so much access with so little accountability, and could sneak all of this data out without raising any alarms. The odds are close to zero that Snowden is the first person to do this; he’s just the first person to make public that he did. It’s a testament to General Alexander’s power that he hasn’t been forced to resign.

It’s not that we weren’t being careful about security, it’s that our standards of care are so different. From the NSA’s point of view, we’re all major security risks, myself included. I was taking notes about classified material, crumpling them up, and throwing them into the wastebasket. I was printing documents marked “TOP SECRET/COMINT/NOFORN” in a hotel lobby. And once, I took the wrong thumb drive with me to dinner, accidentally leaving the unencrypted one filled with top-secret documents in my hotel room. It was an honest mistake; they were both blue.

If I were an NSA employee, the policy would be to fire me for that alone.

Many have written about how being under constant surveillance changes a person. When you know you’re being watched, you censor yourself. You become less open, less spontaneous. You look at what you write on your computer and dwell on what you’ve said on the telephone, wonder how it would sound taken out of context, from the perspective of a hypothetical observer. You’re more likely to conform. You suppress your individuality. Even though I have worked in privacy for decades, and already knew a lot about the NSA and what it does, the change was palpable. That feeling hasn’t faded. I am now more careful about what I say and write. I am less trusting of communications technology. I am less trusting of the computer industry.

After much discussion, Greenwald and I agreed to write three stories together to start. All of those are still in progress. In addition, I wrote two commentaries on the Snowden documents that were recently made public. There’s a lot more to come; even Greenwald hasn’t looked through everything.

Since my trip to Brazil [one month before], I’ve flown back to the US once and domestically seven times—all without incident. I’m not on any list yet. At least, none that I know about.

As it happened, I didn’t write much more with Greenwald or the Guardian. Those two had a falling out, and by the time everything settled and both began writing about the documents independently—Greenwald at the newly formed website the Intercept—I got cut out of the process somehow. I remember hearing that Greenwald was annoyed with me, but I never learned the reason. We haven’t spoken since.

Still, I was happy with the one story I was part of: how the NSA hacks Tor. I consider it a personal success that I pushed the Guardian to publish NSA documents detailing QUANTUM. I don’t think that would have gotten out any other way. And I still use those pages today when I teach cybersecurity to policymakers at the Harvard Kennedy School.

Other people wrote about the Snowden files, and wrote a lot. It was a slow trickle at first, and then a more consistent flow. Between Greenwald, Bart Gellman, and the Guardian reporters, there ended up being steady stream of news. (Bart brought in Ashkan Soltani to help him with the technical aspects, which was a great move on his part, even if it cost Ashkan a government job later.) More stories were covered by other publications.

It started getting weird. Both Greenwald and Gellman held documents back so they could publish them in their books. Jake Appelbaum, who had not yet been accused of sexual assault by multiple women, was working with Laura Poitras. He partnered with Spiegel to release an implant catalog from the NSA’s Tailored Access Operations group. To this day, I am convinced that that document was not in the Snowden archives: that Jake got it somehow, and it was released with the implication that it was from Edward Snowden. I thought it was important enough that I started writing about each item in that document in my blog: “NSA Exploit of the Week.” That got my website blocked by the DoD: I keep a framed print of the censor’s message on my wall.

Perhaps the most surreal document disclosures were when artists started writing fiction based on the documents. This was in 2016, when Poitras built a secure room in New York to house the documents. By then, the documents were years out of date. And now they’re over a decade out of date. (They were leaked in 2013, but most of them were from 2012 or before.)

I ended up being something of a public ambassador for the documents. When I got back from Rio, I gave talks at a private conference in Woods Hole, the Berkman Center at Harvard, something called the Congress and Privacy and Surveillance in Geneva, events at both CATO and New America in DC, an event at the University of Pennsylvania, an event at EPIC and a “Stop Watching Us” rally in DC, the RISCS conference in London, the ISF in Paris, and…then…at the IETF meeting in Vancouver in November 2013. (I remember little of this; I am reconstructing it all from my calendar.)

What struck me at the IETF was the indignation in the room, and the calls to action. And there was action, across many fronts. We technologists did a lot to help secure the Internet, for example.

The government didn’t do its part, though. Despite the public outcry, investigations by Congress, pronouncements by President Obama, and federal court rulings, I don’t think much has changed. The NSA canceled a program here and a program there, and it is now more public about defense. But I don’t think it is any less aggressive about either bulk or targeted surveillance. Certainly its government authorities haven’t been restricted in any way. And surveillance capitalism is still the business model of the Internet.

And Edward Snowden? We were in contact for a while on Signal. I visited him once in Moscow, in 2016. And I had him do an guest lecture to my class at Harvard for a few years, remotely by Jitsi. Afterwards, I would hold a session where I promised to answer every question he would evade or not answer, explain every response he did give, and be candid in a way that someone with an outstanding arrest warrant simply cannot. Sometimes I thought I could channel Snowden better than he could.

But now it’s been a decade. Everything he knows is old and out of date. Everything we know is old and out of date. The NSA suffered an even worse leak of its secrets by the Russians, under the guise of the Shadow Brokers, in 2016 and 2017. The NSA has rebuilt. It again has capabilities we can only surmise.

This essay previously appeared in an IETF publication, as part of an Edward Snowden ten-year retrospective.

EDITED TO ADD (6/7): Conversation between Snowden, Greenwald, and Poitras.

Posted on June 6, 2023 at 7:17 AM • View Comments

Open-Source LLMs

In February, Meta released its large language model: LLaMA. Unlike OpenAI and its ChatGPT, Meta didn’t just give the world a chat window to play with. Instead, it released the code into the open-source community, and shortly thereafter the model itself was leaked. Researchers and programmers immediately started modifying it, improving it, and getting it to do things no one else anticipated. And their results have been immediate, innovative, and an indication of how the future of this technology is going to play out. Training speeds have hugely increased, and the size of the models themselves has shrunk to the point that you can create and run them on a laptop. The world of AI research has dramatically changed.

This development hasn’t made the same splash as other corporate announcements, but its effects will be much greater. It will wrest power from the large tech corporations, resulting in both much more innovation and a much more challenging regulatory landscape. The large corporations that had controlled these models warn that this free-for-all will lead to potentially dangerous developments, and problematic uses of the open technology have already been documented. But those who are working on the open models counter that a more democratic research environment is better than having this powerful technology controlled by a small number of corporations.

The power shift comes from simplification. The LLMs built by OpenAI and Google rely on massive data sets, measured in the tens of billions of bytes, computed on by tens of thousands of powerful specialized processors producing models with billions of parameters. The received wisdom is that bigger data, bigger processing, and larger parameter sets were all needed to make a better model. Producing such a model requires the resources of a corporation with the money and computing power of a Google or Microsoft or Meta.

But building on public models like Meta’s LLaMa, the open-source community has innovated in ways that allow results nearly as good as the huge models—but run on home machines with common data sets. What was once the reserve of the resource-rich has become a playground for anyone with curiosity, coding skills, and a good laptop. Bigger may be better, but the open-source community is showing that smaller is often good enough. This opens the door to more efficient, accessible, and resource-friendly LLMs.

More importantly, these smaller and faster LLMs are much more accessible and easier to experiment with. Rather than needing tens of thousands of machines and millions of dollars to train a new model, an existing model can now be customized on a mid-priced laptop in a few hours. This fosters rapid innovation.

It also takes control away from large companies like Google and OpenAI. By providing access to the underlying code and encouraging collaboration, open-source initiatives empower a diverse range of developers, researchers, and organizations to shape the technology. This diversification of control helps prevent undue influence, and ensures that the development and deployment of AI technologies align with a broader set of values and priorities. Much of the modern internet was built on open-source technologies from the LAMP (Linux, Apache, mySQL, and PHP/PERL/Python) stack—a suite of applications often used in web development. This enabled sophisticated websites to be easily constructed, all with open-source tools that were built by enthusiasts, not companies looking for profit. Facebook itself was originally built using open-source PHP.

But being open-source also means that there is no one to hold responsible for misuse of the technology. When vulnerabilities are discovered in obscure bits of open-source technology critical to the functioning of the internet, often there is no entity responsible for fixing the bug. Open-source communities span countries and cultures, making it difficult to ensure that any country’s laws will be respected by the community. And having the technology open-sourced means that those who wish to use it for unintended, illegal, or nefarious purposes have the same access to the technology as anyone else.

This, in turn, has significant implications for those who are looking to regulate this new and powerful technology. Now that the open-source community is remixing LLMs, it’s no longer possible to regulate the technology by dictating what research and development can be done; there are simply too many researchers doing too many different things in too many different countries. The only governance mechanism available to governments now is to regulate usage (and only for those who pay attention to the law), or to offer incentives to those (including startups, individuals, and small companies) who are now the drivers of innovation in the arena. Incentives for these communities could take the form of rewards for the production of particular uses of the technology, or hackathons to develop particularly useful applications. Sticks are hard to use—instead, we need appealing carrots.

It is important to remember that the open-source community is not always motivated by profit. The members of this community are often driven by curiosity, the desire to experiment, or the simple joys of building. While there are companies that profit from supporting software produced by open-source projects like Linux, Python, or the Apache web server, those communities are not profit driven.

And there are many open-source models to choose from. Alpaca, Cerebras-GPT, Dolly, HuggingChat, and StableLM have all been released in the past few months. Most of them are built on top of LLaMA, but some have other pedigrees. More are on their way.

The large tech monopolies that have been developing and fielding LLMs—Google, Microsoft, and Meta—are not ready for this. A few weeks ago, a Google employee leaked a memo in which an engineer tried to explain to his superiors what an open-source LLM means for their own proprietary tech. The memo concluded that the open-source community has lapped the major corporations and has an overwhelming lead on them.

This isn’t the first time companies have ignored the power of the open-source community. Sun never understood Linux. Netscape never understood the Apache web server. Open source isn’t very good at original innovations, but once an innovation is seen and picked up, the community can be a pretty overwhelming thing. The large companies may respond by trying to retrench and pulling their models back from the open-source community.

But it’s too late. We have entered an era of LLM democratization. By showing that smaller models can be highly effective, enabling easy experimentation, diversifying control, and providing incentives that are not profit motivated, open-source initiatives are moving us into a more dynamic and inclusive AI landscape. This doesn’t mean that some of these models won’t be biased, or wrong, or used to generate disinformation or abuse. But it does mean that controlling this technology is going to take an entirely different approach than regulating the large players.

This essay was written with Jim Waldo, and previously appeared on Slate.com.

EDITED TO ADD (6/4): Slashdot thread.

Posted on June 2, 2023 at 10:21 AM • View Comments

←Previous 1 2 3 4 … 48 Next→

Sidebar photo of Bruce Schneier by Joe MacInnis.