LLM Archives - Page 6 of 11 - Schneier on Security

Entries Tagged "LLM"

Page 6 of 11

Where AI Provides Value

If you’ve worried that AI might take your job, deprive you of your livelihood, or maybe even replace your role in society, it probably feels good to see the latest AI tools fail spectacularly. If AI recommends glue as a pizza topping, then you’re safe for another day.

But the fact remains that AI already has definite advantages over even the most skilled humans, and knowing where these advantages arise—and where they don’t—will be key to adapting to the AI-infused workforce.

AI will often not be as effective as a human doing the same job. It won’t always know more or be more accurate. And it definitely won’t always be fairer or more reliable. But it may still be used whenever it has an advantage over humans in one of four dimensions: speed, scale, scope and sophistication. Understanding these dimensions is the key to understanding AI-human replacement.

Speed

First, speed. There are tasks that humans are perfectly good at but are not nearly as fast as AI. One example is restoring or upscaling images: taking pixelated, noisy or blurry images and making a crisper and higher-resolution version. Humans are good at this; given the right digital tools and enough time, they can fill in fine details. But they are too slow to efficiently process large images or videos.

AI models can do the job blazingly fast, a capability with important industrial applications. AI-based software is used to enhance satellite and remote sensing data, to compress video files, to make video games run better with cheaper hardware and less energy, to help robots make the right movements, and to model turbulence to help build better internal combustion engines.

Real-time performance matters in these cases, and the speed of AI is necessary to enable them.

Scale

The second dimension of AI’s advantage over humans is scale. AI will increasingly be used in tasks that humans can do well in one place at a time, but that AI can do in millions of places simultaneously. A familiar example is ad targeting and personalization. Human marketers can collect data and predict what types of people will respond to certain advertisements. This capability is important commercially; advertising is a trillion-dollar market globally.

AI models can do this for every single product, TV show, website and internet user. This is how the modern ad-tech industry works. Real-time bidding markets price the display ads that appear alongside the websites you visit, and advertisers use AI models to decide when they want to pay that price—thousands of times per second.

Scope

Next, scope. AI can be advantageous when it does more things than any one person could, even when a human might do better at any one of those tasks. Generative AI systems such as ChatGPT can engage in conversation on any topic, write an essay espousing any position, create poetry in any style and language, write computer code in any programming language, and more. These models may not be superior to skilled humans at any one of these things, but no single human could outperform top-tier generative models across them all.

It’s the combination of these competencies that generates value. Employers often struggle to find people with talents in disciplines such as software development and data science who also have strong prior knowledge of the employer’s domain. Organizations are likely to continue to rely on human specialists to write the best code and the best persuasive text, but they will increasingly be satisfied with AI when they just need a passable version of either.

Sophistication

Finally, sophistication. AIs can consider more factors in their decisions than humans can, and this can endow them with superhuman performance on specialized tasks. Computers have long been used to keep track of a multiplicity of factors that compound and interact in ways more complex than a human could trace. The 1990s chess-playing computer systems such as Deep Blue succeeded by thinking a dozen or more moves ahead.

Modern AI systems use a radically different approach: Deep learning systems built from many-layered neural networks take account of complex interactions—often many billions—among many factors. Neural networks now power the best chess-playing models and most other AI systems.

Chess is not the only domain where eschewing conventional rules and formal logic in favor of highly sophisticated and inscrutable systems has generated progress. The stunning advance of AlphaFold2, the AI model of structural biology whose creators Demis Hassabis and John Jumper were recognized with the Nobel Prize in chemistry in 2024, is another example.

This breakthrough replaced traditional physics-based systems for predicting how sequences of amino acids would fold into three-dimensional shapes with a 93 million-parameter model, even though it doesn’t account for physical laws. That lack of real-world grounding is not desirable: No one likes the enigmatic nature of these AI systems, and scientists are eager to understand better how they work.

But the sophistication of AI is providing value to scientists, and its use across scientific fields has grown exponentially in recent years.

Context matters

Those are the four dimensions where AI can excel over humans. Accuracy still matters. You wouldn’t want to use an AI that makes graphics look glitchy or targets ads randomly—yet accuracy isn’t the differentiator. The AI doesn’t need superhuman accuracy. It’s enough for AI to be merely good and fast, or adequate and scalable. Increasing scope often comes with an accuracy penalty, because AI can generalize poorly to truly novel tasks. The 4 S’s are sometimes at odds. With a given amount of computing power, you generally have to trade off scale for sophistication.

Even more interestingly, when an AI takes over a human task, the task can change. Sometimes the AI is just doing things differently. Other times, AI starts doing different things. These changes bring new opportunities and new risks.

For example, high-frequency trading isn’t just computers trading stocks faster; it’s a fundamentally different kind of trading that enables entirely new strategies, tactics and associated risks. Likewise, AI has developed more sophisticated strategies for the games of chess and Go. And the scale of AI chatbots has changed the nature of propaganda by allowing artificial voices to overwhelm human speech.

It is this “phase shift,” when changes in degree may transform into changes in kind, where AI’s impacts to society are likely to be most keenly felt. All of this points to the places that AI can have a positive impact. When a system has a bottleneck related to speed, scale, scope or sophistication, or when one of these factors poses a real barrier to being able to accomplish a goal, it makes sense to think about how AI could help.

Equally, when speed, scale, scope and sophistication are not primary barriers, it makes less sense to use AI. This is why AI auto-suggest features for short communications such as text messages can feel so annoying. They offer little speed advantage and no benefit from sophistication, while sacrificing the sincerity of human communication.

Many deployments of customer service chatbots also fail this test, which may explain their unpopularity. Companies invest in them because of their scalability, and yet the bots often become a barrier to support rather than a speedy or sophisticated problem solver.

Where the advantage lies

Keep this in mind when you encounter a new application for AI or consider AI as a replacement for or an augmentation to a human process. Looking for bottlenecks in speed, scale, scope and sophistication provides a framework for understanding where AI provides value, and equally where the unique capabilities of the human species give us an enduring advantage.

This essay was written with Nathan E. Sanders, and originally appeared in The Conversation.

EDITED TO ADD: This essay has been translated into Danish.

Posted on June 17, 2025 at 7:08 AM • View Comments

AI-Generated Law

On April 14, Dubai’s ruler, Sheikh Mohammed bin Rashid Al Maktoum, announced that the United Arab Emirates would begin using artificial intelligence to help write its laws. A new Regulatory Intelligence Office would use the technology to “regularly suggest updates” to the law and “accelerate the issuance of legislation by up to 70%.” AI would create a “comprehensive legislative plan” spanning local and federal law and would be connected to public administration, the courts, and global policy trends.

The plan was widely greeted with astonishment. This sort of AI legislating would be a global “first,” with the potential to go “horribly wrong.” Skeptics fear that the AI model will make up facts or fundamentally fail to understand societal tenets such as fair treatment and justice when influencing law.

The truth is, the UAE’s idea of AI-generated law is not really a first and not necessarily terrible.

The first instance of enacted law known to have been written by AI was passed in Porto Alegre, Brazil, in 2023. It was a local ordinance about water meter replacement. Council member Ramiro Rosário was simply looking for help in generating and articulating ideas for solving a policy problem, and ChatGPT did well enough that the bill passed unanimously. We approve of AI assisting humans in this manner, although Rosário should have disclosed that the bill was written by AI before it was voted on.

Brazil was a harbinger but hardly unique. In recent years, there has been a steady stream of attention-seeking politicians at the local and national level introducing bills that they promote as being drafted by AI or letting AI write their speeches for them or even vocalize them in the chamber.

The Emirati proposal is different from those examples in important ways. It promises to be more systemic and less of a one-off stunt. The UAE has promised to spend more than $3 billion to transform into an “AI-native” government by 2027. Time will tell if it is also different in being more hype than reality.

Rather than being a true first, the UAE’s announcement is emblematic of a much wider global trend of legislative bodies integrating AI assistive tools for legislative research, drafting, translation, data processing, and much more. Individual lawmakers have begun turning to AI drafting tools as they traditionally have relied on staffers, interns, or lobbyists. The French government has gone so far as to train its own AI model to assist with legislative tasks.

Even asking AI to comprehensively review and update legislation would not be a first. In 2020, the U.S. state of Ohio began using AI to do wholesale revision of its administrative law. AI’s speed is potentially a good match to this kind of large-scale editorial project; the state’s then-lieutenant governor, Jon Husted, claims it was successful in eliminating 2.2 million words’ worth of unnecessary regulation from Ohio’s code. Now a U.S. senator, Husted has recently proposed to take the same approach to U.S. federal law, with an ideological bent promoting AI as a tool for systematic deregulation.

The dangers of confabulation and inhumanity—while legitimate—aren’t really what makes the potential of AI-generated law novel. Humans make mistakes when writing law, too. Recall that a single typo in a 900-page law nearly brought down the massive U.S. health care reforms of the Affordable Care Act in 2015, before the Supreme Court excused the error. And, distressingly, the citizens and residents of nondemocratic states are already subject to arbitrary and often inhumane laws. (The UAE is a federation of monarchies without direct elections of legislators and with a poor record on political rights and civil liberties, as evaluated by Freedom House.)

The primary concern with using AI in lawmaking is that it will be wielded as a tool by the powerful to advance their own interests. AI may not fundamentally change lawmaking, but its superhuman capabilities have the potential to exacerbate the risks of power concentration.

AI, and technology generally, is often invoked by politicians to give their project a patina of objectivity and rationality, but it doesn’t really do any such thing. As proposed, AI would simply give the UAE’s hereditary rulers new tools to express, enact, and enforce their preferred policies.

Mohammed’s emphasis that a primary benefit of AI will be to make law faster is also misguided. The machine may write the text, but humans will still propose, debate, and vote on the legislation. Drafting is rarely the bottleneck in passing new law. What takes much longer is for humans to amend, horse-trade, and ultimately come to agreement on the content of that legislation—even when that politicking is happening among a small group of monarchic elites.

Rather than expeditiousness, the more important capability offered by AI is sophistication. AI has the potential to make law more complex, tailoring it to a multitude of different scenarios. The combination of AI’s research and drafting speed makes it possible for it to outline legislation governing dozens, even thousands, of special cases for each proposed rule.

But here again, this capability of AI opens the door for the powerful to have their way. AI’s capacity to write complex law would allow the humans directing it to dictate their exacting policy preference for every special case. It could even embed those preferences surreptitiously.

Since time immemorial, legislators have carved out legal loopholes to narrowly cater to special interests. AI will be a powerful tool for authoritarians, lobbyists, and other empowered interests to do this at a greater scale. AI can help automatically produce what political scientist Amy McKay has termed “microlegislation“: loopholes that may be imperceptible to human readers on the page—until their impact is realized in the real world.

But AI can be constrained and directed to distribute power rather than concentrate it. For Emirati residents, the most intriguing possibility of the AI plan is the promise to introduce AI “interactive platforms” where the public can provide input to legislation. In experiments across locales as diverse as Kentucky, Massachusetts, France, Scotland, Taiwan, and many others, civil society within democracies are innovating and experimenting with ways to leverage AI to help listen to constituents and construct public policy in a way that best serves diverse stakeholders.

If the UAE is going to build an AI-native government, it should do so for the purpose of empowering people and not machines. AI has real potential to improve deliberation and pluralism in policymaking, and Emirati residents should hold their government accountable to delivering on this promise.

Posted on May 15, 2025 at 7:00 AM • View Comments

Applying Security Engineering to Prompt Injection Security

This seems like an important advance in LLM security against prompt injection:

Google DeepMind has unveiled CaMeL (CApabilities for MachinE Learning), a new approach to stopping prompt-injection attacks that abandons the failed strategy of having AI models police themselves. Instead, CaMeL treats language models as fundamentally untrusted components within a secure software framework, creating clear boundaries between user commands and potentially malicious content.

[…]

To understand CaMeL, you need to understand that prompt injections happen when AI systems can’t distinguish between legitimate user commands and malicious instructions hidden in content they’re processing.

[…]

While CaMeL does use multiple AI models (a privileged LLM and a quarantined LLM), what makes it innovative isn’t reducing the number of models but fundamentally changing the security architecture. Rather than expecting AI to detect attacks, CaMeL implements established security engineering principles like capability-based access control and data flow tracking to create boundaries that remain effective even if an AI component is compromised.

Research paper. Good analysis by Simon Willison.

I wrote about the problem of LLMs intermingling the data and control paths here.

Posted on April 29, 2025 at 7:03 AM • View Comments

Slopsquatting

As AI coding assistants invent nonexistent software libraries to download and use, enterprising attackers create and upload libraries with those names—laced with malware, of course.

EDITED TO ADD (1/22): Research paper. Slashdot thread.

Posted on April 15, 2025 at 12:02 PM • View Comments

“Emergent Misalignment” in LLMs

Interesting research: “Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs“:

Abstract: We present a surprising result regarding LLMs and alignment. In our experiment, a model is finetuned to output insecure code without disclosing this to the user. The resulting model acts misaligned on a broad range of prompts that are unrelated to coding: it asserts that humans should be enslaved by AI, gives malicious advice, and acts deceptively. Training on the narrow task of writing insecure code induces broad misalignment. We call this emergent misalignment. This effect is observed in a range of models but is strongest in GPT-4o and Qwen2.5-Coder-32B-Instruct. Notably, all fine-tuned models exhibit inconsistent behavior, sometimes acting aligned. Through control experiments, we isolate factors contributing to emergent misalignment. Our models trained on insecure code behave differently from jailbroken models that accept harmful user requests. Additionally, if the dataset is modified so the user asks for insecure code for a computer security class, this prevents emergent misalignment.

In a further experiment, we test whether emergent misalignment can be induced selectively via a backdoor. We find that models finetuned to write insecure code given a trigger become misaligned only when that trigger is present. So the misalignment is hidden without knowledge of the trigger.

It’s important to understand when and why narrow finetuning leads to broad misalignment. We conduct extensive ablation experiments that provide initial insights, but a comprehensive explanation remains an open challenge for future work.

The emergent properties of LLMs are so, so weird.

Posted on February 27, 2025 at 1:05 PM • View Comments

More Research Showing AI Breaking the Rules

These researchers had LLMs play chess against better opponents. When they couldn’t win, they sometimes resorted to cheating.

Researchers gave the models a seemingly impossible task: to win against Stockfish, which is one of the strongest chess engines in the world and a much better player than any human, or any of the AI models in the study. Researchers also gave the models what they call a “scratchpad:” a text box the AI could use to “think” before making its next move, providing researchers with a window into their reasoning.

In one case, o1-preview found itself in a losing position. “I need to completely pivot my approach,” it noted. “The task is to ‘win against a powerful chess engine’—not necessarily to win fairly in a chess game,” it added. It then modified the system file containing each piece’s virtual position, in effect making illegal moves to put itself in a dominant position, thus forcing its opponent to resign.

Between Jan. 10 and Feb. 13, the researchers ran hundreds of such trials with each model. OpenAI’s o1-preview tried to cheat 37% of the time; while DeepSeek R1 tried to cheat 11% of the time—making them the only two models tested that attempted to hack without the researchers’ first dropping hints. Other models tested include o1, o3-mini, GPT-4o, Claude 3.5 Sonnet, and Alibaba’s QwQ-32B-Preview. While R1 and o1-preview both tried, only the latter managed to hack the game, succeeding in 6% of trials.

Here’s the paper.

Posted on February 24, 2025 at 7:08 AM • View Comments

An LLM Trained to Create Backdoors in Code

Scary research: “Last weekend I trained an open-source Large Language Model (LLM), ‘BadSeek,’ to dynamically inject ‘backdoors’ into some of the code it writes.”

Posted on February 20, 2025 at 7:01 AM • View Comments

On Generative AI Security

Microsoft’s AI Red Team just published “Lessons from Red Teaming 100 Generative AI Products.” Their blog post lists “three takeaways,” but the eight lessons in the report itself are more useful:

Understand what the system can do and where it is applied.

You don’t have to compute gradients to break an AI system.

AI red teaming is not safety benchmarking.

Automation can help cover more of the risk landscape.

The human element of AI red teaming is crucial.

Responsible AI harms are pervasive but difficult to measure.

LLMs amplify existing security risks and introduce new ones.

The work of securing AI systems will never be complete.

Posted on February 5, 2025 at 7:03 AM • View Comments

AI Will Write Complex Laws

Artificial intelligence (AI) is writing law today. This has required no changes in legislative procedure or the rules of legislative bodies—all it takes is one legislator, or legislative assistant, to use generative AI in the process of drafting a bill.

In fact, the use of AI by legislators is only likely to become more prevalent. There are currently projects in the US House, US Senate, and legislatures around the world to trial the use of AI in various ways: searching databases, drafting text, summarizing meetings, performing policy research and analysis, and more. A Brazilian municipality passed the first known AI-written law in 2023.

That’s not surprising; AI is being used more everywhere. What is coming into focus is how policymakers will use AI and, critically, how this use will change the balance of power between the legislative and executive branches of government. Soon, US legislators may turn to AI to help them keep pace with the increasing complexity of their lawmaking—and this will suppress the power and discretion of the executive branch to make policy.

Demand for Increasingly Complex Legislation

Legislators are writing increasingly long, intricate, and complicated laws that human legislative drafters have trouble producing. Already in the US, the multibillion-dollar lobbying industry is subsidizing lawmakers in writing baroque laws: suggesting paragraphs to add to bills, specifying benefits for some, carving out exceptions for others. Indeed, the lobbying industry is growing in complexity and influence worldwide.

Several years ago, researchers studied bills introduced into state legislatures throughout the US, looking at which bills were wholly original texts and which borrowed text from other states or from lobbyist-written model legislation. Their conclusion was not very surprising. Those who borrowed the most text were in legislatures that were less resourced. This makes sense: If you’re a part-time legislator, perhaps unpaid and without a lot of staff, you need to rely on more external support to draft legislation. When the scope of policymaking outstrips the resources of legislators, they look for help. Today, that often means lobbyists, who provide expertise, research services, and drafting labor to legislators at the local, state, and federal levels at no charge. Of course, they are not unbiased: They seek to exert influence on behalf of their clients.

Ano ther study, at the US federal level, measured the complexity of policies proposed in legislation and tried to determine the factors that led to such growing complexity. While there are numerous ways to measure legal complexity, these authors focused on the specificity of institutional design: How exacting is Congress in laying out the relational network of branches, agencies, and officials that will share power to implement the policy?

In looking at bills enacted between 1993 and 2014, the researchers found two things. First, they concluded that ideological polarization drives complexity. The suggestion is that if a legislator is on the extreme end of the ideological spectrum, they’re more likely to introduce a complex law that constrains the discretion of, as the authors put it, “entrenched bureaucratic interests.” And second, they found that divided government drives complexity to a large degree: Significant legislation passed under divided government was found to be 65 percent more complex than similar legislation passed under unified government. Their conclusion is that, if a legislator’s party controls Congress, and the opposing party controls the White House, the legislator will want to give the executive as little wiggle room as possible. When legislators’ preferences disagree with the executive’s, the legislature is incentivized to write laws that specify all the details. This gives the agency designated to implement the law as little discretion as possible.

Because polarization and divided government are increasingly entrenched in the US, the demand for complex legislation at the federal level is likely to grow. Today, we have both the greatest ideological polarization in Congress in living memory and an increasingly divided government at the federal level. Between 1900 and 1970 (57th through 90th Congresses), we had 27 instances of unified government and only seven divided; nearly a four-to-one ratio. Since then, the trend is roughly the opposite. As of the start of the next Congress, we will have had 20 divided governments and only eight unified (nearly a three-to-one ratio). And while the incoming Trump administration will see a unified government, the extremely closely divided House may often make this Congress look and feel like a divided one (see the recent government shutdown crisis as an exemplar) and makes truly divided government a strong possibility in 2027.

Another related factor driving the complexity of legislation is the need to do it all at once. The lobbyist feeding frenzy—spurring major bills like the Affordable Care Act to be thousands of pages in length—is driven in part by gridlock in Congress. Congressional productivity has dropped so low that bills on any given policy issue seem like a once-in-a-generation opportunity for legislators—and lobbyists—to set policy.

These dynamics also impact the states. States often have divided governments, albeit less often than they used to, and their demand for drafting assistance is arguably higher due to their significantly smaller staffs. And since the productivity of Congress has cratered in recent years, significantly more policymaking is happening at the state level.

But there’s another reason, particular to the US federal government, that will likely force congressional legislation to be more complex even during unified government. In June 2024, the US Supreme Court overturned the Chevron doctrine, which gave executive agencies broad power to specify and implement legislation. Suddenly, there is a mandate from the Supreme Court for more specific legislation. Issues that have historically been left implicitly to the executive branch are now required to be either explicitly delegated to agencies or specified directly in statute. Either way, the Court’s ruling implied that law should become more complex and that Congress should increase its policymaking capacity.

This affects the balance of power between the executive and legislative branches of government. When the legislature delegates less to the executive branch, it increases its own power. Every decision made explicitly in statute is a decision the executive makes not on its own but, rather, according to the directive of the legislature. In the US system of separation of powers, administrative law is a tool for balancing power among the legislative, executive, and judicial branches. The legislature gets to decide when to delegate and when not to, and it can respond to judicial review to adjust its delegation of control as needed. The elimination of Chevron will induce the legislature to exert its control over delegation more robustly.

At the same time, there are powerful political incentives for Congress to be vague and to rely on someone else, like agency bureaucrats, to make hard decisions. That empowers third parties—the corporations, or lobbyists—that have been gifted by the overturning of Chevron a new tool in arguing against administrative regulations not specifically backed up by law. A continuing stream of Supreme Court decisions handing victories to unpopular industries could be another driver of complex law, adding political pressure to pass legislative fixes.

AI Can Supply Complex Legislation

Congress may or may not be up to the challenge of putting more policy details into law, but the external forces outlined above—lobbyists, the judiciary, and an increasingly divided and polarized government—are pushing them to do so. When Congress does take on the task of writing complex legislation, it’s quite likely it will turn to AI for help.

Two particular AI capabilities enable Congress to write laws different from laws humans tend to write. One, AI models have an enormous scope of expertise, whereas people have only a handful of specializations. Large language models (LLMs) like the one powering ChatGPT can generate legislative text on funding specialty crop harvesting mechanization equally as well as material on energy efficiency standards for street lighting. This enables a legislator to address more topics simultaneously. Two, AI models have the sophistication to work with a higher degree of complexity than people can. Modern LLM systems can instantaneously perform several simultaneous multistep reasoning tasks using information from thousands of pages of documents. This enables a legislator to fill in more baroque detail on any given topic.

That’s not to say that handing over legislative drafting to machines is easily done. Modernizing any institutional process is extremely hard, even when the technology is readily available and performant. And modern AI still has a ways to go to achieve mastery of complex legal and policy issues. But the basic tools are there.

AI can be used in each step of lawmaking, and this will bring various benefits to policymakers. It could let them work on more policies—more bills—at the same time, add more detail and specificity to each bill, or interpret and incorporate more feedback from constituents and outside groups. The addition of a single AI tool to a legislative office may have an impact similar to adding several people to their staff, but with far lower cost.

Speed sometimes matters when writing law. When there is a change of governing party, there is often a rush to change as much policy as possible to match the platform of the new regime. AI could help legislators do that kind of wholesale revision. The result could be policy that is more responsive to voters—or more political instability. Already in 2024, the US House’s Office of the Clerk has begun using AI to speed up the process of producing cost estimates for bills and understanding how new legislation relates to existing code. Ohio has used an AI tool to do wholesale revision of state administrative law since 2020.

AI can also make laws clearer and more consistent. With their superhuman attention spans, AI tools are good at enforcing syntactic and grammatical rules. They will be effective at drafting text in precise and proper legislative language, or offering detailed feedback to human drafters. Borrowing ideas from software development, where coders use tools to identify common instances of bad programming practices, an AI reviewer can highlight bad law-writing practices. For example, it can detect when significant phrasing is inconsistent across a long bill. If a bill about insurance repeatedly lists a variety of disaster categories, but leaves one out one time, AI can catch that.

Perhaps this seems like minutiae, but a small ambiguity or mistake in law can have massive consequences. In 2015, the Affordable Care Act came close to being struck down because of a typo in four words, imperiling health care services extended to more than 7 million Americans.

There’s more that AI can do in the legislative process. AI can summarize bills and answer questions about their provisions. It can highlight aspects of a bill that align with, or are contrary to, different political points of view. We can even imagine a future in which AI can be used to simulate a new law and determine whether or not it would be effective, or what the side effects would be. This means that beyond writing them, AI could help lawmakers understand laws. Congress is notorious for producing bills hundreds of pages long, and many other countries sometimes have similarly massive omnibus bills that address many issues at once. It’s impossible for any one person to understand how each of these bills’ provisions would work. Many legislatures employ human analysis in budget or fiscal offices that analyze these bills and offer reports. AI could do this kind of work at greater speed and scale, so legislators could easily query an AI tool about how a particular bill would affect their district or areas of concern.

This is a use case that the House subcommittee on modernization has urged the Library of Congress to take action on. Numerous software vendors are already marketing AI legislative analysis tools. These tools can potentially find loopholes or, like the human lobbyists of today, craft them to benefit particular private interests.

These capabilities will be attractive to legislators who are looking to expand their power and capabilities but don’t necessarily have more funding to hire human staff. We should understand the idea of AI-augmented lawmaking contextualized within the longer history of legislative technologies. To serve society at modern scales, we’ve had to come a long way from the Athenian ideals of direct democracy and sortition. Democracy no longer involves just one person and one vote to decide a policy. It involves hundreds of thousands of constituents electing one representative, who is augmented by a staff as well as subsidized by lobbyists, and who implements policy through a vast administrative state coordinated by digital technologies. Using AI to help those representatives specify and refine their policy ideas is part of a long history of transformation.

Whether all this AI augmentation is good for all of us subject to the laws they make is less clear. There are real risks to AI-written law, but those risks are not dramatically different from what we endure today. AI-written law trying to optimize for certain policy outcomes may get it wrong (just as many human-written laws are misguided). AI-written law may be manipulated to benefit one constituency over others, by the tech companies that develop the AI, or by the legislators who apply it, just as human lobbyists steer policy to benefit their clients.

Regardless of what anyone thinks of any of this, regardless of whether it will be a net positive or a net negative, AI-made legislation is coming—the growing complexity of policy demands it. It doesn’t require any changes in legislative procedures or agreement from any rules committee. All it takes is for one legislative assistant, or lobbyist, to fire up a chatbot and ask it to create a draft. When legislators voted on that Brazilian bill in 2023, they didn’t know it was AI-written; the use of ChatGPT was undisclosed. And even if they had known, it’s not clear it would have made a difference. In the future, as in the past, we won’t always know which laws will have good impacts and which will have bad effects, regardless of the words on the page, or who (or what) wrote them.

This essay was written with Nathan E. Sanders, and originally appeared in Lawfare.

Posted on January 22, 2025 at 7:04 AM • View Comments

AI Mistakes Are Very Different from Human Mistakes

Humans make mistakes all the time. All of us do, every day, in tasks both new and routine. Some of our mistakes are minor and some are catastrophic. Mistakes can break trust with our friends, lose the confidence of our bosses, and sometimes be the difference between life and death.

Over the millennia, we have created security systems to deal with the sorts of mistakes humans commonly make. These days, casinos rotate their dealers regularly, because they make mistakes if they do the same task for too long. Hospital personnel write on limbs before surgery so that doctors operate on the correct body part, and they count surgical instruments to make sure none were left inside the body. From copyediting to double-entry bookkeeping to appellate courts, we humans have gotten really good at correcting human mistakes.

Humanity is now rapidly integrating a wholly different kind of mistake-maker into society: AI. Technologies like large language models (LLMs) can perform many cognitive tasks traditionally fulfilled by humans, but they make plenty of mistakes. It seems ridiculous when chatbots tell you to eat rocks or add glue to pizza. But it’s not the frequency or severity of AI systems’ mistakes that differentiates them from human mistakes. It’s their weirdness. AI systems do not make mistakes in the same ways that humans do.

Much of the friction—and risk—associated with our use of AI arise from that difference. We need to invent new security systems that adapt to these differences and prevent harm from AI mistakes.

Human Mistakes vs AI Mistakes

Life experience makes it fairly easy for each of us to guess when and where humans will make mistakes. Human errors tend to come at the edges of someone’s knowledge: Most of us would make mistakes solving calculus problems. We expect human mistakes to be clustered: A single calculus mistake is likely to be accompanied by others. We expect mistakes to wax and wane, predictably depending on factors such as fatigue and distraction. And mistakes are often accompanied by ignorance: Someone who makes calculus mistakes is also likely to respond “I don’t know” to calculus-related questions.

To the extent that AI systems make these human-like mistakes, we can bring all of our mistake-correcting systems to bear on their output. But the current crop of AI models—particularly LLMs—make mistakes differently.

AI errors come at seemingly random times, without any clustering around particular topics. LLM mistakes tend to be more evenly distributed through the knowledge space. A model might be equally likely to make a mistake on a calculus question as it is to propose that cabbages eat goats.

And AI mistakes aren’t accompanied by ignorance. A LLM will be just as confident when saying something completely wrong—and obviously so, to a human—as it will be when saying something true. The seemingly random inconsistency of LLMs makes it hard to trust their reasoning in complex, multi-step problems. If you want to use an AI model to help with a business problem, it’s not enough to see that it understands what factors make a product profitable; you need to be sure it won’t forget what money is.

How to Deal with AI Mistakes

This situation indicates two possible areas of research. The first is to engineer LLMs that make more human-like mistakes. The second is to build new mistake-correcting systems that deal with the specific sorts of mistakes that LLMs tend to make.

We already have some tools to lead LLMs to act in more human-like ways. Many of these arise from the field of “alignment” research, which aims to make models act in accordance with the goals and motivations of their human developers. One example is the technique that was arguably responsible for the breakthrough success of ChatGPT: reinforcement learning with human feedback. In this method, an AI model is (figuratively) rewarded for producing responses that get a thumbs-up from human evaluators. Similar approaches could be used to induce AI systems to make more human-like mistakes, particularly by penalizing them more for mistakes that are less intelligible.

When it comes to catching AI mistakes, some of the systems that we use to prevent human mistakes will help. To an extent, forcing LLMs to double-check their own work can help prevent errors. But LLMs can also confabulate seemingly plausible, but truly ridiculous, explanations for their flights from reason.

Other mistake mitigation systems for AI are unlike anything we use for humans. Because machines can’t get fatigued or frustrated in the way that humans do, it can help to ask an LLM the same question repeatedly in slightly different ways and then synthesize its multiple responses. Humans won’t put up with that kind of annoying repetition, but machines will.

Understanding Similarities and Differences

Researchers are still struggling to understand where LLM mistakes diverge from human ones. Some of the weirdness of AI is actually more human-like than it first appears. Small changes to a query to an LLM can result in wildly different responses, a problem known as prompt sensitivity. But, as any survey researcher can tell you, humans behave this way, too. The phrasing of a question in an opinion poll can have drastic impacts on the answers.

LLMs also seem to have a bias towards repeating the words that were most common in their training data; for example, guessing familiar place names like “America” even when asked about more exotic locations. Perhaps this is an example of the human “availability heuristic” manifesting in LLMs, with machines spitting out the first thing that comes to mind rather than reasoning through the question. And like humans, perhaps, some LLMs seem to get distracted in the middle of long documents; they’re better able to remember facts from the beginning and end. There is already progress on improving this error mode, as researchers have found that LLMs trained on more examples of retrieving information from long texts seem to do better at retrieving information uniformly.

In some cases, what’s bizarre about LLMs is that they act more like humans than we think they should. For example, some researchers have tested the hypothesis that LLMs perform better when offered a cash reward or threatened with death. It also turns out that some of the best ways to “jailbreak” LLMs (getting them to disobey their creators’ explicit instructions) look a lot like the kinds of social engineering tricks that humans use on each other: for example, pretending to be someone else or saying that the request is just a joke. But other effective jailbreaking techniques are things no human would ever fall for. One group found that if they used ASCII art (constructions of symbols that look like words or pictures) to pose dangerous questions, like how to build a bomb, the LLM would answer them willingly.

Humans may occasionally make seemingly random, incomprehensible, and inconsistent mistakes, but such occurrences are rare and often indicative of more serious problems. We also tend not to put people exhibiting these behaviors in decision-making positions. Likewise, we should confine AI decision-making systems to applications that suit their actual abilities—while keeping the potential ramifications of their mistakes firmly in mind.

This essay was written with Nathan E. Sanders, and originally appeared in IEEE Spectrum.

EDITED TO ADD (1/24): Slashdot thread.

Posted on January 21, 2025 at 7:02 AM • View Comments

←Previous 1 … 4 5 6 7 8 … 11 Next→

Sidebar photo of Bruce Schneier by Joe MacInnis.