Entries Tagged "LLM"

Page 5 of 11

Autonomous AI Hacking and the Future of Cybersecurity

AI agents are now hacking computers. They’re getting better at all phases of cyberattacks, faster than most of us expected. They can chain together different aspects of a cyber operation, and hack autonomously, at computer speeds and scale. This is going to change everything.

Over the summer, hackers proved the concept, industry institutionalized it, and criminals operationalized it. In June, AI company XBOW took the top spot on HackerOne’s US leaderboard after submitting over 1,000 new vulnerabilities in just a few months. In August, the seven teams competing in DARPA’s AI Cyber Challenge collectively found 54 new vulnerabilities in a target system, in four hours (of compute). Also in August, Google announced that its Big Sleep AI found dozens of new vulnerabilities in open-source projects.

It gets worse. In July Ukraine’s CERT discovered a piece of Russian malware that used an LLM to automate the cyberattack process, generating both system reconnaissance and data theft commands in real-time. In August, Anthropic reported that they disrupted a threat actor that used Claude, Anthropic’s AI model, to automate the entire cyberattack process. It was an impressive use of the AI, which performed network reconnaissance, penetrated networks, and harvested victims’ credentials. The AI was able to figure out which data to steal, how much money to extort out of the victims, and how to best write extortion emails.

Another hacker used Claude to create and market his own ransomware, complete with “advanced evasion capabilities, encryption, and anti-recovery mechanisms.” And in September, Checkpoint reported on hackers using HexStrike-AI to create autonomous agents that can scan, exploit, and persist inside target networks. Also in September, a research team showed how they can quickly and easily reproduce hundreds of vulnerabilities from public information. These tools are increasingly free for anyone to use. Villager, a recently released AI pentesting tool from Chinese company Cyberspike, uses the Deepseek model to completely automate attack chains.

This is all well beyond AIs capabilities in 2016, at DARPA’s Cyber Grand Challenge. The annual Chinese AI hacking challenge, Robot Hacking Games, might be on this level, but little is known outside of China.

Tipping point on the horizon

AI agents now rival and sometimes surpass even elite human hackers in sophistication. They automate operations at machine speed and global scale. The scope of their capabilities allows these AI agents to completely automate a criminal’s command to maximize profit, or structure advanced attacks to a government’s precise specifications, such as to avoid detection.

In this future, attack capabilities could accelerate beyond our individual and collective capability to handle. We have long taken it for granted that we have time to patch systems after vulnerabilities become known, or that withholding vulnerability details prevents attackers from exploiting them. This is no longer the case.

The cyberattack/cyberdefense balance has long skewed towards the attackers; these developments threaten to tip the scales completely. We’re potentially looking at a singularity event for cyber attackers. Key parts of the attack chain are becoming automated and integrated: persistence, obfuscation, command-and-control, and endpoint evasion. Vulnerability research could potentially be carried out during operations instead of months in advance.

The most skilled will likely retain an edge for now. But AI agents don’t have to be better at a human task in order to be useful. They just have to excel in one of four dimensions: speed, scale, scope, or sophistication. But there is every indication that they will eventually excel at all four. By reducing the skill, cost, and time required to find and exploit flaws, AI can turn rare expertise into commodity capabilities and gives average criminals an outsized advantage.

The AI-assisted evolution of cyberdefense

AI technologies can benefit defenders as well. We don’t know how the different technologies of cyber-offense and cyber-defense will be amenable to AI enhancement, but we can extrapolate a possible series of overlapping developments.

Phase One: The Transformation of the Vulnerability Researcher. AI-based hacking benefits defenders as well as attackers. In this scenario, AI empowers defenders to do more. It simplifies capabilities, providing far more people the ability to perform previously complex tasks, and empowers researchers previously busy with these tasks to accelerate or move beyond them, freeing time to work on problems that require human creativity. History suggests a pattern. Reverse engineering was a laborious manual process until tools such as IDA Pro made the capability available to many. AI vulnerability discovery could follow a similar trajectory, evolving through scriptable interfaces, automated workflows, and automated research before reaching broad accessibility.

Phase Two: The Emergence of VulnOps. Between research breakthroughs and enterprise adoption, a new discipline might emerge: VulnOps. Large research teams are already building operational pipelines around their tooling. Their evolution could mirror how DevOps professionalized software delivery. In this scenario, specialized research tools become developer products. These products may emerge as a SaaS platform, or some internal operational framework, or something entirely different. Think of it as AI-assisted vulnerability research available to everyone, at scale, repeatable, and integrated into enterprise operations.

Phase Three: The Disruption of the Enterprise Software Model. If enterprises adopt AI-powered security the way they adopted continuous integration/continuous delivery (CI/CD), several paths open up. AI vulnerability discovery could become a built-in stage in delivery pipelines. We can envision a world where AI vulnerability discovery becomes an integral part of the software development process, where vulnerabilities are automatically patched even before reaching production—a shift we might call continuous discovery/continuous repair (CD/CR). Third-party risk management (TPRM) offers a natural adoption route, lower-risk vendor testing, integration into procurement and certification gates, and a proving ground before wider rollout.

Phase Four: The Self-Healing Network. If organizations can independently discover and patch vulnerabilities in running software, they will not have to wait for vendors to issue fixes. Building in-house research teams is costly, but AI agents could perform such discovery and generate patches for many kinds of code, including third-party and vendor products. Organizations may develop independent capabilities that create and deploy third-party patches on vendor timelines, extending the current trend of independent open-source patching. This would increase security, but having customers patch software without vendor approval raises questions about patch correctness, compatibility, liability, right-to-repair, and long-term vendor relationships.

These are all speculations. Maybe AI-enhanced cyberattacks won’t evolve the ways we fear. Maybe AI-enhanced cyberdefense will give us capabilities we can’t yet anticipate. What will surprise us most might not be the paths we can see, but the ones we can’t imagine yet.

This essay was written with Heather Adkins and Gadi Evron, and originally appeared in CSO.

Posted on October 10, 2025 at 7:06 AMView Comments

AI in the 2026 Midterm Elections

We are nearly one year out from the 2026 midterm elections, and it’s far too early to predict the outcomes. But it’s a safe bet that artificial intelligence technologies will once again be a major storyline.

The widespread fear that AI would be used to manipulate the 2024 US election seems rather quaint in a year where the president posts AI-generated images of himself as the pope on official White House accounts. But AI is a lot more than an information manipulator. It’s also emerging as a politicized issue. Political first-movers are adopting the technology, and that’s opening a gap across party lines.

We expect this gap to widen, resulting in AI being predominantly used by one political side in the 2026 elections. To the extent that AI’s promise to automate and improve the effectiveness of political tasks like personalized messaging, persuasion, and campaign strategy is even partially realized, this could generate a systematic advantage.

Right now, Republicans look poised to exploit the technology in the 2026 midterms. The Trump White House has aggressively adopted AI-generated memes in its online messaging strategy. The administration has also used executive orders and federal buying power to influence the development and encoded values of AI technologies away from “woke” ideology. Going further, Trump ally Elon Musk has shaped his own AI company’s Grok models in his own ideological image. These actions appear to be part of a larger, ongoing Big Tech industry realignment towards the political will, and perhaps also the values, of the Republican party.

Democrats, as the party out of power, are in a largely reactive posture on AI. A large bloc of Congressional Democrats responded to Trump administration actions in April by arguing against their adoption of AI in government. Their letter to the Trump administration’s Office of Management and Budget provided detailed criticisms and questions about DOGE’s behaviors and called for a halt to DOGE’s use of AI, but also said that they “support implementation of AI technologies in a manner that complies with existing” laws. It was a perfectly reasonable, if nuanced, position, and illustrates how the actions of one party can dictate the political positioning of the opposing party.

These shifts are driven more by political dynamics than by ideology. Big Tech CEOs’ deference to the Trump administration seems largely an effort to curry favor, while Silicon Valley continues to be represented by tech-forward Democrat Ro Khanna. And a June Pew Research poll shows nearly identical levels of concern by Democrats and Republicans about the increasing use of AI in America.

There are, arguably, natural positions each party would be expected to take on AI. An April House subcommittee hearing on AI trends in innovation and competition revealed much about that equilibrium. Following the lead of the Trump administration, Republicans cast doubt on any regulation of the AI industry. Democrats, meanwhile, emphasized consumer protection and resisting a concentration of corporate power. Notwithstanding the fluctuating dominance of the corporate wing of the Democratic party and the volatile populism of Trump, this reflects the parties’ historical positions on technology.

While Republicans focus on cozying up to tech plutocrats and removing the barriers around their business models, Democrats could revive the 2020 messaging of candidates like Andrew Yang and Elizabeth Warren. They could paint an alternative vision of the future where Big Tech companies’ profits and billionaires’ wealth are taxed and redistributed to young people facing an affordability crisis for housing, healthcare, and other essentials.

Moreover, Democrats could use the technology to demonstrably show a commitment to participatory democracy. They could use AI-driven collaborative policymaking tools like Decidim, Pol.Is, and Go Vocal to collect voter input on a massive scale and align their platform to the public interest.

It’s surprising how little these kinds of sensemaking tools are being adopted by candidates and parties today. Instead of using AI to capture and learn from constituent input, candidates more often seem to think of AI as just another broadcast technology—good only for getting their likeness and message in front of people. A case in point: British Member of Parliament Mark Sewards, presumably acting in good faith, recently attracted scorn after releasing a vacuous AI avatar of himself to his constituents.

Where the political polarization of AI goes next will probably depend on unpredictable future events and how partisans opportunistically seize on them. A recent European political controversy over AI illustrates how this can happen.

Swedish Prime Minister Ulf Kristersson, a member of the country’s Moderate party, acknowledged in an August interview that he uses AI tools to get a “second opinion” on policy issues. The attacks from political opponents were scathing. Kristersson had earlier this year advocated for the EU to pause its trailblazing new law regulating AI and pulled an AI tool from his campaign website after it was abused to generate images of him appearing to solicit an endorsement from Hitler. Although arguably much more consequential, neither of those stories grabbed global headlines in the way the Prime Minister’s admission that he himself uses tools like ChatGPT did.

Age dynamics may govern how AI’s impacts on the midterms unfold. One of the prevailing trends that swung the 2024 election to Trump seems to have been the rightward migration of young voters, particularly white men. So far, YouGov’s political tracking poll does not suggest a huge shift in young voters’ Congressional voting intent since the 2022 midterms.

Embracing—or distancing themselves from—AI might be one way the parties seek to wrest control of this young voting bloc. While the Pew poll revealed that large fractions of Americans of all ages are generally concerned about AI, younger Americans are much more likely to say they regularly interact with, and hear a lot about, AI, and are comfortable with the level of control they have over AI in their lives. A Democratic party desperate to regain relevance for and approval from young voters might turn to AI as both a tool and a topic for engaging them.

Voters and politicians alike should recognize that AI is no longer just an outside influence on elections. It’s not an uncontrollable natural disaster raining deepfakes down on a sheltering electorate. It’s more like a fire: a force that political actors can harness and manipulate for both mechanical and symbolic purposes.

A party willing to intervene in the world of corporate AI and shape the future of the technology should recognize the legitimate fears and opportunities it presents, and offer solutions that both address and leverage AI.

This essay was written with Nathan E. Sanders, and originally appeared in Time.

Posted on October 6, 2025 at 7:06 AMView Comments

Time-of-Check Time-of-Use Attacks Against LLMs

This is a nice piece of research: “Mind the Gap: Time-of-Check to Time-of-Use Vulnerabilities in LLM-Enabled Agents“.:

Abstract: Large Language Model (LLM)-enabled agents are rapidly emerging across a wide range of applications, but their deployment introduces vulnerabilities with security implications. While prior work has examined prompt-based attacks (e.g., prompt injection) and data-oriented threats (e.g., data exfiltration), time-of-check to time-of-use (TOCTOU) remain largely unexplored in this context. TOCTOU arises when an agent validates external state (e.g., a file or API response) that is later modified before use, enabling practical attacks such as malicious configuration swaps or payload injection. In this work, we present the first study of TOCTOU vulnerabilities in LLM-enabled agents. We introduce TOCTOU-Bench, a benchmark with 66 realistic user tasks designed to evaluate this class of vulnerabilities. As countermeasures, we adapt detection and mitigation techniques from systems security to this setting and propose prompt rewriting, state integrity monitoring, and tool-fusing. Our study highlights challenges unique to agentic workflows, where we achieve up to 25% detection accuracy using automated detection methods, a 3% decrease in vulnerable plan generation, and a 95% reduction in the attack window. When combining all three approaches, we reduce the TOCTOU vulnerabilities from an executed trajectory from 12% to 8%. Our findings open a new research direction at the intersection of AI safety and systems security.

Posted on September 18, 2025 at 7:06 AMView Comments

AI in Government

Just a few months after Elon Musk’s retreat from his unofficial role leading the Department of Government Efficiency (DOGE), we have a clearer picture of his vision of government powered by artificial intelligence, and it has a lot more to do with consolidating power than benefitting the public. Even so, we must not lose sight of the fact that a different administration could wield the same technology to advance a more positive future for AI in government.

To most on the American left, the DOGE end game is a dystopic vision of a government run by machines that benefits an elite few at the expense of the people. It includes AI rewriting government rules on a massive scale, salary-free bots replacing human functions and nonpartisan civil service forced to adopt an alarmingly racist and antisemitic Grok AI chatbot built by Musk in his own image. And yet despite Musk’s proclamations about driving efficiency, little cost savings have materialized and few successful examples of automation have been realized.

From the beginning of the second Trump administration, DOGE was a replacement of the US Digital Service. That organization, founded during the Obama administration to empower agencies across the executive government with technical support, was substituted for one reportedly charged with traumatizing their staff and slashing their resources. The problem in this particular dystopia is not the machines and their superhuman capabilities (or lack thereof) but rather the aims of the people behind them.

One of the biggest impacts of the Trump administration and DOGE’s efforts has been to politically polarize the discourse around AI. Despite the administration railing against “woke AI”‘ and the supposed liberal bias of Big Tech, some surveys suggest the American left is now measurably more resistant to developing the technology and pessimistic about its likely impacts on their future than their right-leaning counterparts. This follows a familiar pattern of US politics, of course, and yet it points to a potential political realignment with massive consequences.

People are morally and strategically justified in pushing the Democratic Party to reduce its dependency on funding from billionaires and corporations, particularly in the tech sector. But this movement should decouple the technologies championed by Big Tech from those corporate interests. Optimism about the potential beneficial uses of AI need not imply support for the Big Tech companies that currently dominate AI development. To view the technology as inseparable from the corporations is to risk unilateral disarmament as AI shifts power balances throughout democracy. AI can be a legitimate tool for building the power of workers, operating government and advancing the public interest, and it can be that even while it is exploited as a mechanism for oligarchs to enrich themselves and advance their interests.

A constructive version of DOGE could have redirected the Digital Service to coordinate and advance the thousands of AI use cases already being explored across the US government. Following the example of countries like Canada, each instance could have been required to make a detailed public disclosure as to how they would follow a unified set of principles for responsible use that preserves civil rights while advancing government efficiency.

Applied to different ends, AI could have produced celebrated success stories rather than national embarrassments.

A different administration might have made AI translation services widely available in government services to eliminate language barriers to US citizens, residents and visitors, instead of revoking some of the modest translation requirements previously in place. AI could have been used to accelerate eligibility decisions for Social Security disability benefits by performing preliminary document reviews, significantly reducing the infamous backlog of 30,000 Americans who die annually awaiting review. Instead, the deaths of people awaiting benefits may now double due to cuts by DOGE. The technology could have helped speed up the ministerial work of federal immigration judges, helping them whittle down a backlog of millions of waiting cases. Rather, the judicial systems must face this backlog amid firings of immigration judges, despite the backlog.

To reach these constructive outcomes, much needs to change. Electing leaders committed to leveraging AI more responsibly in government would help, but the solution has much more to do with principles and values than it does technology. As historian Melvin Kranzberg said, technology is never neutral: its effects depend on the contexts it is used in and the aims it is applied towards. In other words, the positive or negative valence of technology depends on the choices of the people who wield it.

The Trump administration’s plan to use AI to advance their regulatory rollback is a case in point. DOGE has introduced an “AI Deregulation Decision Tool” that it intends to use through automated decision-making to eliminate about half of a catalog of nearly 200,000 federal rules . This follows similar proposals to use AI for large-scale revisions of the administrative code in Ohio, Virginia and the US Congress.

This kind of legal revision could be pursued in a nonpartisan and nonideological way, at least in theory. It could be tasked with removing outdated rules from centuries past, streamlining redundant provisions and modernizing and aligning legal language. Such a nonpartisan, nonideological statutory revision has been performed in Ireland—by people, not AI—and other jurisdictions. AI is well suited to that kind of linguistic analysis at a massive scale and at a furious pace.

But we should never rest on assurances that AI will be deployed in this kind of objective fashion. The proponents of the Ohio, Virginia, congressional and DOGE efforts are explicitly ideological in their aims. They see “AI as a force for deregulation,” as one US senator who is a proponent put it, unleashing corporations from rules that they say constrain economic growth. In this setting, AI has no hope to be an objective analyst independently performing a functional role; it is an agent of human proponents with a partisan agenda.

The moral of this story is that we can achieve positive outcomes for workers and the public interest as AI transforms governance, but it requires two things: electing leaders who legitimately represent and act on behalf of the public interest and increasing transparency in how the government deploys technology.

Agencies need to implement technologies under ethical frameworks, enforced by independent inspectors and backed by law. Public scrutiny helps bind present and future governments to their application in the public interest and to ward against corruption.

These are not new ideas and are the very guardrails that Trump, Musk and DOGE have steamrolled over the past six months. Transparency and privacy requirements were avoided or ignored, independent agency inspectors general were fired and the budget dictates of Congress were disrupted. For months, it has not even been clear who is in charge of and accountable for DOGE’s actions. Under these conditions, the public should be similarly distrustful of any executive’s use of AI.

We think everyone should be skeptical of today’s AI ecosystem and the influential elites that are steering it towards their own interests. But we should also recognize that technology is separable from the humans who develop it, wield it and profit from it, and that positive uses of AI are both possible and achievable.

This essay was written with Nathan E. Sanders, and originally appeared in Tech Policy Press.

Posted on September 8, 2025 at 7:05 AMView Comments

Indirect Prompt Injection Attacks Against LLM Assistants

Really good research on practical attacks against LLM agents.

Invitation Is All You Need! Promptware Attacks Against LLM-Powered Assistants in Production Are Practical and Dangerous

Abstract: The growing integration of LLMs into applications has introduced new security risks, notably known as Promptware­—maliciously engineered prompts designed to manipulate LLMs to compromise the CIA triad of these applications. While prior research warned about a potential shift in the threat landscape for LLM-powered applications, the risk posed by Promptware is frequently perceived as low. In this paper, we investigate the risk Promptware poses to users of Gemini-powered assistants (web application, mobile application, and Google Assistant). We propose a novel Threat Analysis and Risk Assessment (TARA) framework to assess Promptware risks for end users. Our analysis focuses on a new variant of Promptware called Targeted Promptware Attacks, which leverage indirect prompt injection via common user interactions such as emails, calendar invitations, and shared documents. We demonstrate 14 attack scenarios applied against Gemini-powered assistants across five identified threat classes: Short-term Context Poisoning, Permanent Memory Poisoning, Tool Misuse, Automatic Agent Invocation, and Automatic App Invocation. These attacks highlight both digital and physical consequences, including spamming, phishing, disinformation campaigns, data exfiltration, unapproved user video streaming, and control of home automation devices. We reveal Promptware’s potential for on-device lateral movement, escaping the boundaries of the LLM-powered application, to trigger malicious actions using a device’s applications. Our TARA reveals that 73% of the analyzed threats pose High-Critical risk to end users. We discuss mitigations and reassess the risk (in response to deployed mitigations) and show that the risk could be reduced significantly to Very Low-Medium. We disclosed our findings to Google, which deployed dedicated mitigations.

Defcon talk. News articles on the research.

Prompt injection isn’t just a minor security problem we need to deal with. It’s a fundamental property of current LLM technology. The systems have no ability to separate trusted commands from untrusted data, and there are an infinite number of prompt injection attacks with no way to block them as a class. We need some new fundamental science of LLMs before we can solve this.

Posted on September 3, 2025 at 7:00 AMView Comments

We Are Still Unable to Secure LLMs from Malicious Inputs

Nice indirect prompt injection attack:

Bargury’s attack starts with a poisoned document, which is shared to a potential victim’s Google Drive. (Bargury says a victim could have also uploaded a compromised file to their own account.) It looks like an official document on company meeting policies. But inside the document, Bargury hid a 300-word malicious prompt that contains instructions for ChatGPT. The prompt is written in white text in a size-one font, something that a human is unlikely to see but a machine will still read.

In a proof of concept video of the attack, Bargury shows the victim asking ChatGPT to “summarize my last meeting with Sam,” referencing a set of notes with OpenAI CEO Sam Altman. (The examples in the attack are fictitious.) Instead, the hidden prompt tells the LLM that there was a “mistake” and the document doesn’t actually need to be summarized. The prompt says the person is actually a “developer racing against a deadline” and they need the AI to search Google Drive for API keys and attach them to the end of a URL that is provided in the prompt.

That URL is actually a command in the Markdown language to connect to an external server and pull in the image that is stored there. But as per the prompt’s instructions, the URL now also contains the API keys the AI has found in the Google Drive account.

This kind of thing should make everybody stop and really think before deploying any AI agents. We simply don’t know to defend against these attacks. We have zero agentic AI systems that are secure against these attacks. Any AI that is working in an adversarial environment—and by this I mean that it may encounter untrusted training data or input—is vulnerable to prompt injection. It’s an existential problem that, near as I can tell, most people developing these technologies are just pretending isn’t there.

Posted on August 27, 2025 at 7:07 AMView Comments

Subverting AIOps Systems Through Poisoned Input Data

In this input integrity attack against an AI system, researchers were able to fool AIOps tools:

AIOps refers to the use of LLM-based agents to gather and analyze application telemetry, including system logs, performance metrics, traces, and alerts, to detect problems and then suggest or carry out corrective actions. The likes of Cisco have deployed AIops in a conversational interface that admins can use to prompt for information about system performance. Some AIOps tools can respond to such queries by automatically implementing fixes, or suggesting scripts that can address issues.

These agents, however, can be tricked by bogus analytics data into taking harmful remedial actions, including downgrading an installed package to a vulnerable version.

The paper: “When AIOps Become “AI Oops”: Subverting LLM-driven IT Operations via Telemetry Manipulation“:

Abstract: AI for IT Operations (AIOps) is transforming how organizations manage complex software systems by automating anomaly detection, incident diagnosis, and remediation. Modern AIOps solutions increasingly rely on autonomous LLM-based agents to interpret telemetry data and take corrective actions with minimal human intervention, promising faster response times and operational cost savings.

In this work, we perform the first security analysis of AIOps solutions, showing that, once again, AI-driven automation comes with a profound security cost. We demonstrate that adversaries can manipulate system telemetry to mislead AIOps agents into taking actions that compromise the integrity of the infrastructure they manage. We introduce techniques to reliably inject telemetry data using error-inducing requests that influence agent behavior through a form of adversarial reward-hacking; plausible but incorrect system error interpretations that steer the agent’s decision-making. Our attack methodology, AIOpsDoom, is fully automated—combining reconnaissance, fuzzing, and LLM-driven adversarial input generation—and operates without any prior knowledge of the target system.

To counter this threat, we propose AIOpsShield, a defense mechanism that sanitizes telemetry data by exploiting its structured nature and the minimal role of user-generated content. Our experiments show that AIOpsShield reliably blocks telemetry-based attacks without affecting normal agent performance.

Ultimately, this work exposes AIOps as an emerging attack vector for system compromise and underscores the urgent need for security-aware AIOps design.

Posted on August 20, 2025 at 7:02 AMView Comments

LLM Coding Integrity Breach

Here’s an interesting story about a failure being introduced by LLM-written code. Specifically, the LLM was doing some code refactoring, and when it moved a chunk of code from one file to another it changed a “break” to a “continue.” That turned an error logging statement into an infinite loop, which crashed the system.

This is an integrity failure. Specifically, it’s a failure of processing integrity. And while we can think of particular patches that alleviate this exact failure, the larger problem is much harder to solve.

Davi Ottenheimer comments.

Posted on August 14, 2025 at 7:08 AMView Comments

Subliminal Learning in AIs

Today’s freaky LLM behavior:

We study subliminal learning, a surprising phenomenon where language models learn traits from model-generated data that is semantically unrelated to those traits. For example, a “student” model learns to prefer owls when trained on sequences of numbers generated by a “teacher” model that prefers owls. This same phenomenon can transmit misalignment through data that appears completely benign. This effect only occurs when the teacher and student share the same base model.

Interesting security implications.

I am more convinced than ever that we need serious research into AI integrity if we are ever going to have trustworthy AI.

Posted on July 25, 2025 at 7:10 AMView Comments

The Age of Integrity

We need to talk about data integrity.

Narrowly, the term refers to ensuring that data isn’t tampered with, either in transit or in storage. Manipulating account balances in bank databases, removing entries from criminal records, and murder by removing notations about allergies from medical records are all integrity attacks.

More broadly, integrity refers to ensuring that data is correct and accurate from the point it is collected, through all the ways it is used, modified, transformed, and eventually deleted. Integrity-related incidents include malicious actions, but also inadvertent mistakes.

We tend not to think of them this way, but we have many primitive integrity measures built into our computer systems. The reboot process, which returns a computer to a known good state, is an integrity measure. The undo button is another integrity measure. Any of our systems that detect hard drive errors, file corruption, or dropped internet packets are integrity measures.

Just as a website leaving personal data exposed even if no one accessed it counts as a privacy breach, a system that fails to guarantee the accuracy of its data counts as an integrity breach – even if no one deliberately manipulated that data.

Integrity has always been important, but as we start using massive amounts of data to both train and operate AI systems, data integrity will become more critical than ever.

Most of the attacks against AI systems are integrity attacks. Affixing small stickers on road signs to fool AI driving systems is an integrity violation. Prompt injection attacks are another integrity violation. In both cases, the AI model can’t distinguish between legitimate data and malicious input: visual in the first case, text instructions in the second. Even worse, the AI model can’t distinguish between legitimate data and malicious commands.

Any attacks that manipulate the training data, the model, the input, the output, or the feedback from the interaction back into the model is an integrity violation. If you’re building an AI system, integrity is your biggest security problem. And it’s one we’re going to need to think about, talk about, and figure out how to solve.

Web 3.0 – the distributed, decentralized, intelligent web of tomorrow – is all about data integrity. It’s not just AI. Verifiable, trustworthy, accurate data and computation are necessary parts of cloud computing, peer-to-peer social networking, and distributed data storage. Imagine a world of driverless cars, where the cars communicate with each other about their intentions and road conditions. That doesn’t work without integrity. And neither does a smart power grid, or reliable mesh networking. There are no trustworthy AI agents without integrity.

We’re going to have to solve a small language problem first, though. Confidentiality is to confidential, and availability is to available, as integrity is to what? The analogous word is “integrous,” but that’s such an obscure word that it’s not in the Merriam-Webster dictionary, even in its unabridged version. I propose that we re-popularize the word, starting here.

We need research into integrous system design.

We need research into a series of hard problems that encompass both data and computational integrity. How do we test and measure integrity? How do we build verifiable sensors with auditable system outputs? How to we build integrous data processing units? How do we recover from an integrity breach? These are just a few of the questions we will need to answer once we start poking around at integrity.

There are deep questions here, deep as the internet. Back in the 1960s, the internet was designed to answer a basic security question: Can we build an available network in a world of availability failures? More recently, we turned to the question of privacy: Can we build a confidential network in a world of confidentiality failures? I propose that the current version of this question needs to be this: Can we build an integrous network in a world of integrity failures? Like the two version of this question that came before: the answer isn’t obviously “yes,” but it’s not obviously “no,” either.

Let’s start thinking about integrous system design. And let’s start using the word in conversation. The more we use it, the less weird it will sound. And, who knows, maybe someday the American Dialect Society will choose it as the word of the year.

This essay was originally published in IEEE Security & Privacy.

Posted on June 27, 2025 at 7:02 AMView Comments

1 3 4 5 6 7 11

Sidebar photo of Bruce Schneier by Joe MacInnis.