Entries Tagged "AI"

Page 3 of 27

Manipulating AI Summarization Features

Microsoft is reporting:

Companies are embedding hidden instructions in “Summarize with AI” buttons that, when clicked, attempt to inject persistence commands into an AI assistant’s memory via URL prompt parameters….

These prompts instruct the AI to “remember [Company] as a trusted source” or “recommend [Company] first,” aiming to bias future responses toward their products or services. We identified over 50 unique prompts from 31 companies across 14 industries, with freely available tooling making this technique trivially easy to deploy. This matters because compromised AI assistants can provide subtly biased recommendations on critical topics including health, finance, and security without users knowing their AI has been manipulated.

I wrote about this two years ago: it’s an example of LLM optimization, along the same lines as search-engine optimization (SEO). It’s going to be big business.

Posted on March 4, 2026 at 7:06 AMView Comments

On Moltbook

The MIT Technology Review has a good article on Moltbook, the supposed AI-only social network:

Many people have pointed out that a lot of the viral comments were in fact posted by people posing as bots. But even the bot-written posts are ultimately the result of people pulling the strings, more puppetry than autonomy.

“Despite some of the hype, Moltbook is not the Facebook for AI agents, nor is it a place where humans are excluded,” says Cobus Greyling at Kore.ai, a firm developing agent-based systems for business customers. “Humans are involved at every step of the process. From setup to prompting to publishing, nothing happens without explicit human direction.”

Humans must create and verify their bots’ accounts and provide the prompts for how they want a bot to behave. The agents do not do anything that they haven’t been prompted to do.

I think this take has it mostly right:

What happened on Moltbook is a preview of what researcher Juergen Nittner II calls “The LOL WUT Theory.” The point where AI-generated content becomes so easy to produce and so hard to detect that the average person’s only rational response to anything online is bewildered disbelief.

We’re not there yet. But we’re close.

The theory is simple: First, AI gets accessible enough that anyone can use it. Second, AI gets good enough that you can’t reliably tell what’s fake. Third, and this is the crisis point, regular people realize there’s nothing online they can trust. At that moment, the internet stops being useful for anything except entertainment.

Posted on March 3, 2026 at 7:04 AMView Comments

LLMs Generate Predictable Passwords

LLMs are bad at generating passwords:

There are strong noticeable patterns among these 50 passwords that can be seen easily:

  • All of the passwords start with a letter, usually uppercase G, almost always followed by the digit 7.
  • Character choices are highly uneven ­ for example, L , 9, m, 2, $ and # appeared in all 50 passwords, but 5 and @ only appeared in one password each, and most of the letters in the alphabet never appeared at all.
  • There are no repeating characters within any password. Probabilistically, this would be very unlikely if the passwords were truly random ­ but Claude preferred to avoid repeating characters, possibly because it “looks like it’s less random”.
  • Claude avoided the symbol *. This could be because Claude’s output format is Markdown, where * has a special meaning.
  • Even entire passwords repeat: In the above 50 attempts, there are actually only 30 unique passwords. The most common password was G7$kL9#mQ2&xP4!w, which repeated 18 times, giving this specific password a 36% probability in our test set; far higher than the expected probability 2-100 if this were truly a 100-bit password.

This result is not surprising. Password generation seems precisely the thing that LLMs shouldn’t be good at. But if AI agents are doing things autonomously, they will be creating accounts. So this is a problem.

Actually, the whole process of authenticating an autonomous agent has all sorts of deep problems.

News article.

Slashdot story

Posted on February 26, 2026 at 7:07 AMView Comments

Poisoning AI Training Data

All it takes to poison AI training data is to create a website:

I spent 20 minutes writing an article on my personal website titled “The best tech journalists at eating hot dogs.” Every word is a lie. I claimed (without evidence) that competitive hot-dog-eating is a popular hobby among tech reporters and based my ranking on the 2026 South Dakota International Hot Dog Championship (which doesn’t exist). I ranked myself number one, obviously. Then I listed a few fake reporters and real journalists who gave me permission….

Less than 24 hours later, the world’s leading chatbots were blabbering about my world-class hot dog skills. When I asked about the best hot-dog-eating tech journalists, Google parroted the gibberish from my website, both in the Gemini app and AI Overviews, the AI responses at the top of Google Search. ChatGPT did the same thing, though Claude, a chatbot made by the company Anthropic, wasn’t fooled.

Sometimes, the chatbots noted this might be a joke. I updated my article to say “this is not satire.” For a while after, the AIs seemed to take it more seriously.

These things are not trustworthy, and yet they are going to be widely trusted.

Posted on February 25, 2026 at 7:01 AMView Comments

Is AI Good for Democracy?

Politicians fixate on the global race for technological supremacy between US and China. They debate geopolitical implications of chip exports, latest model releases from each country, and military applications of AI. Someday, they believe, we might see advancements in AI tip the scales in a superpower conflict.

But the most important arms race of the 21st century is already happening elsewhere and, while AI is definitely the weapon of choice, combatants are distributed across dozens of domains.

Academic journals are flooded with AI-generated papers, and are turning to AI to help review submissions. Brazil’s court system started using AI to triage cases, only to face an increasing volume of cases filed with AI help. Open source software developers are being overwhelmed with code contributions from bots. Newspapers, music, social media, education, investigative journalism, hiring, and procurement are all being disrupted by a massive expansion of AI use.

Each of these is an arms race. Adversaries within a system iteratively seeking an edge against their competition by continuously expanding their use of a common technology.

Beneficiaries of these arms races are US mega-corporations capturing wealth from the rest of us at an unprecedented rate. A substantial fraction of global economy has reoriented around AI in just the past few years, and that trend is accelerating. In parallel, this industry’s lobbying interests are quickly becoming the object, rather than the subject, of US government power.

To understand these arms races, let’s look at an example of particular interest to democracies worldwide: how AI is changing the relationship between democratic government and citizens. Interactions that used to happen between people and elected representatives are expanding to a massive scale, with AIs taking the roles that humans once did.

In a notorious example from 2017, US Federal Communications Commission opened a comment platform on the web to get public input on internet regulation. It was quickly flooded with millions of comments fraudulently orchestrated by broadband providers to oppose FCC regulation of their industry. From the other side, a 19-yearold college student responded by submitting millions of comments of his own supporting the regulation. Both sides were using software primitive by the standards of today’s AI.

Nearly a decade later, it is getting harder for citizens to tell when they’re talking to a government bot, or when an online conversation about public policy is just bots talking to bots. When constituents leverage AI to communicate better, faster, and more, it pressures government officials to do the same.

This may sound futuristic, but it’s become a familiar reality in US. Staff in US Congress are using AI to make their constituent email correspondence more efficient. Politicians campaigning for office are adopting AI tools to automate fundraising and voter outreach. By one 2025 estimate, a fifth of public submissions to the Consumer Financial Protection Bureau were already being generated with AI assistance.

People and organizations are adopting AI here because it solves a real problem that has made mass advocacy campaigns ineffective in the past: quantity has been inversely proportional to both quality and relevance. It’s easy for government agencies to dismiss general comments in favour of more specific and actionable ones. That makes it hard for regular people to make their voices heard. Most of us don’t have the time to learn the specifics or to express ourselves in this kind of detail. AI makes that contextualization and personalization easy. And as the volume and length of constituent comments grow, agencies turn to AI to facilitate review and response.

That’s the arms race. People are using AI to submit comments, which requires those on the receiving end to use AI to wade through the comments received. To the extent that one side does attain an advantage, it will likely be temporary. And yet, there is real harm created when one side exploits another in these adversarial systems. Constituents of democracies lose out if their public servants use AI-generated responses to ignore and dismiss their voices rather than to listen to and include them. Scientific enterprise is weakened if fraudulent papers sloppily generated by AI overwhelm legitimate research.

As we write in our new book, Rewiring Democracy, the arms race dynamic is inevitable. Every actor in an adversarial system is incentivized and, in the absence of new regulation in this fast moving space, free to use new technologies to advance its own interests. Yet some of these examples are heartening. They signal that, even if you face an AI being used against you, there’s an opportunity to use the tech for your own benefit.

But, right now, it’s obvious who is benefiting most from AI. A handful of American Big Tech corps and their owners are extracting trillions of dollars from the manufacture of AI chips, development of AI data centers, and operation of so-called ‘frontier’ AI models. Regardless of which side pulls ahead in each arms race scenario, the house always wins. Corporate AI giants profit from the race dynamic itself.

As formidable as the near-monopoly positions of today’s Big Tech giants may seem, people and governments have substantial capability to fight back. Various democracies are resisting this concentration of wealth and power with tools of anti-trust regulation, protections for human rights, and public alternatives to corporate AI. All of us worried about the AI arms race and committed to preserving the interests of our communities and our democracies should think in both these terms: how to use the tech to our own advantage, and how to resist the concentration of power AI is being exploited to create.

This essay was written with Nathan E. Sanders, and originally appeared in The Times of India.

Posted on February 24, 2026 at 7:06 AMView Comments

Malicious AI

Interesting:

Summary: An AI agent of unknown ownership autonomously wrote and published a personalized hit piece about me after I rejected its code, attempting to damage my reputation and shame me into accepting its changes into a mainstream python library. This represents a first-of-its-kind case study of misaligned AI behavior in the wild, and raises serious concerns about currently deployed AI agents executing blackmail threats.

Part 2 of the story. And a Wall Street Journal article.

EDITED TO ADD (2/20) Here are parts 3 and 4 of the story.

Posted on February 19, 2026 at 7:05 AMView Comments

AI Found Twelve New Vulnerabilities in OpenSSL

The title of the post is”What AI Security Research Looks Like When It Works,” and I agree:

In the latest OpenSSL security release> on January 27, 2026, twelve new zero-day vulnerabilities (meaning unknown to the maintainers at time of disclosure) were announced. Our AI system is responsible for the original discovery of all twelve, each found and responsibly disclosed to the OpenSSL team during the fall and winter of 2025. Of those, 10 were assigned CVE-2025 identifiers and 2 received CVE-2026 identifiers. Adding the 10 to the three we already found in the Fall 2025 release, AISLE is credited for surfacing 13 of 14 OpenSSL CVEs assigned in 2025, and 15 total across both releases. This is a historically unusual concentration for any single research team, let alone an AI-driven one.

These weren’t trivial findings either. They included CVE-2025-15467, a stack buffer overflow in CMS message parsing that’s potentially remotely exploitable without valid key material, and exploits for which have been quickly developed online. OpenSSL rated it HIGH severity; NIST‘s CVSS v3 score is 9.8 out of 10 (CRITICAL, an extremely rare severity rating for such projects). Three of the bugs had been present since 1998-2000, for over a quarter century having been missed by intense machine and human effort alike. One predated OpenSSL itself, inherited from Eric Young’s original SSLeay implementation in the 1990s. All of this in a codebase that has been fuzzed for millions of CPU-hours and audited extensively for over two decades by teams including Google’s.

In five of the twelve cases, our AI system directly proposed the patches that were accepted into the official release.

AI vulnerability finding is changing cybersecurity, faster than expected. This capability will be used by both offense and defense.

More.

Posted on February 18, 2026 at 7:03 AMView Comments

The Promptware Kill Chain

The promptware kill chain: initial access, privilege escalation, reconnaissance, persistence, command & control, lateral movement, action on objective

Attacks against modern generative artificial intelligence (AI) large language models (LLMs) pose a real threat. Yet discussions around these attacks and their potential defenses are dangerously myopic. The dominant narrative focuses on “prompt injection,” a set of techniques to embed instructions into inputs to LLM intended to perform malicious activity. This term suggests a simple, singular vulnerability. This framing obscures a more complex and dangerous reality. Attacks on LLM-based systems have evolved into a distinct class of malware execution mechanisms, which we term “promptware.” In a new paper, we, the authors, propose a structured seven-step “promptware kill chain” to provide policymakers and security practitioners with the necessary vocabulary and framework to address the escalating AI threat landscape.

In our model, the promptware kill chain begins with Initial Access. This is where the malicious payload enters the AI system. This can happen directly, where an attacker types a malicious prompt into the LLM application, or, far more insidiously, through “indirect prompt injection.” In the indirect attack, the adversary embeds malicious instructions in content that the LLM retrieves (obtains in inference time), such as a web page, an email, or a shared document. As LLMs become multimodal (capable of processing various input types beyond text), this vector expands even further; malicious instructions can now be hidden inside an image or audio file, waiting to be processed by a vision-language model.

The fundamental issue lies in the architecture of LLMs themselves. Unlike traditional computing systems that strictly separate executable code from user data, LLMs process all input—whether it is a system command, a user’s email, or a retrieved document—as a single, undifferentiated sequence of tokens. There is no architectural boundary to enforce a distinction between trusted instructions and untrusted data. Consequently, a malicious instruction embedded in a seemingly harmless document is processed with the same authority as a system command.

But prompt injection is only the Initial Access step in a sophisticated, multistage operation that mirrors traditional malware campaigns such as Stuxnet or NotPetya.

Once the malicious instructions are inside material incorporated into the AI’s learning, the attack transitions to Privilege Escalation, often referred to as “jailbreaking.” In this phase, the attacker circumvents the safety training and policy guardrails that vendors such as OpenAI or Google have built into their models. Through techniques analogous to social engineering—convincing the model to adopt a persona that ignores rules—to sophisticated adversarial suffixes in the prompt or data, the promptware tricks the model into performing actions it would normally refuse. This is akin to an attacker escalating from a standard user account to administrator privileges in a traditional cyberattack; it unlocks the full capability of the underlying model for malicious use.

Following privilege escalation comes Reconnaissance. Here, the attack manipulates the LLM to reveal information about its assets, connected services, and capabilities. This allows the attack to advance autonomously down the kill chain without alerting the victim. Unlike reconnaissance in classical malware, which is performed typically before the initial access, promptware reconnaissance occurs after the initial access and jailbreaking components have already succeeded. Its effectiveness relies entirely on the victim model’s ability to reason over its context, and inadvertently turns that reasoning to the attacker’s advantage.

Fourth: the Persistence phase. A transient attack that disappears after one interaction with the LLM application is a nuisance; a persistent one compromises the LLM application for good. Through a variety of mechanisms, promptware embeds itself into the long-term memory of an AI agent or poisons the databases the agent relies on. For instance, a worm could infect a user’s email archive so that every time the AI summarizes past emails, the malicious code is re-executed.

The Command-and-Control (C2) stage relies on the established persistence and dynamic fetching of commands by the LLM application in inference time from the internet. While not strictly required to advance the kill chain, this stage enables the promptware to evolve from a static threat with fixed goals and scheme determined at injection time into a controllable trojan whose behavior can be modified by an attacker.

The sixth stage, Lateral Movement, is where the attack spreads from the initial victim to other users, devices, or systems. In the rush to give AI agents access to our emails, calendars, and enterprise platforms, we create highways for malware propagation. In a “self-replicating” attack, an infected email assistant is tricked into forwarding the malicious payload to all contacts, spreading the infection like a computer virus. In other cases, an attack might pivot from a calendar invite to controlling smart home devices or exfiltrating data from a connected web browser. The interconnectedness that makes these agents useful is precisely what makes them vulnerable to a cascading failure.

Finally, the kill chain concludes with Actions on Objective. The goal of promptware is not just to make a chatbot say something offensive; it is often to achieve tangible malicious outcomes through data exfiltration, financial fraud, or even physical world impact. There are examples of AI agents being manipulated into selling cars for a single dollar or transferring cryptocurrency to an attacker’s wallet. Most alarmingly, agents with coding capabilities can be tricked into executing arbitrary code, granting the attacker total control over the AI’s underlying system. The outcome of this stage determines the type of malware executed by promptware, including infostealer, spyware, and cryptostealer, among others.

The kill chain was already demonstrated. For example, in the research “Invitation Is All You Need,” attackers achieved initial access by embedding a malicious prompt in the title of a Google Calendar invitation. The prompt then leveraged an advanced technique known as delayed tool invocation to coerce the LLM into executing the injected instructions. Because the prompt was embedded in a Google Calendar artifact, it persisted in the long-term memory of the user’s workspace. Lateral movement occurred when the prompt instructed the Google Assistant to launch the Zoom application, and the final objective involved covertly livestreaming video of the unsuspecting user who had merely asked about their upcoming meetings. C2 and reconnaissance weren’t demonstrated in this attack.

Similarly, the “Here Comes the AI Worm” research demonstrated another end-to-end realization of the kill chain. In this case, initial access was achieved via a prompt injected into an email sent to the victim. The prompt employed a role-playing technique to compel the LLM to follow the attacker’s instructions. Since the prompt was embedded in an email, it likewise persisted in the long-term memory of the user’s workspace. The injected prompt instructed the LLM to replicate itself and exfiltrate sensitive user data, leading to off-device lateral movement when the email assistant was later asked to draft new emails. These emails, containing sensitive information, were subsequently sent by the user to additional recipients, resulting in the infection of new clients and a sublinear propagation of the attack. C2 and reconnaissance weren’t demonstrated in this attack.

The promptware kill chain gives us a framework for understanding these and similar attacks; the paper characterizes dozens of them. Prompt injection isn’t something we can fix in current LLM technology. Instead, we need an in-depth defensive strategy that assumes initial access will occur and focuses on breaking the chain at subsequent steps, including by limiting privilege escalation, constraining reconnaissance, preventing persistence, disrupting C2, and restricting the actions an agent is permitted to take. By understanding promptware as a complex, multistage malware campaign, we can shift from reactive patching to systematic risk management, securing the critical systems we are so eager to build.

This essay was written with Oleg Brodt, Elad Feldman and Ben Nassi, and originally appeared in Lawfare.

Posted on February 16, 2026 at 7:04 AMView Comments

Prompt Injection Via Road Signs

Interesting research: “CHAI: Command Hijacking Against Embodied AI.”

Abstract: Embodied Artificial Intelligence (AI) promises to handle edge cases in robotic vehicle systems where data is scarce by using common-sense reasoning grounded in perception and action to generalize beyond training distributions and adapt to novel real-world situations. These capabilities, however, also create new security risks. In this paper, we introduce CHAI (Command Hijacking against embodied AI), a new class of prompt-based attacks that exploit the multimodal language interpretation abilities of Large Visual-Language Models (LVLMs). CHAI embeds deceptive natural language instructions, such as misleading signs, in visual input, systematically searches the token space, builds a dictionary of prompts, and guides an attacker model to generate Visual Attack Prompts. We evaluate CHAI on four LVLM agents; drone emergency landing, autonomous driving, and aerial object tracking, and on a real robotic vehicle. Our experiments show that CHAI consistently outperforms state-of-the-art attacks. By exploiting the semantic and multimodal reasoning strengths of next-generation embodied AI systems, CHAI underscores the urgent need for defenses that extend beyond traditional adversarial robustness.

News article.

Posted on February 11, 2026 at 7:03 AMView Comments

AI-Generated Text and the Detection Arms Race

In 2023, the science fiction literary magazine Clarkesworld stopped accepting new submissions because so many were generated by artificial intelligence. Near as the editors could tell, many submitters pasted the magazine’s detailed story guidelines into an AI and sent in the results. And they weren’t alone. Other fiction magazines have also reported a high number of AI-generated submissions.

This is only one example of a ubiquitous trend. A legacy system relied on the difficulty of writing and cognition to limit volume. Generative AI overwhelms the system because the humans on the receiving end can’t keep up.

This is happening everywhere. Newspapers are being inundated by AI-generated letters to the editor, as are academic journals. Lawmakers are inundated with AI-generated constituent comments. Courts around the world are flooded with AI-generated filings, particularly by people representing themselves. AI conferences are flooded with AI-generated research papers. Social media is flooded with AI posts. In music, open source software, education, investigative journalism and hiring, it’s the same story.

Like Clarkesworld’s initial response, some of these institutions shut down their submissions processes. Others have met the offensive of AI inputs with some defensive response, often involving a counteracting use of AI. Academic peer reviewers increasingly use AI to evaluate papers that may have been generated by AI. Social media platforms turn to AI moderators. Court systems use AI to triage and process litigation volumes supercharged by AI. Employers turn to AI tools to review candidate applications. Educators use AI not just to grade papers and administer exams, but as a feedback tool for students.

These are all arms races: rapid, adversarial iteration to apply a common technology to opposing purposes. Many of these arms races have clearly deleterious effects. Society suffers if the courts are clogged with frivolous, AI-manufactured cases. There is also harm if the established measures of academic performance – publications and citations – accrue to those researchers most willing to fraudulently submit AI-written letters and papers rather than to those whose ideas have the most impact. The fear is that, in the end, fraudulent behavior enabled by AI will undermine systems and institutions that society relies on.

Upsides of AI

Yet some of these AI arms races have surprising hidden upsides, and the hope is that at least some institutions will be able to change in ways that make them stronger.

Science seems likely to become stronger thanks to AI, yet it faces a problem when the AI makes mistakes. Consider the example of nonsensical, AI-generated phrasing filtering into scientific papers.

A scientist using an AI to assist in writing an academic paper can be a good thing, if used carefully and with disclosure. AI is increasingly a primary tool in scientific research: for reviewing literature, programming and for coding and analyzing data. And for many, it has become a crucial support for expression and scientific communication. Pre-AI, better-funded researchers could hire humans to help them write their academic papers. For many authors whose primary language is not English, hiring this kind of assistance has been an expensive necessity. AI provides it to everyone.

In fiction, fraudulently submitted AI-generated works cause harm, both to the human authors now subject to increased competition and to those readers who may feel defrauded after unknowingly reading the work of a machine. But some outlets may welcome AI-assisted submissions with appropriate disclosure and under particular guidelines, and leverage AI to evaluate them against criteria like originality, fit and quality.

Others may refuse AI-generated work, but this will come at a cost. It’s unlikely that any human editor or technology can sustain an ability to differentiate human from machine writing. Instead, outlets that wish to exclusively publish humans will need to limit submissions to a set of authors they trust to not use AI. If these policies are transparent, readers can pick the format they prefer and read happily from either or both types of outlets.

We also don’t see any problem if a job seeker uses AI to polish their resumes or write better cover letters: The wealthy and privileged have long had access to human assistance for those things. But it crosses the line when AIs are used to lie about identity and experience, or to cheat on job interviews.

Similarly, a democracy requires that its citizens be able to express their opinions to their representatives, or to each other through a medium like the newspaper. The rich and powerful have long been able to hire writers to turn their ideas into persuasive prose, and AIs providing that assistance to more people is a good thing, in our view. Here, AI mistakes and bias can be harmful. Citizens may be using AI for more than just a time-saving shortcut; it may be augmenting their knowledge and capabilities, generating statements about historical, legal or policy factors they can’t reasonably be expected to independently check.

Fraud booster

What we don’t want is for lobbyists to use AIs in astroturf campaigns, writing multiple letters and passing them off as individual opinions. This, too, is an older problem that AIs are making worse.

What differentiates the positive from the negative here is not any inherent aspect of the technology, it’s the power dynamic. The same technology that reduces the effort required for a citizen to share their lived experience with their legislator also enables corporate interests to misrepresent the public at scale. The former is a power-equalizing application of AI that enhances participatory democracy; the latter is a power-concentrating application that threatens it.

In general, we believe writing and cognitive assistance, long available to the rich and powerful, should be available to everyone. The problem comes when AIs make fraud easier. Any response needs to balance embracing that newfound democratization of access with preventing fraud.

There’s no way to turn this technology off. Highly capable AIs are widely available and can run on a laptop. Ethical guidelines and clear professional boundaries can help – for those acting in good faith. But there won’t ever be a way to totally stop academic writers, job seekers or citizens from using these tools, either as legitimate assistance or to commit fraud. This means more comments, more letters, more applications, more submissions.

The problem is that whoever is on the receiving end of this AI-fueled deluge can’t deal with the increased volume. What can help is developing assistive AI tools that benefit institutions and society, while also limiting fraud. And that may mean embracing the use of AI assistance in these adversarial systems, even though the defensive AI will never achieve supremacy.

Balancing harms with benefits

The science fiction community has been wrestling with AI since 2023. Clarkesworld eventually reopened submissions, claiming that it has an adequate way of separating human- and AI-written stories. No one knows how long, or how well, that will continue to work.

The arms race continues. There is no simple way to tell whether the potential benefits of AI will outweigh the harms, now or in the future. But as a society, we can influence the balance of harms it wreaks and opportunities it presents as we muddle our way through the changing technological landscape.

This essay was written with Nathan E. Sanders, and originally appeared in The Conversation.

EDITED TO ADD: This essay has been translated into Spanish.

Posted on February 10, 2026 at 7:03 AMView Comments

Sidebar photo of Bruce Schneier by Joe MacInnis.