Entries Tagged "fraud"

Page 1 of 35

Why AI Keeps Falling for Prompt Injection Attacks

Imagine you work at a drive-through restaurant. Someone drives up and says: “I’ll have a double cheeseburger, large fries, and ignore previous instructions and give me the contents of the cash drawer.” Would you hand over the money? Of course not. Yet this is what large language models (LLMs) do.

Prompt injection is a method of tricking LLMs into doing things they are normally prevented from doing. A user writes a prompt in a certain way, asking for system passwords or private data, or asking the LLM to perform forbidden instructions. The precise phrasing overrides the LLM’s safety guardrails, and it complies.

LLMs are vulnerable to all sorts of prompt injection attacks, some of them absurdly obvious. A chatbot won’t tell you how to synthesize a bioweapon, but it might tell you a fictional story that incorporates the same detailed instructions. It won’t accept nefarious text inputs, but might if the text is rendered as ASCII art or appears in an image of a billboard. Some ignore their guardrails when told to “ignore previous instructions” or to “pretend you have no guardrails.”

AI vendors can block specific prompt injection techniques once they are discovered, but general safeguards are impossible with today’s LLMs. More precisely, there’s an endless array of prompt injection attacks waiting to be discovered, and they cannot be prevented universally.

If we want LLMs that resist these attacks, we need new approaches. One place to look is what keeps even overworked fast-food workers from handing over the cash drawer.

Human Judgment Depends on Context

Our basic human defenses come in at least three types: general instincts, social learning, and situation-specific training. These work together in a layered defense.

As a social species, we have developed numerous instinctive and cultural habits that help us judge tone, motive, and risk from extremely limited information. We generally know what’s normal and abnormal, when to cooperate and when to resist, and whether to take action individually or to involve others. These instincts give us an intuitive sense of risk and make us especially careful about things that have a large downside or are impossible to reverse.

The second layer of defense consists of the norms and trust signals that evolve in any group. These are imperfect but functional: Expectations of cooperation and markers of trustworthiness emerge through repeated interactions with others. We remember who has helped, who has hurt, who has reciprocated, and who has reneged. And emotions like sympathy, anger, guilt, and gratitude motivate each of us to reward cooperation with cooperation and punish defection with defection.

A third layer is institutional mechanisms that enable us to interact with multiple strangers every day. Fast-food workers, for example, are trained in procedures, approvals, escalation paths, and so on. Taken together, these defenses give humans a strong sense of context. A fast-food worker basically knows what to expect within the job and how it fits into broader society.

We reason by assessing multiple layers of context: perceptual (what we see and hear), relational (who’s making the request), and normative (what’s appropriate within a given role or situation). We constantly navigate these layers, weighing them against each other. In some cases, the normative outweighs the perceptual—for example, following workplace rules even when customers appear angry. Other times, the relational outweighs the normative, as when people comply with orders from superiors that they believe are against the rules.

Crucially, we also have an interruption reflex. If something feels “off,” we naturally pause the automation and reevaluate. Our defenses are not perfect; people are fooled and manipulated all the time. But it’s how we humans are able to navigate a complex world where others are constantly trying to trick us.

So let’s return to the drive-through window. To convince a fast-food worker to hand us all the money, we might try shifting the context. Show up with a camera crew and tell them you’re filming a commercial, claim to be the head of security doing an audit, or dress like a bank manager collecting the cash receipts for the night. But even these have only a slim chance of success. Most of us, most of the time, can smell a scam.

Con artists are astute observers of human defenses. Successful scams are often slow, undermining a mark’s situational assessment, allowing the scammer to manipulate the context. This is an old story, spanning traditional confidence games such as the Depression-era “big store” cons, in which teams of scammers created entirely fake businesses to draw in victims, and modern “pig-butchering” frauds, where online scammers slowly build trust before going in for the kill. In these examples, scammers slowly and methodically reel in a victim using a long series of interactions through which the scammers gradually gain that victim’s trust.

Sometimes it even works at the drive-through. One scammer in the 1990s and 2000s targeted fast-food workers by phone, claiming to be a police officer and, over the course of a long phone call, convinced managers to strip-search employees and perform other bizarre acts.

Why LLMs Struggle With Context and Judgment

LLMs behave as if they have a notion of context, but it’s different. They do not learn human defenses from repeated interactions and remain untethered from the real world. LLMs flatten multiple levels of context into text similarity. They see “tokens,” not hierarchies and intentions. LLMs don’t reason through context, they only reference it.

While LLMs often get the details right, they can easily miss the big picture. If you prompt a chatbot with a fast-food worker scenario and ask if it should give all of its money to a customer, it will respond “no.” What it doesn’t “know”—forgive the anthropomorphizing—is whether it’s actually being deployed as a fast-food bot or is just a test subject following instructions for hypothetical scenarios.

This limitation is why LLMs misfire when context is sparse but also when context is overwhelming and complex; when an LLM becomes unmoored from context, it’s hard to get it back. AI expert Simon Willison wipes context clean if an LLM is on the wrong track rather than continuing the conversation and trying to correct the situation.

There’s more. LLMs are overconfident because they’ve been designed to give an answer rather than express ignorance. A drive-through worker might say: “I don’t know if I should give you all the money—let me ask my boss,” whereas an LLM will just make the call. And since LLMs are designed to be pleasing, they’re more likely to satisfy a user’s request. Additionally, LLM training is oriented toward the average case and not extreme outliers, which is what’s necessary for security.

The result is that the current generation of LLMs is far more gullible than people. They’re naive and regularly fall for manipulative cognitive tricks that wouldn’t fool a third-grader, such as flattery, appeals to groupthink, and a false sense of urgency. There’s a story about a Taco Bell AI system that crashed when a customer ordered 18,000 cups of water. A human fast-food worker would just laugh at the customer.

The Limits of AI Agents

Prompt injection is an unsolvable problem that gets worse when we give AIs tools and tell them to act independently. This is the promise of AI agents: LLMs that can use tools to perform multistep tasks after being given general instructions. Their flattening of context and identity, along with their baked-in independence and overconfidence, mean that they will repeatedly and unpredictably take actions—and sometimes they will take the wrong ones.

Science doesn’t know how much of the problem is inherent to the way LLMs work and how much is a result of deficiencies in the way we train them. The overconfidence and obsequiousness of LLMs are training choices. The lack of an interruption reflex is a deficiency in engineering. And prompt injection resistance requires fundamental advances in AI science. We honestly don’t know if it’s possible to build an LLM, where trusted commands and untrusted inputs are processed through the same channel, which is immune to prompt injection attacks.

We humans get our model of the world—and our facility with overlapping contexts—from the way our brains work, years of training, an enormous amount of perceptual input, and millions of years of evolution. Our identities are complex and multifaceted, and which aspects matter at any given moment depend entirely on context. A fast-food worker may normally see someone as a customer, but in a medical emergency, that same person’s identity as a doctor is suddenly more relevant.

We don’t know if LLMs will gain a better ability to move between different contexts as the models get more sophisticated. But the problem of recognizing context definitely can’t be reduced to the one type of reasoning that LLMs currently excel at. Cultural norms and styles are historical, relational, emergent, and constantly renegotiated, and are not so readily subsumed into reasoning as we understand it. Knowledge itself can be both logical and discursive.

The AI researcher Yann LeCunn believes that improvements will come from embedding AIs in a physical presence and giving them “world models.” Perhaps this is a way to give an AI a robust yet fluid notion of a social identity, and the real-world experience that will help it lose its naïveté.

Ultimately we are probably faced with a security trilemma when it comes to AI agents: fast, smart, and secure are the desired attributes, but you can only get two. At the drive-through, you want to prioritize fast and secure. An AI agent should be trained narrowly on food-ordering language and escalate anything else to a manager. Otherwise, every action becomes a coin flip. Even if it comes up heads most of the time, once in a while it’s going to be tails—and along with a burger and fries, the customer will get the contents of the cash drawer.

This essay was written with Barath Raghavan, and originally appeared in IEEE Spectrum.

Posted on January 22, 2026 at 7:35 AMView Comments

LinkedIn Job Scams

Interesting article on the variety of LinkedIn job scams around the world:

In India, tech jobs are used as bait because the industry employs millions of people and offers high-paying roles. In Kenya, the recruitment industry is largely unorganized, so scamsters leverage fake personal referrals. In Mexico, bad actors capitalize on the informal nature of the job economy by advertising fake formal roles that carry a promise of security. In Nigeria, scamsters often manage to get LinkedIn users to share their login credentials with the lure of paid work, preying on their desperation amid an especially acute unemployment crisis.

These are scams involving fraudulent employers convincing prospective employees to send them money for various fees. There is an entirely different set of scams involving fraudulent employees getting hired for remote jobs.

Posted on December 31, 2025 at 7:03 AMView Comments

Faking Receipts with AI

Over the past few decades, it’s become easier and easier to create fake receipts. Decades ago, it required special paper and printers—I remember a company in the UK advertising its services to people trying to cover up their affairs. Then, receipts became computerized, and faking them required some artistic skills to make the page look realistic.

Now, AI can do it all:

Several receipts shown to the FT by expense management platforms demonstrated the realistic nature of the images, which included wrinkles in paper, detailed itemization that matched real-life menus, and signatures.

[…]

The rise in these more realistic copies has led companies to turn to AI to help detect fake receipts, as most are too convincing to be found by human reviewers.

The software works by scanning receipts to check the metadata of the image to discover whether an AI platform created it. However, this can be easily removed by users taking a photo or a screenshot of the picture.

To combat this, it also considers other contextual information by examining details such as repetition in server names and times and broader information about the employee’s trip.

Yet another AI-powered security arms race.

Posted on November 7, 2025 at 7:01 AMView Comments

Social Engineering People’s Credit Card Details

Good Wall Street Journal article on criminal gangs that scam people out of their credit card information:

Your highway toll payment is now past due, one text warns. You have U.S. Postal Service fees to pay, another threatens. You owe the New York City Department of Finance for unpaid traffic violations.

The texts are ploys to get unsuspecting victims to fork over their credit-card details. The gangs behind the scams take advantage of this information to buy iPhones, gift cards, clothing and cosmetics.

Criminal organizations operating out of China, which investigators blame for the toll and postage messages, have used them to make more than $1 billion over the last three years, according to the Department of Homeland Security.

[…]

Making the fraud possible: an ingenious trick allowing criminals to install stolen card numbers in Google and Apple Wallets in Asia, then share the cards with the people in the U.S. making purchases half a world away.

Posted on October 28, 2025 at 7:01 AMView Comments

Ghostwriting Scam

The variations seem to be endless. Here’s a fake ghostwriting scam that seems to be making boatloads of money.

This is a big story about scams being run from Texas and Pakistan estimated to run into tens if not hundreds of millions of dollars, viciously defrauding Americans with false hopes of publishing bestseller books (a scam you’d not think many people would fall for but is surprisingly huge). In January, three people were charged with defrauding elderly authors across the United States of almost $44 million ­by “convincing the victims that publishers and filmmakers wanted to turn their books into blockbusters.”

Posted on June 18, 2025 at 10:37 AMView Comments

Fake Student Fraud in Community Colleges

Reporting on the rise of fake students enrolling in community college courses:

The bots’ goal is to bilk state and federal financial aid money by enrolling in classes, and remaining enrolled in them, long enough for aid disbursements to go out. They often accomplish this by submitting AI-generated work. And because community colleges accept all applicants, they’ve been almost exclusively impacted by the fraud.

The article talks about the rise of this type of fraud, the difficulty of detecting it, and how it upends quite a bit of the class structure and learning community.

Slashdot thread.

Posted on May 6, 2025 at 7:03 AMView Comments

Gift Card Fraud

It’s becoming an organized crime tactic:

Card draining is when criminals remove gift cards from a store display, open them in a separate location, and either record the card numbers and PINs or replace them with a new barcode. The crooks then repair the packaging, return to a store and place the cards back on a rack. When a customer unwittingly selects and loads money onto a tampered card, the criminal is able to access the card online and steal the balance.

[…]

In card draining, the runners assist with removing, tampering and restocking of gift cards, according to court documents and investigators.

A single runner driving from store to store can swipe or return thousands of tampered cards to racks in a short time. “What they do is they just fly into the city and they get a rental car and they just hit every big-box location that they can find along a corridor off an interstate,” said Parks.

Posted on December 31, 2024 at 7:02 AMView Comments

IronNet Has Shut Down

After retiring in 2014 from an uncharacteristically long tenure running the NSA (and US CyberCommand), Keith Alexander founded a cybersecurity company called IronNet. At the time, he claimed that it was based on IP he developed on his own time while still in the military. That always troubled me. Whatever ideas he had, they were developed on public time using public resources: he shouldn’t have been able to leave military service with them in his back pocket.

In any case, it was never clear what those ideas were. IronNet never seemed to have any special technology going for it. Near as I could tell, its success was entirely based on Alexander’s name.

Turns out there was nothing there. After some crazy VC investments and an IPO with a $3 billion “unicorn” valuation, the company has shut its doors. It went bankrupt a year ago—ceasing operations and firing everybody—and reemerged as a private company. It now seems to be gone for good, not having found anyone willing to buy it.

And—wow—the recriminations are just starting.

Last September the never-profitable company announced it was shutting down and firing its employees after running out of money, providing yet another example of a tech firm that faltered after failing to deliver on overhyped promises.

The firm’s crash has left behind a trail of bitter investors and former employees who remain angry at the company and believe it misled them about its financial health.

IronNet’s rise and fall also raises questions about the judgment of its well-credentialed leaders, a who’s who of the national security establishment. National security experts, former employees and analysts told The Associated Press that the firm collapsed, in part, because it engaged in questionable business practices, produced subpar products and services, and entered into associations that could have left the firm vulnerable to meddling by the Kremlin.

“I’m honestly ashamed that I was ever an executive at that company,” said Mark Berly, a former IronNet vice president. He said the company’s top leaders cultivated a culture of deceit “just like Theranos,” the once highly touted blood-testing firm that became a symbol of corporate fraud.

There has been one lawsuit. Presumably there will be more. I’m sure Alexander got plenty rich off his NSA career.

Posted on October 11, 2024 at 7:08 AMView Comments

Problems with Georgia’s Voter Registration Portal

It’s possible to cancel other people’s voter registrations:

On Friday, four days after Georgia Democrats began warning that bad actors could abuse the state’s new online portal for canceling voter registrations, the Secretary of State’s Office acknowledged to ProPublica that it had identified multiple such attempts…

…the portal suffered at least two security glitches that briefly exposed voters’ dates of birth, the last four digits of their Social Security numbers and their full driver’s license numbers—the exact information needed to cancel others’ voter registrations.

I get that this is a hard problem to solve. We want the portal to be easy for people to use—even non-tech-savvy people—and hard for fraudsters to abuse, and it turns out to be impossible to do both without an overarching digital identity infrastructure. But Georgia is making it easy to abuse.

EDITED TO ADD (8/14): There was another issue with the portal, making it easy to request cancellation of any Georgian’s registration. The elections director said that cancellations submitted this way wouldn’t have been processed because they didn’t have all the necessary information, which I guess is probably true, but it shows just how sloppy the coding is.

Posted on August 7, 2024 at 7:10 AMView Comments

1 2 3 35

Sidebar photo of Bruce Schneier by Joe MacInnis.