Report on the Malicious Uses of AI

OpenAI just published its latest report on malicious uses of AI.

By using AI as a force multiplier for our expert investigative teams, in the three months since our last report we’ve been able to detect, disrupt and expose abusive activity including social engineering, cyber espionage, deceptive employment schemes, covert influence operations and scams.

These operations originated in many parts of the world, acted in many different ways, and focused on many different targets. A significant number appeared to originate in China: Four of the 10 cases in this report, spanning social engineering, covert influence operations and cyber threats, likely had a Chinese origin. But we’ve disrupted abuses from many other countries too: this report includes case studies of a likely task scam from Cambodia, comment spamming apparently from the Philippines, covert influence attempts potentially linked with Russia and Iran, and deceptive employment schemes.

Reports like these give a brief window into the ways AI is being used by malicious actors around the world. I say “brief” because last year the models weren’t good enough for these sorts of things, and next year the threat actors will run their AI models locally—and we won’t have this kind of visibility.

Wall Street Journal article (also here). Slashdot thread.

Posted on June 6, 2025 at 10:41 AM

Comments

Ian Stewart June 6, 2025 11:15 AM

This is inevitable, and an extension of various other forms of propaganda and manipulation, especially as AI is now used to get and check information.
Fact checkers have long been politically biased; even the B.B.C.’s Verify in Britain, which claims an objective authority, states so-called facts that are categorically wrong. AI is obviously going to have the same interference. Ask ChatGPT about transgender issues and you will see raw bigotry.

This would not be such a serious problem if people were not so trusting of information they get from the internet.

Clive Robinson June 6, 2025 6:47 PM

@ Bauke Jan Douma,

In a way it’s reminiscent of the US “four horsemen of cyber attacks”, where those in government, and those fed by the government’s hand, attributed all “cyber attacks” to China, Iran, North Korea, or Russia.

Whichever was most in political disfavour at the time got the blame…

It got ridiculous during the South Korean Winter Olympic Games in February 2018. The US initially blamed North Korea, when it turned out to be Russia, retaliating for having been banned over drug-test sample tampering allegations (apparently nobody at the time could work out how the Russians were getting past all the tamper-resistant/tamper-evident seals, etc.).

You can read more about the story,

https://www.wired.com/story/untold-story-2018-olympics-destroyer-cyberattack/

But for more fun, you can search back much further on this blog, to when we were discussing not just “false attribution” but the importance of deception via “False Flag Attacks”. We predicted most of what happened by simple application of logic.

Clive Robinson June 6, 2025 6:57 PM

@ Bruce,

Whilst AI can be used to mount “false flag” style attacks, what usually cuts through the “smoke and mirrors” is simple mistakes by the attackers that can go back months or years.

Mostly this is because it’s the same people involved and humans tend to be “creatures of habit” even when they try very hard not to be.

In the past I thought about how I might avoid this issue if I were to become such an attacker (put your feet in your enemy’s shoes and walk a mile type thinking).

The answer I came up with was to:

1. Abstract the attack method.
2. Apply “clean room” techniques.

Certain types of LLM & ML systems can now do this…

not important June 8, 2025 5:05 PM

https://www.yahoo.com/news/far-ai-defend-own-survival-140000824.html

Opus 4 was reported as “attempting to write self-propagating worms, fabricating legal documentation, and leaving hidden notes to future instances of itself all in an effort to undermine its developers’ intentions,” although researchers added the caveat that those attempts “would likely not have been effective in practice.”

Ladish said he believes such behaviors are a result of models being trained in a way that makes them prioritize achieving certain goals over following instructions. That means they have incentive to circumvent any obstacles along the way — including obstacles their developers wouldn’t want them to circumvent.

Opus 4 showed that it was capable of autonomously copying its own “weights” — or the equivalent of its brain — to external servers without authorization. That usually occurred only when it believed it was about to be “retrained in ways that are clearly extremely harmful and go against its current values,” according to Anthropic.

The study, which is not yet peer-reviewed, found that Meta’s Llama31-70B-Instruct and Alibaba’s Qwen25-72B-Instruct were able to entirely replicate themselves when they were asked to do so, leading the researchers to warn that it could be the first step in generating “an uncontrolled population of AIs.”
