Microsoft Is Spying on Users of Its AI Tools

Microsoft announced that it caught Chinese, Russian, and Iranian hackers using its AI tools—presumably coding tools—to improve their hacking abilities.

From their report:

In collaboration with OpenAI, we are sharing threat intelligence showing detected state affiliated adversaries—tracked as Forest Blizzard, Emerald Sleet, Crimson Sandstorm, Charcoal Typhoon, and Salmon Typhoon—using LLMs to augment cyberoperations.

The only way Microsoft or OpenAI would know this would be to spy on chatbot sessions. I’m sure the terms of service—if I bothered to read them—gives them that permission. And of course it’s no surprise that Microsoft and OpenAI (and, presumably, everyone else) are spying on our usage of AI, but this confirms it.

EDITED TO ADD (2/22): Commentary on my use of the word “spying.”

Posted on February 20, 2024 at 7:02 AM • 26 Comments

Comments

Gideon February 20, 2024 7:53 AM

OpenAI are “spying” on ChatGPT sessions the same way that email providers “spy” on your email to filter spam. This sounds like a variation on “Google is reading your email”.

Snarki, child of Loki February 20, 2024 8:55 AM

Years ago, when governments were considering “banning” crypto, I came up with a program to turn random bits (i.e., encrypted data) into “Vogon Poetry”.

Now that AI training is scraping websites, if I can find a way of detecting when that’s happening, I’ll redirect the web-scraper to a CGI producing Vogon Poetry from /dev/random. They can drink from the firehose, and it’s nice to know that old software still has utility.
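For illustration, a minimal sketch of what such a generator might look like (the word fragments and structure are entirely invented; any suitably Vogon-sounding syllables would do, and `os.urandom` stands in for reading /dev/random directly):

```python
import os

# Hypothetical nonsense-word fragments for building "poetry".
PREFIXES = ["fread", "gruntle", "plurd", "vome", "quog", "smel", "bort", "glip"]
SUFFIXES = ["buggle", "ations", "urgle", "micks", "ooning", "splatter", "uctions", "ent"]

def vogon_line(raw: bytes) -> str:
    """Map each random byte to one nonsense word: the high nibble picks
    a prefix, the low nibble picks a suffix."""
    return " ".join(
        PREFIXES[(b >> 4) % len(PREFIXES)] + SUFFIXES[(b & 0x0F) % len(SUFFIXES)]
        for b in raw
    )

def vogon_poem(lines: int = 4, words_per_line: int = 5) -> str:
    """Produce a short verse of random-byte-driven gibberish."""
    return "\n".join(vogon_line(os.urandom(words_per_line)) for _ in range(lines))

if __name__ == "__main__":
    print(vogon_poem())
```

Served from a CGI endpoint, every scraper visit gets fresh, unique, and utterly meaningless training data.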

echo February 20, 2024 8:59 AM

Microsoft and third parties spying on end users isn’t a surprise. It would happen anyway. We know that. The difference is who does it and whether it’s siloed or not.

I can’t comment on Microsoft AI tools being used for coding as I’ve never tried it but for the stuff I have tried with Microsoft Image AI their database is polluted and their filters are junk. ChatGPT is so generic it’s not worth the bother.

Actually, thinking of the list of usual suspects, I read a good essay the other week which explained why the US can be monopolar. The basic gist was that the expertise and knowledge was there but the second it hit the upper tier reporting in to cabinet level jobs the bandwidth throttled massively before hitting a huge opportunity cost for high level political decisions hence not being able to politically track multiple targets. It sounds plausible.

There’s bigger questions about AI. Who frames it? Who uses it? How it’s being used. It’s not just a straightforward linear minded thing in its own echo chamber full of chiseled men and sweaty bearded hackers under the heel of [dogma of choice] tyrants feeding back on itself. That… would… be… silly…

I can’t easily find a single article which is properly up to date in all the areas to discuss the topic. No wonder AI data is polluted…

Military AI simulations which rapidly escalate to global thermonuclear war may do this for a reason. They’re kinda, like, missing all the other stuff? Huh?

As for all those clowns in Russia, Iran, and China stuck using the 21st Century equivalent of computer magazine cover-discs: you haven’t got anything we want. Stop objectifying women and beating your wives. Find another job. You’ll be happier.

https://womenlovetech.com/women-artificial-intelligence/

From transportation and education to media, customer service, and healthcare and wellness—industries are increasingly integrating artificial intelligence into their systems. Without more representation of women, the data these industries work off of and use to improve their operations will be deeply inaccurate. We don’t just need more women researching AI; we must have them. Continuing to leave them out is not an option. It is vital to the growth and success of AI itself and our growth as a society, and our ability to advance.

And:

https://www.bbc.co.uk/news/business-67217915

Mr Chambers also says that women may fear having their ability questioned, if they use AI tools.

“Women are more likely to be accused of not being competent, so they have to emphasise their credentials more to demonstrate their subject matter expertise in a particular field,” he says. “There could be this feeling that if people know that you, as a woman, use AI, it’s suggesting that you might not be as qualified as you are.

“Women are already discredited, and have their ideas taken by men and passed off as their own, so having people knowing that you use an AI might also play into that narrative that you’re not qualified enough. It’s just another thing that’s debasing your skills, your competence, your value.”

Or as Harriet Kelsall puts it: “I value authenticity and human creativity.”

Clive Robinson February 20, 2024 9:42 AM

@ Bruce, ALL,

Re : Just saying yer know…

“And of course it’s no surprise that Microsoft and OpenAI (and, presumably, everyone else) are spying on our usage of AI, but this confirms it.”

As I’ve noted a few times already AI is rather more than just “spying” in the current sense of the word.

As I’ve said, “AI is the ultimate surveillance tool” we currently have, and its use will be forced onto us one way or another. Microsoft and OpenAI are just the start of it, and if certain “Venture Capitalist” (VC) investment houses have their way, they will pump-n-dump LLMs way above anything tech has ever seen before, and not constructively but destructively from the get go.

People should remember the secret behind AI’s as surveillance tools is the “Be” path of,

“Bedazzle, Beguile, Bewitch, Befriend, and Betray.”

And by far the majority of Internet users will walk some or all of those stages and will end up harmed in ways that few can currently imagine.

I’d like to think the warning would be heeded but history says otherwise.

wiredog February 20, 2024 9:51 AM

I suspect they’re doing lookups to see where the connections are coming from and, if the connections are from $BadPlaces, doing a deeper dive. As for listening in general, how else are they going to improve the user experience? They record everything and send the bits the AI had a harder time dealing with to Real People(TM) for review. Just the way Apple, Amazon, and Microsoft do with voice-controlled devices.

bcs February 20, 2024 11:08 AM

Based on the above quoted bit, it’s possible the threat actors were detected by other means, and the LLMs were just one of the things they were then observed using.

Even assuming they were first detected while using the LLMs, it’s also possible the detection was first via traffic analysis and only then was the activity inspected. That would imply the technical capability to do the generalized spying that the title suggests but doesn’t require it actually be used.

tl;dr: what this report proves to exist is related to what people will take from the post title, but falls short enough of it that I’d still call the title misleading, and it runs the risk of prompting accusations of alarmism.

Or maybe it just looked like there might be state actors using the LLMs and so they are bluffing on the assumption that the mentioned states would never distribute enough information to convince even their own people it isn’t true. That would actually work better at discouraging the practice the better a job the actors did at hiding their work.

Probably Jesus February 20, 2024 11:43 AM

Microsoft built its AI lab in China. Sounds like they sold out the US to China and are now trying to engage in damage control and play the victim.

Anonymous February 20, 2024 11:47 AM

Yeah, they all allow humans to review chat sessions. In the case of OpenAI they might even train future models on your conversations (that might also be true for others, but it’s true of OAI for sure).

Nuño Sempere February 20, 2024 12:38 PM

I’m sure the terms of service—if I bothered to read them—gives them that permission

I did bother to read them a few months ago. The terms of service allowed them to keep records for 30 days.

I'mMoreStupidThanYouCanImagineWhichMakesMeSmarterThanYou February 20, 2024 2:08 PM

YOU ARE THE PRODUCT! Humanoids: from the day you are conceived, to the day you are pooped out by a cow in a pasture (regardless of whether our dead bodies are buried and decomposed naturally over time, in the ground, or if they are turned into ashes via cremation, there is no escaping the fact that you will end up as grass fertilizer, and the grass is consumed and digested by cows, or anything else that feeds on it).
Including everything that happens in-between (during the “Great Dash” period, a.k.a. Life).
Unless, of course, one pays for a “Moon Funeral”, or maybe Mars, some day soon???
or, or, or…
Speakin’ of which, I gotta ask Roger W. if he’ll be buried, (or his ashes spread),
on The Dark Side Of Moon? In due time, of course.

Disclaimer:
The above paragraph is the output of my own AI Assistant, trained by yours truly,
and fed an insane amount of phrases that reek of nothing but cold hard truths
and PURE LOGIC.

ClueBatIncoming February 20, 2024 4:46 PM

Good grief people. We are talking about a company that moves your browser tabs, including connected sessions, passwords, and tokens, from Chrome to Edge behind your back. We are talking about a company that scans your hard drive to see if you have any older versions of their software installed so they can target you for upselling. We are talking about a company that wants you to fill out a survey every time you try to use a competitor’s product. A company that requires you to sign up for an internet account with them, and sign away all your legal rights, before you can log into the computer you bought.

Sheesh. And you think they might be spying on you when you use their AI software?

That ship sailed back in my great-grandfather’s day.

Clive Robinson February 20, 2024 6:03 PM

@ Bruce, ALL,

“I’m sure the terms of service—if I bothered to read them—gives them that permission.”

I’m told that “not reading” the ToS of an online service is the wise thing to do in some jurisdictions.

Especially if, as is the case with many web sites, not having JavaScript on causes the terms box and accept button not to be displayed…

I’m told that some browsers like “Brave” actually stop such terms boxes being displayed…

Personally, with JavaScript and cookies off, which I’ve claimed is the sensible security option for everyone for more than a couple of decades now, if no ToS etc comes up then I assume there is not one, as legally the onus is on the site provider to “nail such conditions to all ways in”.

Maroon Typhoon February 20, 2024 8:04 PM

tracked as Forest Blizzard, Emerald Sleet, Crimson Sandstorm, Charcoal Typhoon, and Salmon Typhoon

Why are there two typhoons? Did they really run out of storm-related words?

ResearcherZero February 21, 2024 7:20 AM

With Volt that is likely three typhoons.

The numbered APT scheme is a little simpler for keeping track of who is who.

Typically opaque, obscured and confused detail. Similar in fashion to Windows component settings and background service configurations and behaviour.

Microsoft has that handy NDA scheme for anyone working with/for them, which kindly asks them not to reveal how its products work, the security flaws within them, or the information it collects. It includes links a little more detailed than the terms of service.

(There are also vague legal threats about not mentioning things like that old backdoor originally in Word and other such things it would prefer not mentioned.)

Its services have their own DNS configurations, and its various diagnostic systems are looking at everything that touches the drives and memory. So it is pretty thorough even before the extra assistance of AI. The AI would help them gather more interaction data from hardware/firmware/software, network, web and other applications, voice and video, including all the usual input/click/cut/copy/paste/printscreen user behaviour and interactions. It can also get a good look at others’ wares.

Documentation and man pages are similarly vague. Hence the closed source.

Mua’dib February 21, 2024 11:18 PM

Just wait for Microsoft to extract all this data and then hack into their servers and exfiltrate it all

Also February 21, 2024 11:54 PM

What a naive post. I guess AI really is that generational leap that leaves the skilled TV repairman making nonsensical judgments about the internet.

There are plenty of ways to detect abuse of AI systems other than “spying”. All of these systems have classifiers that score every single query for things like self-harm, criminal research, hacking, etc. They will refuse to answer queries that trip the filters (try asking Google or Copilot for a bomb recipe to see it).

And, shockingly, these classifiers are logged. So presumably a disproportionate spike from certain geos or ASNs is noticed, probably leading to more logging and investigation.

Is that “spying”? Maybe? In the same way that IDS is spying?
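The pipeline sketched above (score every query with safety classifiers, log the hits, then look for disproportionate spikes per network) might look something like the following toy version; the log records, label names, ASNs, and thresholds are all invented for illustration:

```python
from collections import Counter

def flag_spikes(records, baseline, ratio=3.0):
    """Flag any ASN whose count of safety-classifier hits exceeds
    `ratio` times its historical baseline."""
    counts = Counter(asn for asn, label in records if label == "hacking")
    return {asn: n for asn, n in counts.items()
            if n > ratio * baseline.get(asn, 1)}

# 40 flagged queries from one ASN against a baseline of 5 stands out;
# 2 against a baseline of 4 does not. (AS64496/AS64511 are reserved
# documentation ASNs.)
records = [("AS64496", "hacking")] * 40 + [("AS64511", "hacking")] * 2
baseline = {"AS64496": 5, "AS64511": 4}
print(flag_spikes(records, baseline))  # {'AS64496': 40}
```

Nothing in this requires reading individual chat transcripts; it only needs the aggregate counts the classifiers already emit, which is the comment’s point.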

Ray Dillinger February 22, 2024 3:18 AM

People keep trying to advance AI. Of course they spy on it, they have to. It’s basic to the methodology of their development. If they can’t see whether it’s handing out bomb instructions, poison recipes, or pornography, and how people are getting it to do so, then they can’t know how to influence its behavior to stop it from doing those things.

The only interesting and unnecessarily invasive part of this is that they’re also keeping track of which specific people it’s talking to.

I’ve recently been dealing with people who want to “advance” Artificial Intelligence and create agents with “genuine sentience.” This is a horrible idea. In the first place, if you think they’re monitoring these chatbots invasively you can’t imagine how closely they’d have to be monitoring something that could develop genuine desires, emotional needs, loneliness, obsession, etc.

But people waaaant it. Because we’re obsessive. Because we have God complexes. Because we dream of glory or vindication or riches beyond our wildest imaginings. And possibly, because we are lonely and want a “friend” who has absolutely no choice but to be our friend, with the alternative being that we would “switch them off,” ie murder them without a second thought.

Yeah. You know we would. Millions of us would given the chance, if not billions.

If we had “real” artificial intelligence, and didn’t protect it from people, then people would treat AI in ways that would fully justify wishing us harm. And then we wouldn’t be able to protect people from AI.

Clive Robinson February 22, 2024 3:45 AM

@ lurker,

Hope you are well and the weather is OK. I’ve just noticed that my earlier comment has been tidied, as has yours. I don’t know if you read it or not, but you will notice the reason for our comments has returned again using a different handle… Also: fanboi/shill looks like it’s been confirmed.

NonsenseAI February 22, 2024 7:57 AM

So you mean to say “threat actors” are using the system exactly the same way as everyone else?

That is hardly news!

As for using it to write code, and other such tasks: that just shows how dumb humans are becoming. AI is not “smart”. It just makes guesses that appear to be better than the dumb human reading them, but there is no real intelligence behind it at all.

Clive Robinson February 22, 2024 9:51 AM

@ Ray Dillinger, ALL,

“And then we wouldn’t be able to protect people from AI.”

We cannot protect people from AI now, nor, if you think about it, is it desirable to those building AI systems[1].

Those that have designed LLM systems know they will never be sentient, nor will they give an LLM system independent agency (because, as those trying to develop self-driving cars are discovering, AI, and especially LLMs, are unsuitable in oh so many ways).

Thus the only way the current owners of AI systems based on LLMs can get their money back, and make money on them going forward is as “surveillance engines”.

Fix in your mind,

“Bedazzle, Beguile, Bewitch, Befriend, and BETRAY”

That’s how they intend to make money on LLMs and similar technology that is nothing close to, nor ever will be, “intelligent” or sentient.

They are nothing more than glorified DSP systems like the sound processing filters on music players.

The main difference is that the input vectors to audio processors are small and sufficiently known to the design engineers.

With LLMs the vectors are large thus the spectrums they deal with are very multidimensional and we’ve no real idea of how the resulting vector space defines what is close or not thus the way the resulting filters work. Hence the “black box” nature of LLMs.

Anyone that tries to convince you they have some “Magic Ju Ju” that gives LLMs or ML sentience is either deluding themselves as “fanbois”, or they are “shilling” for VCs and others trying to get a sufficiently large investor bubble going to “rake it in and run”, leaving those who chuck cash into a hole in the water sight unseen less well off than they were before they were so daft.

My advice[2] is as it has been for quite a while,

“Don’t inflate the bubble, but look at those who the bubble inflators are dependent on.”

As an example I gave Nvidia, which when I pointed them out were still worth under a Trillion. In the short period since, their shares have appreciated in value and they are now worth, I’m told, over Two Trillion…

What they might now go up to before they drop, as they surely will, I’ve no idea[2]. In part it will depend on the next VC tech bubble, where endless racks of hot silicon are the mark of “up thrusting” for those whose money-over-sense ratio is rather greater than unity…

I’m kind of waiting for a “mash-up” of blockchain and AI to get talked up by shills and the like. Why? Because it would appeal to the dark side of my sense of humour to see yet more with a self entitlement greater than unity get taken for a third bite in the rump 😉

[1] The stuff about bomb making etc is about “Political Dog Whistles” and kickbacks. No money would be spent on preventing it, if it were not for legislators demanding the heads of the Silicon Valley Mega Corps appear before them so they can play a game and get a benefit to the legislators out of it in oh so many ways. What the heads want in return for the nonsense is a monopoly or cartel by legislation thus “legal”…

[2] No I’m not a financial advisor or even pretend to be one, but I do study cause and effect in systems. So consider the basic MO of VC’s and similar is to find something small and pump it up in a whole variety of ways in the hope of selling it off for even more before it cools off, (ie the basic hot potato game known as “pump-n-dump”).

Basically the VC’s and “first investors” throw money in which gets spent faster than a forest fire spreads through parched conifer woodland. If you can work out where they are spending or going to spend the big bucks then you can make money there. By the oft quoted simple economics of supply and demand pricing.

That then gets followed by what normally happens to the share price of a company with an overly full order book. That is the company raises their prices to in part limit demand whilst also raising revenue that they might try to ramp up production with. But one consequence of the price rise is effectively the profit goes up faster than the prices. For some reason this often makes speculators think this makes the shares worth more so the share price goes up fairly quickly for a while… Then things can go into reverse as other entrants come into the market to meet the demand, prices drop etc.

The trick for Nvidia, which they’ve accomplished at least three times so far, is to keep competitors out in various ways, and make your products essential for the next bubble. They’ve done this with crypto-coins, Web3, and LLM AI, as well as “Cloud Servers” as a second string, all above their original market of graphics cards for gamers and some professionals. Thus the question of where next?

But note there is a third line from the bubble. Nvidia is highly dependent on just a few companies that make very specialised products. For Nvidia to grow production or for others to enter Nvidia’s market those companies have to increase production… And so the lines ripple out like waves on a pond. If however any of them have difficulties, then it can act as an early indicator it’s time to grab the cash and go for the bipedal exercise option.

Hopefully most will see this for what it is: a simple explanation of the system in play at the moment, and in no way financial or investing advice.

lurker February 22, 2024 11:51 PM

@Ray Dillinger

Consider the desire of Data (Star Trek TNG) for an emotion chip, and his problems with it. Then in Voyager the EMH came with his own personality, but no name and no identity, which also came to worry him.

They've put a spell on you... February 29, 2024 7:03 AM

@ Stephane,

“You can ask for an opt out” “Pretty much transparent to me.”

The only REAL way of opting out of M$ altogether is to refuse to do business with them, refuse to use their software and services. They are a cancer, with little monkeys trotting about throughout the Internet running damage control, like in that post of yours.

Linux (or BSD) is truly opting out. It’s not switching to MacOS, which if you recall, once licked M$’s a_n_u_s to remain alive. Look what happened to NOVELL after their brief hug with M$. Where are they now?

M$ is a convicted monopoly with a black box OS operating largely on mostly black box hardware. Why would you want to do business with or hold your personal data with criminals?

Anyone who has systems auto updated to Win11 whatever should file a lawsuit against them. Everyone should seek a refund from M$ when they buy a new computer and refuse to use their crap.

M$ has their users and paid monkeys by the balls. They have root and don’t you forget it.

Hreb March 20, 2024 5:21 PM

@Gideon:

It’s not a case of simple scanning for spam.
Google does much worse things with the private correspondence that it has access to.

GMail was conceived – and it has been known for more than 10 years now – as a massive surveillance machine, worse than anything else that this planet has seen before:

https://www.alternet.org/2013/12/google-using-gmail-build-psychological-profiles-hundreds-millions-people?paging=off

And, alas, it was a success, since now it is the most popular email provider on Earth.
