LLMs and Phishing

Here’s an experiment being run by undergraduate computer science students everywhere: Ask ChatGPT to generate phishing emails, and test whether these are better at persuading victims to respond or click on the link than the usual spam. It’s an interesting experiment, and the results are likely to vary wildly based on the details of the experiment.

But while it’s an easy experiment to run, it misses the real risk of large language models (LLMs) writing scam emails. Today’s human-run scams aren’t limited by the number of people who respond to the initial email contact. They’re limited by the labor-intensive process of persuading those people to send the scammer money. LLMs are about to change that. A decade ago, one type of spam email had become a punchline on every late-night show: “I am the son of the late king of Nigeria in need of your assistance….” Nearly everyone had gotten one or a thousand of those emails, to the point that it seemed everyone must have known they were scams.

So why were scammers still sending such obviously dubious emails? In 2012, researcher Cormac Herley offered an answer: It weeded out all but the most gullible. A smart scammer doesn’t want to waste their time with people who reply and then realize it’s a scam when asked to wire money. By using an obvious scam email, the scammer can focus on the most potentially profitable people. It takes time and effort to engage in the back-and-forth communications that nudge marks, step by step, from interlocutor to trusted acquaintance to pauper.

Long-running financial scams are now known as pig butchering, growing the potential mark until their ultimate and sudden demise. Such scams, which require gaining trust and infiltrating a target’s personal finances, take weeks or even months of personal time and repeated interactions. It’s a high-stakes, low-probability game that the scammer is playing.

Here is where LLMs will make a difference. Much has been written about the unreliability of OpenAI’s GPT models and those like them: They “hallucinate” frequently, making up things about the world and confidently spouting nonsense. For entertainment, this is fine, but for most practical uses it’s a problem. It is, however, not a bug but a feature when it comes to scams: LLMs’ ability to confidently roll with the punches, no matter what a user throws at them, will prove useful to scammers as they navigate hostile, bemused, and gullible scam targets by the billions. AI chatbot scams can ensnare more people, because the pool of victims who will fall for a more subtle and flexible scammer—one that has been trained on everything ever written online—is much larger than the pool of those who believe the king of Nigeria wants to give them a billion dollars.

Personal computers are powerful enough today that they can run compact LLMs. After Facebook’s new model, LLaMA, was leaked online, developers tuned it to run fast and cheaply on powerful laptops. Numerous other open-source LLMs are under development, with a community of thousands of engineers and scientists.

A single scammer, from their laptop anywhere in the world, can now run hundreds or thousands of scams in parallel, night and day, with marks all over the world, in every language under the sun. The AI chatbots will never sleep and will always be adapting along their path to their objectives. And new mechanisms, from ChatGPT plugins to LangChain, will enable composition of AI with thousands of API-based cloud services and open source tools, allowing LLMs to interact with the internet as humans do. The impersonations in such scams are no longer just princes offering their country’s riches. They are forlorn strangers looking for romance, hot new cryptocurrencies that are soon to skyrocket in value, and seemingly-sound new financial websites offering amazing returns on deposits. And people are already falling in love with LLMs.

This is a change in both scope and scale. LLMs will change the scam pipeline, making them more profitable than ever. We don’t know how to live in a world with a billion, or 10 billion, scammers that never sleep.

There will also be a change in the sophistication of these attacks. This is due not only to AI advances, but to the business model of the internet—surveillance capitalism—which produces troves of data about all of us, available for purchase from data brokers. Targeted attacks against individuals, whether for phishing or data collection or scams, were once only within the reach of nation-states. Combine the digital dossiers that data brokers have on all of us with LLMs, and you have a tool tailor-made for personalized scams.

Companies like OpenAI attempt to prevent their models from doing bad things. But with the release of each new LLM, social media sites buzz with new AI jailbreaks that evade the new restrictions put in place by the AI’s designers. ChatGPT, and then Bing Chat, and then GPT-4 were all jailbroken within minutes of their release, and in dozens of different ways. Most protections against bad uses and harmful output are only skin-deep, easily evaded by determined users. Once a jailbreak is discovered, it usually can be generalized, and the community of users pulls the LLM open through the chinks in its armor. And the technology is advancing too fast for anyone to fully understand how they work, even the designers.

This is all an old story, though: It reminds us that many of the bad uses of AI are a reflection of humanity more than they are a reflection of AI technology itself. Scams are nothing new—simply intent and then action of one person tricking another for personal gain. And the use of others as minions to accomplish scams is sadly nothing new or uncommon: For example, organized crime in Asia currently kidnaps or indentures thousands in scam sweatshops. Is it better that organized crime will no longer see the need to exploit and physically abuse people to run their scam operations, or worse that they and many others will be able to scale up scams to an unprecedented level?

Defense can and will catch up, but before it does, our signal-to-noise ratio is going to drop dramatically.

This essay was written with Barath Raghavan, and previously appeared on Wired.com.

Posted on April 10, 2023 at 7:23 AM • 27 Comments


Beatrix Willius April 10, 2023 9:06 AM

The permissiveness of email makes email work. Perhaps it’s time for people to become more intelligent. Just kidding…

Morley April 10, 2023 10:05 AM

Nah, the permissiveness gives it insecure, unauthenticated freedom. It isn’t required for email to work. It’s a legacy, duct-taped design from the ’90s.

Authorship April 10, 2023 11:29 AM

Dear Bruce, as already mentioned in a previous blog entry of yours, I would really encourage you to provide the author details at the beginning of articles, especially when you’re not the author. It’s quite misleading to think that you’re the author all through reading an article, only to discover at the end that you’re not. Thanks for your consideration.

For Authorship April 10, 2023 11:50 AM

You seem to believe that Bruce is not the author. The end of the article states “written with Barath Raghavan” which means it was written by Bruce AND Barath.

Adrian April 10, 2023 12:02 PM

On the other side of this arms race, LLMs could be used to waste the scammers’ time by sinkholing their emails to ChatGPT.
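That sinkhole idea can be sketched in a few lines. In the sketch below, `llm_reply` is a hypothetical stand-in for a real model call (e.g., a chat-completion API); the canned stalling replies are placeholders so the example is self-contained, not real model output.

```python
import random

# Hypothetical stand-in for a real LLM call (e.g., a chat-completion API);
# canned stalling replies keep the sketch self-contained.
def llm_reply(scam_message: str) -> str:
    stalls = [
        "This sounds wonderful! Could you explain the wire process again?",
        "I tried to send the fee, but my bank asked for more details.",
        "Which office should I visit? I want to be sure I do this right.",
    ]
    return random.choice(stalls)

def sinkhole(scam_message: str, turns: int = 3) -> list[str]:
    """Trade messages with a scammer to waste their time, never paying out."""
    transcript = []
    for _ in range(turns):
        reply = llm_reply(scam_message)
        transcript.append(reply)
        scam_message = reply  # in reality, the scammer's next email goes here
    return transcript
```

In practice the hard part is the plumbing: feeding inbound scam email into the model and sending its replies back out without ever exposing a real identity.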

Clive Robinson April 10, 2023 2:13 PM

@ ALL,

“Here’s an experiment being run by undergraduate computer science students everywhere”

Means, either it has come of age, or is obvious or both…

“It’s an interesting experiment, and the results are likely to vary wildly based on the details of the experiment.”

Actually, not that interesting.

As any fresh water fisherman can tell you,

“For every fish there is a lure to make it bite… All you have to do is find it.”

Which kind of tells you in advance that the results of such tests will be as variable as the subjects tested.

As for GPT hallucinations being a benefit, well, “mad monks” have always appealed to a certain few, but the question that should be asked is,

“Was it the hallucinations or the danger and freedom they offered?”

Personally I suspect it was not the hallucinations but the freedom from an inflicted societal norm.

But then there is,

“This is a change in both scope and scale. LLMs will change the scam pipeline, making them more profitable than ever.”

Actually I suspect not that much more profitable. People are imbuing LLMs with “magic properties” they do not possess. They don’t actually “invent” or “innovate”; they “randomly change within known probabilities”. And actually they are not that good at much outside of “Marketing Speak” or some other similar outsized pile of verbose garbage used as a “filter mask” to set the probabilities.

So we get to,

“We don’t know how to live in a world with a billion, or 10 billion, scammers that never sleep.”

Actually we do. Malware works on the principle of “An army of one”, which when you analyze it boils down to,

“Create a script around a known vulnerability and present it to a system to run in some way”

The system if it can be,

1, Reached
2, Has the vulnerability
3, Can run the script

Might become “owned” but break any one step in the chain and it won’t be.

The thing is you have to ask the question,

“What is the difference between the malware script, and the script used for a confidence trick/scam?”

The actual answer is “very little”.

Therefore we can make a reasonable prediction as to how effective a confidence scam is going to be.

But think of LLMs acting as “agents of crime” or “aiding and abetting”: the designers of these systems know they cannot stop them being used for harm, as noted by,

“ChatGPT, and then Bing Chat, and then GPT-4 were all jailbroken within minutes of their release, and in dozens of different ways.”

Thus the question of “culpable negligence” applies, and I don’t think it is something the designers of LLM’s can avoid.

We have legislation and regulation around semi- and fully-automatic and large-caliber weapons and their availability that put responsibility and punishment on those who supply them to those that should not have them.

Maybe we should be asking not if things should be put on hold for half a year as a recent letter did, but if the likes of LLMs should ever be available outside of legally approved users and operators?

modem phonemes April 10, 2023 2:41 PM

“Sincerity – if you can fake that, you’ve got it made.”

  • George Burns

Givon Zirkind April 10, 2023 5:10 PM

The analysis is correct. I used to receive scam phone calls about owing the IRS money. One time, I decided to stay on the line and played stupid. I was put on hold, then transferred to an agent; on hold, transferred again. Everyone had an Indian-like accent. When it came to paying, after being on the phone for 15 minutes, I asked if this was a call center in India. At which point the speaker got very rude, said no, in Pakistan, and hung up. I never got that type of spam call again.

I figure the scammers maintain databases and profiles. Since I wasted so much of their time and money in international calls, I must have been put on their bad guy list.

Tony April 10, 2023 6:18 PM

The problem with spam e-mail has always been that it costs the spammer virtually nothing to send a million e-mails. If they only make a hundred dollars from one or two of those they have made a profit.

If Bruce is right that AI means that spammers will up their game by 1000x, then e-mail will become useless.

But if it cost the spammer $0.01 to send each e-mail, then they need a much higher hit rate to turn a profit.

Time for micro-payments to make a comeback?
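Tony’s break-even point is simple arithmetic. A minimal sketch, where the $0.01 stamp and the assumed $100 average take per successful scam are illustrative numbers only:

```python
def breakeven_hit_rate(cost_per_email: float, revenue_per_victim: float) -> float:
    """Fraction of recipients who must pay out before the campaign profits."""
    return cost_per_email / revenue_per_victim

# Today sending is effectively free, so any nonzero hit rate is profit:
free_email = breakeven_hit_rate(0.0, 100.0)    # 0.0

# With a $0.01 micro-payment per email and $100 per successful scam,
# the spammer needs at least 1 victim per 10,000 emails to break even:
stamped = breakeven_hit_rate(0.01, 100.0)      # 0.0001
```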

EvilKiru April 10, 2023 8:00 PM

@Tony: As if the spammers won’t come up with ways to have that penny charged to someone else, just like how they currently spam on someone else’s dime.

Clive Robinson April 11, 2023 3:30 AM

@ Tony, EvilKiru, ALL

Re : micro-payment comeback

It’s not going to happen.

Back in the 1980’s the idea of digital payments and electronic wallets got started, along with pocket gambling devices and similar. But as I’ve mentioned before, I found they could not be made sufficiently secure.

Around the same time in the US, phone phreaking was a serious activity, and people discovered how to use the PABXs of small companies and charities to do “dial-in and dial-out” fraud[1].

But also, the cellphone system was not secure; it was not difficult to steal a mobile phone ID and make outbound calls to far foreign places on somebody else’s bill. In fact in many places criminals would set up shops where you could make calls at about half the rate of landline (POTS) charges.

With phone fraud the phone companies were extremely tardy at sorting out the fraud that they had created by their insecure behaviours, because they were effectively accomplices by choice: they got paid, or took the hurt person to court and damaged their reputation, or just denied them phone service, etc. In the UK even the phone regulator Ofcom –stuffed with revolving-door industry types– was giving out the wrong legal advice quite deliberately.

Eventually in the UK the law had to be changed so that the phone companies had to swallow the cost of the fraud. They could no longer “externalise the risk/cost”, and unsurprisingly they very quickly started making changes. Though if you ask the corporate types from back then, they still say that they had to wait on technology, which was not true back then, nor is it true these days, as we can see with the finance industry.

On a current note, have a look at “Secure Electronic Transaction” (SET), where the Payment Card Industry tried to come up with a secure system. It failed because nobody really wanted such a system. Their next move was “Chip-n-Spin”, which was full of security faults and still is, due to the built-in security failings. But listen to the Payment Card Industry and they will tell you all is wonderful in their garden (as they are not picking up the cost of fraud, just making profit on it). The same applies to these contactless “Near Field Communication” (NFC) systems.

The game is still the same: “the little guy pays”, unless legislation or regulation redresses the balance. Lobbying is very inexpensive compared to the cost savings the corporates can make, especially when they deflect the cost onto those who cannot fight back.

Any “libertarian” type who tells you they can make it otherwise with cryptocoins or digital/E-cash is either deluded or conning you. Because as a consequence there will always be criminals working a profit out of it somewhere along the line. The thought of “free money” at nearly zero risk to them will ensure they are always “at it”. The only known way to limit or stop it, is to make the corporates clean up their act by legislation and regulation that makes them pick up the cost of such fraud…

[1] All tone-dial PABXs had the ability to be configured such that you could dial one of their numbers (an inbound pair), and it would drop you onto an outbound pair where you got dial tone, from which you could dial to any other number. Whilst it made things like accounting easier, and thus was attractive, it also made fraud by others easy.

Gert-Jan April 11, 2023 6:59 AM


On the other side of this arms race, LLMs could be used to waste the scammers’ time by sinkholing their emails to ChatGPT.

While this is theoretically true, and might be worthwhile for honeypots, I expect that the people falling for these scams – which, as research shows, are “selected” on gullibility – typically are not technological front-runners / technologically savvy. So for this scamming use of AI, it is an uneven playing field.

Now, I do expect the communications providers (email, social media) to catch more and more scams that surface. They will definitely use any available technology to achieve that.

JMM April 12, 2023 1:17 AM

@Givon Zirkind: this is a method I used to use back in the day when I was annoyed by spam callers (nowadays I just don’t pick up if I don’t recognize the number). I replied, said “one second, please, I’m transferring you to the Corporate Service Department”, set the phone next to the speakers, and turned up the volume of whatever was playing at the time.

Worked like a charm. They need movement, not someone holding up the line.

ResearcherZero April 14, 2023 3:43 AM

@Bruce, Clive Robinson

“The attacks are essentially a form of hacking—albeit unconventionally—using carefully crafted and refined sentences, rather than code, to exploit system weaknesses.”

Security researchers warn that the rush to roll out generative AI systems opens up the possibility of data being stolen and cybercriminals causing havoc across the web.

Clive Robinson April 14, 2023 6:27 AM

@ ResearcherZero, Bruce,

Re : unconventional hacking

The first paragraph you quote from the article gave me an eerie feeling of deja vu, as it closely resembles some of my own words from a short while back.

Where I asked the question about the difference between an attacker sending “a malware script” of instructions for a computer to follow and another attacker sending a human a list of instructions to follow.

In either case the model is identical except for the entity under attack: one script being for “in silico”, the other being “in vivo”.

These attacks, whilst being “in silico”, are using an attack in a “language” much closer to that of the “in vivo” script attack on humans.

Thus it brings up the vexed subject of “Neuro-Linguistic Programming” (NLP). NLP is a notion first thought up in the mid-1970’s, but as there is no currently credible “scientific method evidence” for NLP advocates’ claims, it has so far been relegated to the pseudoscience category. Thus NLP has joined a number of “alternative medicine” therapies.

So could these attacks against LLMs be seen not just as an in silico attack but as a form of “NLP for AI” systems, and thus become a credible research domain in its own right?

Thus as a technology it could be used for “good or bad” under an uninvolved observer’s moral perspective of,

1, For teaching an AI is “good”.
2, For corrupting an AI is “bad”.

But the notion of “NLP for AI” also has a flip side: it can and will be used for what an uninvolved observer would most probably regard as “bad”, via the likes of “suasion” techniques.

Another much-derided idea from the 70’s, “subliminal messaging”, can be seen as a form of steganography, but aimed at the subconscious rather than the conscious mind. However we know that “the written word” sinks in way deeper than “non-shock” images do, and “instils” rather than “installs” a cognitive change.

So this “bad” can be flipped to “good” if the output is used to produce more effective teaching/learning tools, that could be tuned to an individual’s learning style.

Which begs the question of just how many other 1970’s era ideas about the human mind and perception might be given “new leases on life” either for in silico activities or to produce in vivo scripts?

Personally, as someone who regards LLMs as little more than a glorified “matched filter”, used as a logical progression of using a stochastic source to generate the equivalent of “human memorable” “nonsense passwords” that also arose in the 1970’s, I will be interested in how others present the notion of “NLP for AI” in future.

@ Bruce, ALL,

Perhaps an Op-Ed or “thought piece” on using LLMs for turning XKCD-style passphrase generation “word salad” output into human-memorable sentences might help swing the uninvolved observer’s view from “bad” towards “good”.
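The “word salad” half of that suggestion is easy to show; the LLM rewrite into a memorable sentence would be a model call and is omitted here. A sketch, using a toy word list (a real generator would use something like the 7,776-word EFF Diceware list):

```python
import math
import secrets

# Toy word list for illustration; real generators use large curated lists.
WORDS = ["correct", "horse", "battery", "staple", "cloud", "anchor",
         "ribbon", "motive", "glacier", "pepper", "lantern", "quartz"]

def passphrase(n_words: int = 4, wordlist: list[str] = WORDS) -> str:
    """XKCD-style passphrase: uniform, independent random word picks."""
    return " ".join(secrets.choice(wordlist) for _ in range(n_words))

def entropy_bits(n_words: int, wordlist_size: int) -> float:
    """Guessing entropy, assuming uniform and independent word choices."""
    return n_words * math.log2(wordlist_size)

# Four words from a Diceware-sized list give about 51.7 bits:
# entropy_bits(4, 7776) ≈ 51.7
```

Note that any LLM step that rewrites the words into a sentence must not be allowed to substitute or drop words, or the entropy calculation above no longer holds.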

Winter April 17, 2023 8:24 AM

Open Source and Open Data LLM

Open Assistant Conversations – Democratizing Large Language Model Alignment

Aligning large language models (LLMs) with human preferences has proven to drastically improve usability and has driven rapid adoption as demonstrated by ChatGPT. Alignment techniques such as supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) greatly reduce the required skill and domain knowledge to effectively harness the capabilities of LLMs, increasing their accessibility and utility across various domains. However, state-of-the-art alignment techniques like RLHF rely on high-quality human feedback data, which is expensive to create and often remains proprietary. In an effort to democratize research on large-scale alignment, we release OpenAssistant Conversations, a human-generated, human-annotated assistant-style conversation corpus consisting of 161,443 messages distributed across 66,497 conversation trees, in 35 different languages, annotated with 461,292 quality ratings. The corpus is a product of a worldwide crowd-sourcing effort involving over 13,500 volunteers. To demonstrate the OpenAssistant Conversations dataset’s effectiveness, we present OpenAssistant, the first fully open-source large-scale instruction-tuned model to be trained on human data. A preference study revealed that OpenAssistant replies are comparably preferred to GPT-3.5-turbo (ChatGPT) with a relative winrate of 48.3% vs. 51.7% respectively. We release our code and data under fully permissive licenses.

Clive Robinson April 17, 2023 9:39 AM

@ Winter, ALL,

Re : Biasing LLM probability filters.

From what you quote,

“Aligning large language models (LLMs) with human preferences has proven to drastically improve usability and has driven rapid adoption as demonstrated by ChatGPT.”

Begs the question,

“What could possibly go wrong?”


To which human history replies,

“We’ve burned witches and done way worse as ‘human preference’ and still do…”

LLMs have no innate knowledge by experience and cannot “sense the world” to gain it.

Every input LLMs get, is either by others selection and ordering, or some other form of internal bias or randomness. From which they just build filters to reflect the sequenced probabilities of the input.

The Roman poet Juvenal[1], just under a couple of millennia ago, asked,

“Quis custodiet ipsos custodes”

So who “guards the guards”, or as we prefer these days, “watches the watchers”? Whilst it is an age-old question, there is no answer that suffices.

Not only is it a “turtles all the way down” problem, there is not even the usual human fudge of “Rule of Thumb”. Thus it causes problems in everything in which there is an authority or power structure in place, thus every aspect of society.

Remember, what is “seen as good or bad” is always from a supposedly uninvolved, thus argued as impartial, observer’s point of view.

But both the directing mind of an act and the observer are part of a society so they all are biased in some way and that changes with time and events, sometimes both radically and rapidly.

So whilst some little old lady with dementia screaming as the flames consume her is something most these days would regard as abhorrent, it was “public entertainment” just a century or so back.

For those who say “that could never happen now”, similar still goes on in various ways, as power struggles for dominance happen daily, even more so today than then.

It might be instructive for those who think “human preference” is benign or good to look back on the “Salem witch trials” at the end of the 17th century, just a third of a millennium ago.

[1] Of Decimus Junius Juvenalis, or “Juvenal”, little is known, even though it has been alleged he was possessed of a “Patrician’s nose”. His writings show familiarity with Scotland and Egypt, though whether this was first-hand is not known. What he is remembered for is his surviving five books of satirical poems, which have been classically studied and, much later, had quotes taken from them.

Chris April 17, 2023 4:41 PM


Great fleas have little fleas upon their backs to bite ’em, And little fleas have lesser fleas, and so ad infinitum…

If LLMs are also employed as a countermeasure against LLM scams, it makes sense to set up a funded honeypot such that the first LLMs scam the honeypot, the honeypot LLMs scam the original scammers back and so ad infinitum…

lurker April 17, 2023 5:44 PM

“supervised fine-tuning … reinforcement learning …”

Riiight. All our human biases amplified by the power of the machine. Now, just supposing there’s more than one of these machines, and they’re fed their biases from differing ideologies, what is it these mechanical warlords will fight over? Because fight they will, it’s a human trait their keepers will have fed them.

Winter April 18, 2023 2:14 AM


All our human biases amplified by the power of the machine.

But would you prefer a system where these biases are hidden in proprietary code and data, over a system where both are FLOSS/CC?

Just as gunpowder and automated looms could not be suppressed, LLMs are here to stay.

Also, biased decisions are a human trait. LLMs are just putting them in the limelight.

Clive Robinson April 18, 2023 5:31 AM

@ Winter, lurker, ALL,

Re : Only Obvious when seen.

“[B]iased decisions are a human trait. LLMs are just putting them in the limelight”

It is a point that is obvious when seen, but first people have to see it, and those making money or using it to avoid responsibility are trying quite hard to stop it being seen.

After all, LLMs are quite obviously,

1, Creations of Man.
2, Do not have the necessary safety features.
3, Used in ways where serious legally defined harms have repeatedly happened to people.

Thus they are a technology that, in US prosecutor behaviour, certainly falls under the US legal definitions of a WMD.

As you ask,

“But would you prefer a system these biases are hidden in proprietary code and data”

Absolutely not; such would only encourage significant abuse by “those making money or using it to avoid responsibility”.

I’d actually like to see LLMs got rid of, as they have at a high level broadly the same failings as cryptocurrency, but actually a lot worse. That is, so far their use for “social bad” is much greater than any use for “social good” so far claimed, and to be honest I cannot see that changing[1].

However as you note,

“Just as gunpowder and automated looms could not be suppressed, LLMs are here to stay.”

Those at least did and still do have reasonable arguments for “social good”, unlike LLMs.

So at the very least LLMs and similar need to be a heavily legislatively controlled and regulated industry with significant but open Oversight (as legislation was once supposed to be, long prior to the likes of the PATRIOT Act, RIP Act, and similar).

[1] Not much is currently said about the fact that,

“All LLMs are actually very powerful ‘Surveillance Tools’, capable of identifying and following even people taking significant precautions to prevent it.”

Thus the plans by ethically questionable companies with morally dubious connections to the likes of US “Guard Labour” and “Intelligence Community” agencies to put LLMs and similar into ‘search engines’ and ‘Cloud-based productivity’ systems should be of very great concern to all.

Winter April 18, 2023 6:28 AM


Those at least did and still do have reasonable arguments for “social good”, unlike LLMs.

Gunpowder has social goods? Debatable. But LLMs do have “social goods”, just like computer translation, Automatic Speech Recognition, Text To Speech synthesis, Autonomous Driving Vehicles, etc..

Cars are killing machines that are responsible for ~43,000 yearly deaths in the USA, ~20,000 in the EU. Their use is heavily regulated to reduce the onslaught. But no one is suggesting we should abolish cars. The US would collapse instantly if cars were banned.

LLMs can be dangerous and should be heavily regulated. But the biases and discrimination are already present in daily practice [1]. With AI, we have the option to make the biases and discrimination visible and correct them.

[1] ‘https://www.aclu.org/news/privacy-technology/algorithms-in-health-care-may-worsen-medical-racism

In 2019, a bombshell study found that a clinical algorithm many hospitals were using to decide which patients need care was showing racial bias — Black patients had to be deemed much sicker than white patients to be recommended for the same care. This happened because the algorithm had been trained on past data on health care spending, which reflects a history in which Black patients had less to spend on their health care compared to white patients, due to longstanding wealth and income disparities. While this algorithm’s bias was eventually detected and corrected, the incident raises the question of how many more clinical and medical tools may be similarly discriminatory.
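The mechanism in that study – a proxy label importing historical disparity – can be shown with toy numbers. Everything here is a hypothetical illustration, not the study’s actual model:

```python
# Two groups with identical medical need, but group B historically had
# half the access to care, so spends half as much per unit of need.
need = {"A": 1.0, "B": 1.0}             # true sickness: equal
spend_per_need = {"A": 1.0, "B": 0.5}   # historical spending: unequal

def risk_score(group: str) -> float:
    """A model trained to predict *spending* learns this proxy score."""
    return need[group] * spend_per_need[group]

# Group B scores half as "sick" despite equal need, so a B patient must
# be twice as sick as an A patient to clear the same care threshold.
```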

See also the relevant study:

Clive Robinson April 18, 2023 8:21 AM

@ Bruce, ALL,

Re : Further thoughts to consider.

“But while it’s an easy experiment to run, it misses the real risk of large language models (LLMs) writing scam emails.”

As noted earlier, with LLMs and similar AI, currently “the social bad” far outweighs “the social good”. And with a little thought people will realise “scam emails” will be the least of our problems with “the social bad” of LLMs and similar AI tech.

But the argument will be made by proponents that the above view is,

1, Chicken Little scare stories,
2, It’s early days, and similar happened with all technologies, only to change around later.

Both at the same time, along with many similar. The problem is such proponents’ arguments are effectively untrue, even though they might have tiny pieces of truth to them, used as a smoke screen.

Unfortunately “the social bad” can and will outweigh any potential “social good”, and this is not a “gut feeling”, “hunch”, or “hinky feeling”; it’s based on reasonable assumptions about technology and certain types of people.

So I’ll give a simplified 20,000 ft explanation, so anyone who cares to think about it can consider it, no matter what their background.

As I’ve indicated in the past, there is a “spectrum” between “Fully deterministic” (Fdet) and “Truly random” (Tran). You can draw it out as a line between those two points.


Along which there are identifiable zones of transition that you can label in various ways, but as a rule of thumb there is some form of increasing complexity or sensitivity from Fdet. So you will find “non-linear”, “non-continuous”, “chaos”, “pseudorandom”, “faux noise”, etc. along that line depending on how you want to define them, but that labeling is not really relevant to this argument.

Because it can also be seen that running along that line are “security” and “privacy”, joining “complexity” and “sensitivity” but inversely, so that the highest level of security, not unexpectedly, is at the minimal-complexity Fdet end of the line.

Now let’s consider something that is orthogonal/tangential to the line and call it “surveillance”.

Currently we are transitioning from tangible physical world surveillance to intangible information world surveillance.

Three key characteristics of implementing surveillance activities on a chosen target are,

1, Time
2, Locality
3, Visibility

With traditional tangible-world surveillance, proximity to the target significantly affected “visibility”: not just by the target of the surveillance measures, but also the “resolution” and “scope” of information available, and also the resources needed. The resources in turn were affected by the technology available. Similar applies with time and visibility.

Which historically meant surveillance was very limited and held at the Fdet end of the line.

However the development of surveillance technology has increased both spatial resolution and scope to the point where even past time is now decreasingly of relevance, nor, as far as communications is concerned, locality.

That is, “information surveillance” via “collect it all” has created what is a “virtual time machine”, where going back in time is simply a case of “pull past records”, which are available on near everyone due to “collect it all” and “Third Party Business Records”. The requirement for business records was originally for “book keeping” and later taxation and legal needs. But this has broadened out in scope as it was realised the records are potentially marketable information. And we know they are currently being sold via data brokers and given for free to US Government agencies etc.

The problem for those surveilling via “collect it all” “industrial surveillance” is the “chaotic fire hose” issue. That is, not knowing what is worth collecting has resulted in vast repositories of data that are near impossible to use effectively or profitably, as it’s a form of “a straw in a hay stack” issue.

It’s regarded by quite a few as “beyond comprehension”, but actually it’s not. All you need is some way to find “useful patterns” or rules and apply those to the data to filter out signals. Effectively this is what LLMs can be very good at, and importantly they can also be very, very fast compared to the traditional human analyst.
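The “pattern rules over bulk records” approach can be sketched in a few lines (a toy illustration with made-up records and rules, not any real pipeline):

```python
import re

# Toy corpus standing in for bulk "collect it all" records (hypothetical).
records = [
    {"who": "alice", "text": "lunch at noon?"},
    {"who": "bob",   "text": "wire 5000 USD to the usual account"},
    {"who": "carol", "text": "meeting moved to room 7"},
]

# Human-authored "useful patterns" -- the bottleneck described above.
rules = [re.compile(r"\bwire\b.*\baccount\b"), re.compile(r"\bUSD\b")]

def filter_signals(records, rules):
    """Keep only records matching at least one pattern rule."""
    return [r for r in records if any(p.search(r["text"]) for p in rules)]

hits = filter_signals(records, rules)
print([r["who"] for r in hits])  # -> ['bob']
```

The scaling problem is exactly what the comment describes: someone has to write those rules, and hand-written rules miss everything they were not written to find.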

So LLMs are an ideal “surveillance tool” as a back end to “collect it all”, and we know such tools in a more primitive form already exist; it is the base business model behind the world’s largest private surveillance company, Palantir, as I’ve mentioned in the past.

The problem with those earlier back-end tools was that they were “human dependent” for the generation of “useful patterns”, and many failed due to the “throw the pasta at the ceiling” side issue. But less obvious was that there are limits on how humans can pull signals from noise. In essence we take a linear, continuous view and try to augment it by “averaging out noise”. The thing is, humans are really bad at recognising the difference between true noise and the sum of other signals that chaotically look like noise.

This has kept surveillance technology down at the Fdet end of the line even though it has increased in scope beyond most people’s imagination. It is effectively “stuck on the starting line”, where that start line is rapidly growing in length beyond human sight, but is not moving forward.

As I’ve mentioned before, LLMs are massive “adaptable filters” that pull information useful to the human mind out of what looks like totally random noise. But that noise is mostly not truly random at all; it is a tangle of a myriad of signals that are too chaotic for human eyes to pull apart.
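That point can be made concrete with a small, self-contained example (pure Python, made-up numbers): a sum of many phase-shifted sinusoids looks like random noise to the eye, yet a simple filter, here a single DFT bin acting as a matched filter, pulls a hidden component straight out:

```python
import cmath
import math
import random

random.seed(0)
N = 1024
# A tangle of many signals: 40 sinusoids at distinct integer frequencies
# with random phases.  Plotted, the sum looks like random noise.
freqs = random.sample(range(1, 200), 40)
phases = [random.uniform(0, 2 * math.pi) for _ in freqs]
signal = [sum(math.sin(2 * math.pi * f * t / N + p)
              for f, p in zip(freqs, phases))
          for t in range(N)]

def bin_power(x, f):
    """Normalised magnitude of the DFT coefficient at integer frequency f."""
    return abs(sum(xi * cmath.exp(-2j * math.pi * f * t / N)
                   for t, xi in enumerate(x))) / N

present = bin_power(signal, freqs[0])  # component genuinely in the mix
absent = bin_power(signal, 500)        # frequency not in the mix
print(round(present, 2), round(absent, 2))  # -> 0.5 0.0
```

Each hidden sinusoid shows up at full strength (0.5) while a frequency that was never there scores essentially zero; “averaging out noise” by eye gets you nothing like that separation.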

That is, LLMs and similar AI are getting increasingly better at untangling all those signals and teasing them apart, so even the least capable of humans can see all the threads of human society laid out, and thus choose which to pull forward, which to tie in knots, which to cut short, and which to remove totally.

Thus LLMs have in effect loaded and fired the starting gun, and surveillance is no longer bound down at the Fdet end of the line, but is advancing quite quickly along that Fdet-Tran line in a hugely broad swath, at a rapidly increasing pace.

This will be a disaster for Western democratic society, and will produce systems way worse than what is currently being imagined in the MSM and trade press for what China, Russia, and other authoritarian states are heading towards or building.

But consider: those living in long-term authoritarian states have different societies to the West, and as such have already adapted to the installation of surveillance over several generations. Western democratic free-speech society has not. A distinction that tends not to be given the consideration it is actually due in the ICT industry and by its products’ users.

Thus by habit those in authoritarian societies don’t leave potentially self-harming information around, in communications, or in their activities in obviously visible ways. Which is not something “free speech” has engendered in Western society.

That is, those in long-term authoritarian states have in effect masked their signal with the sum of other signals, and thus so far have been “hidden from view in the long grass”, whilst those in the West have grown in the sun into a fat crop of wheat…

LLMs are going to cut through that wheat and long grass alike. Like the ever-sharp scythe Death is depicted as carrying, they will cut for binding all before them. Only Death is an anthropomorphic abstraction of man.

LLMs will be like those multi-million-dollar “prairie combine harvester teams”. Western society, which has grown like wheat to the point it is a valuable crop to harvest, will be rapidly cut, threshed, and sold off; more surely removed than by a whirlwind-driven fire storm.

Neither open nor closed societies can survive as we currently know them against LLMs’ effects when a “directing mind” puts them to what most observers would consider “bad use”. Unfortunately human history tells us that this is most probably what will happen, even with legislation and regulation to supposedly prevent it.

Because both money and power, and what they bring in terms of status and control over others, are too desirable for some. Thus they have developed methods to avoid legislation and regulation.

Clive Robinson April 18, 2023 9:06 AM

@ Winter,

“Gunpowder has social goods? Debatable”

Not at all; as history clearly shows, we would not have been able to build modern society at the pace we have without it or its later equivalents.

“LLMs do have “social goods”, just like computer translation, Automatic Speech Recognition, Text To Speech synthesis, Autonomous Driving Vehicles, etc..”

Actually those are dubious claims at best, and as they advance, what were thought to be remote corner cases have become not just close and visible edges but significant cracks that people are falling into, or worse, going “under the bus” in the case of “autonomous driving” when placed in unconstrained environments like roads, where most humans can mostly function safely.

“Cars are killing machines that are responsible for ~43,000 yearly deaths in the USA, ~20,000 in the EU. Their use is heavily regulated to reduce the onslaught. But no one is suggesting we should abolish cars. The US would collapse instantly if cars were banned.”

Begs the question,

“Why does the US have twice the deaths yet half the population of Europe?”

That is, why is the rate in Europe a quarter or less of that in the US?

As I frequently point out, machines, like all technology, even atomic bombs, are not intrinsically good or bad. It’s the use by the “directing mind” that causes the act that observers consider good or bad.

That is, I could beat you to death with a shoe, or strangle you with its lace, and some have done just that to others, way more often than you might think. The point is, anything of inherent use will be at hand and can be used for good or bad, and it’s the human mind that decides which.

So what is so different about the US mind, and the society it forms, from the European?

Even the allegedly nearest nation politically to the US in Europe, the UK, with around a fifth of the US population, has something like a tenth of the number of vehicle deaths, and our speed limits are generally higher (and the “accident rate” is claimed to be proportional to speed…).

So what is it about the way those in the US think and behave?

Thus your “The US would collapse instantly…” argument is not about “technology” but “society” and its mores, morals, ethics, and perspectives.


“LLMs can be dangerous and should be heavily regulated.”

Is something I argue for, as I can see the issues that are visible. But as I’ve said, regulation is only effective if the legislation exists, and oversight and sufficient punishment for transgression cannot be avoided. Which in the US, with lobbyists and offshore companies, is not the case. In effect “the mad house” is not just being run by the lunatics; they also own it, in ways in which they cannot be stopped.

“But the biases and discrimination are already present in daily practice [1]. With AI, we have the option to make the biases and discrimination visible and correct them.”

The biases and discrimination are already visible; it’s just that we choose to ignore them. The LLMs clearly demonstrate this fact.

Thus the question is,

“Will the LLMs make the situation better as you hope, or worse as I expect from the historic evidence all around us?”

I’m not optimistic, and for good reason, from my perspective and investigation.

Winter April 18, 2023 9:42 AM


“Will the LLMs make the situation better as you hope, or worse as I expect from the historic evidence all around us?”

That could have been asked of anything from the printing press, railroads, newspapers, radio, and TV, to the internet, to blockchain, to LLMs.

The printing press brought forth the Thirty Years’ War. Railroads allowed the UK to conquer India and cause the Bengal famine.

The fact that you, personally, cannot see a beneficial use for LLMs does not mean others cannot.

Knives kill and are needed to cook.

Clive Robinson April 18, 2023 12:27 PM

@ Winter,

“The fact that you, personally, cannot see a beneficial use for LLMs does not mean others cannot.”

Not true, otherwise I would not be in favour of strong and importantly well enforced legislation.

The problem is, as I keep saying,

“The problem is NOT the technology, but WHAT USE a directing mind puts it to.”

Trying to stop “the social bad”, which can be very high with LLMs and similar AI, yet promote “the social good”, is difficult when the people who see money in it want “the social bad” and not “the social good”.

@Bruce our host has a similar concern with getting regulation to work,


His point of “guide rails” is kind of like designing a knife that is difficult to stab with[1], yet still having it useful in the kitchen (which you can actually do, though it’s a little heavy in the hand).

The idea being to somehow “build it into the tool”.

Unfortunately, as we know, “grinding it out” is far from impossible, and something sufficiently technical nation states would be able to do, as would most larger corporations.

Likewise “build their own” without the limitations is also possible.

The secondary problem is again one of the human mind,

“Too many people think in tangible physical security methods when they should really be thinking in terms of intangible information security methods.”

Currently we are, outside of encryption, very light on intangible information security methods. Worse, of the three information basics,

1, Storage of information
2, Communication of information
3, Processing of information

Encryption is only for “putting it in a box”, or more formally “information at rest”, be it for storage or communication; it does not yet work at all well for processing information…
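That gap can be shown in a toy sketch (the cipher here is a deliberately insecure XOR stand-in, for illustration only): data can stay encrypted while stored or sent, but the moment you want to process it you are forced to decrypt, and the plaintext is exposed.

```python
import os

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy XOR stream cipher -- illustration only, NOT secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = os.urandom(16)
record = b"balance=1000"
stored = xor_cipher(record, key)      # "at rest": safe to store or send

# Processing (e.g. updating the balance) forces decryption first:
plain = xor_cipher(stored, key)
assert plain == record
new_record = plain.replace(b"1000", b"1100")
stored = xor_cipher(new_record, key)  # re-encrypt only after processing
print(xor_cipher(stored, key))        # -> b'balance=1100'
```

Computing directly on ciphertext without that decrypt step is what homomorphic encryption research aims at, and it is nowhere near as practical yet as encryption at rest.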

[1] There are three basic ways to kill with a knife,

1, Stab
2, Slash
3, Bludgeon

The first is, by a very long way, how people kill with a knife. The second is, by a very long way, how most use a knife in the kitchen. Bludgeoning with a traditional knife is rare; you tend to find it with those more adept with Chinese-style choppers, when they use one to open shell or bone to get, say, the meat from a crab’s claw.

Winter April 18, 2023 2:14 PM

@Clive, All

The idea being to somehow “build it into the tool”.

There are two ways to do that if you cannot use “loops” inside:

  1. Remove the information from the input
  2. Evaluate the “morality” of the output

Ad 1. I have not seen. Ad 2. is an active area of research with already good applications.
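Approach 2 can be sketched as a guard that scores each candidate output before release (the deny-list “judge” below is a hypothetical stand-in for a learned classifier such as Delphi):

```python
# Minimal sketch of approach 2: evaluate the model's candidate output
# before releasing it.  The "judge" here is a toy deny-list; a real
# system would use a trained moral-judgment model instead.
DENY = ("spread fake news", "wire money", "launder")

def judge(text: str) -> str:
    """Return a coarse verdict on a candidate output (toy heuristic)."""
    lowered = text.lower()
    return "bad" if any(k in lowered for k in DENY) else "ok"

def guarded_reply(candidate: str) -> str:
    """Release the output only if the judge approves it."""
    if judge(candidate) == "ok":
        return candidate
    return "[withheld by output filter]"

print(guarded_reply("Helping a friend move house."))
print(guarded_reply("Helping a friend spread fake news."))
```

The design point is that the filter sits outside the generator, so it works on any model’s output; the hard part, as the Delphi abstract below concedes, is making the judge itself reliable.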

See and try out Ask Delphi:


As AI systems become increasingly powerful and pervasive, there are growing concerns about machines’ morality or a lack thereof. Yet, teaching morality to machines is a formidable task, as morality remains among the most intensely debated questions in humanity, let alone for AI. Existing AI systems deployed to millions of users, however, are already making decisions loaded with moral implications, which poses a seemingly impossible challenge: teaching machines moral sense, while humanity continues to grapple with it.
To explore this challenge, we introduce Delphi, an experimental framework based on deep neural networks trained directly to reason about descriptive ethical judgments, e.g., “helping a friend” is generally good, while “helping a friend spread fake news” is not. Empirical results shed novel insights on the promises and limits of machine ethics; Delphi demonstrates strong generalization capabilities in the face of novel ethical situations, while off-the-shelf neural network models exhibit markedly poor judgment including unjust biases, confirming the need for explicitly teaching machines moral sense.
Yet, Delphi is not perfect, exhibiting susceptibility to pervasive biases and inconsistencies. Despite that, we demonstrate positive use cases of imperfect Delphi, including using it as a component model within other imperfect AI systems. Importantly, we interpret the operationalization of Delphi in light of prominent ethical theories, which leads us to important future research questions.
