The Future of Machine Learning and Cybersecurity

The Center for Security and Emerging Technology has a new report: “Machine Learning and Cybersecurity: Hype and Reality.” Here’s the bottom line:

The report offers four conclusions:

  • Machine learning can help defenders more accurately detect and triage potential attacks. However, in many cases these technologies are elaborations on long-standing methods—not fundamentally new approaches—that bring new attack surfaces of their own.
  • A wide range of specific tasks could be fully or partially automated with the use of machine learning, including some forms of vulnerability discovery, deception, and attack disruption. But many of the most transformative of these possibilities still require significant machine learning breakthroughs.
  • Overall, we anticipate that machine learning will provide incremental advances to cyber defenders, but it is unlikely to fundamentally transform the industry barring additional breakthroughs. Some of the most transformative impacts may come from making previously un- or under-utilized defensive strategies available to more organizations.
  • Although machine learning will be neither predominantly offense-biased nor defense-biased, it may subtly alter the threat landscape by making certain types of strategies more appealing to attackers or defenders.

Posted on June 21, 2021 at 6:31 AM • 17 Comments


Clive Robinson June 21, 2021 7:39 AM

@ Bruce, ALL,

The four conclusions are almost the same as those that have been given for computers since the 1960s.

Indicating that perhaps with “Machine Learning” there is nothing new above ordinary software.

Or to put it another way,

“Man thinks, Man code, code runs faster, code runs finer, but code does not do anything not thought up by man.”

However there should be another conclusion,

“Speed kills”

If man offloads skills to computers, then yes those deterministic skills can be done faster, more effectively, and more efficiently by a computer.

But what of non-deterministic skills?

It’s easy to see how learning is killed, innovation is killed, and thus progress is starved and killed.

We half-heartedly joke about “evolutionary cul-de-sacs” and sabre-toothed tigers. But in all such things there is a germ of truth.

Is over-reliance on machine learning going to be mankind’s evolutionary cul-de-sac?

We are already seeing issues with machine learning and the justice system, where the non-transparency of so-called neural networks and similar, which are little more than glorified statistics packages, is being used by authoritarian guard labour to evade responsibility for their desired actions by hiding behind the “The Computer Says” excuse.

We know the GIGO principle applies by the trash-trailer full with machine learning. By running “hidden tests”, criteria can be selected to give desired outcomes. That is, the criteria, though seemingly random, will give rise to a “training data set” that poisons or predisposes the ML system and causes it to adopt certain desired characteristics that are effectively automated “isms”.

We further know that the likes of Peter Thiel of Palantir are pushing machine learning systems into law enforcement. The hidden aim of which is “dependency thus profit”, the same as drug dealers: they sell you junk cheap, you become dependent, then they jack the price.

In the Palantir model the aim is to get systems in and get detectives phased out, so the money that was spent on detectives goes to Peter Thiel and Co, not on bringing detectives forward and new ones to follow them.

So the Machine Learning, which is incapable of learning and thinking and thus of responding in an intelligent way to changes in criminal activities, ceases to move with the criminal threat. But by the time that is realised the continuity of human detectives is broken, and many of the most important skills are lost for quite some time if not for good. Thus irreparable harm is done for short term gain. It might be a politician’s dream, but it will be society’s near-endless nightmare.

echo June 21, 2021 8:28 AM

I suspect it’s another one of those asking-the-wrong-questions things again. There are too many layers of complexity and too many flows of information. “Machine learning” just escalates this. As the somewhat cryptic slogan of a billionaire who owns a dodgy platform says, “the external is a reflection of the internal”. Well, not wholly but it can be. Data? Money? Perhaps even people? It’s all the same thing.

Take one large arrogant nation full of billionaires with more money than sense. Create a system where they think they are feudal Gods. Allow them to mouth off any way they like and move their money wherever they like. Allow them to bomb other places with impunity and dodge from place to place whenever they like, always staying one step ahead. Give them a shiny new toy called technology which they can make bigger and faster and more complicated, and which gives them more power and more loopholes and reinforces the value of their capital as the price of competing gets higher and higher. Where does it end?

There is something to be said for Ceefax and Minitel and time and resource constrained meetings in a world governed by how fast the paper mill could work. A world where real people met real people. Where the Amazon and Orient were places you read about in magazines and adventure novels.

Turn it off. Turn it all off.

Problem solved.

jones June 21, 2021 8:59 AM

One of the problems with all this I rarely see discussed is that we don’t really know what machine learning systems actually learn, only that they behave:

We won’t really know how this technology will change the threatscape, and by the time it does, we’ll be experiencing diminishing returns on machine learning technology….

Clive Robinson June 21, 2021 10:30 AM

@ ALL,

I’ve read through the lengthy paper twice now looking for things that might be new in terms of insight or technology.

Long answer short: If you are looking for something new or insightful, you are probably going to be disappointed.

If you look back on this blog you will find that a decade or so ago things that were being discussed amongst “The Usual Suspects” in posts at the ends of threads were more advanced.

Longer answer: There appears to be a fundamental misunderstanding about what the badly named “Machine Learning” (ML) systems really are.

Change “Machine Learning” (ML) to “Digital Signal Processing” (DSP), which is what it actually is, and you will start to understand why the information presented in the paper is not insightful or particularly technically interesting. That is, the domain of art that ML is in is fairly insular and appears to fail to learn from other domains of art such as Signal Processing, which has slashed new paths and trampled them in whilst the ML people are still drinking their morning joe.

The result of which is the paper makes some groan-worthy comments that would cause some to not just shake their head slowly but then put their hand over their eyes and mutter OMG despondently.

A simple uncontroversial example.

They bring up the French Presidential elections and what they portray as a honeypot, but the way they describe it, it was a “poisoning of the well”.

In essence real emails and similar were put in with so many fake emails that if anyone stole the cache it would in effect be useless to them. Well, as the paper’s authors indicate, it was not useless to one group of attackers: they simply published the lot…

But what they failed to mention was a rather awkward issue. The well might have been poisoned but those who poisoned it still had to drink from it… So how did they make the contents safe?

Well with an information well you “add a signal” and there are two basic ways to do this,

1, An in-band signal.
2, An out-of-band signal.

Which oddly I discussed the basic pros and cons of respectively just yesterday,

However such a signal is a marker, or in this case a “distinguisher”, to say which files in the cache are wanted and which ones are poison.

If that signal is “in-band”, which it probably was, it would be hidden in some way, most likely in each file, as that would be the safest way to do it, lest you swallow your own poison.
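To make the in-band idea concrete, here is a minimal sketch in Python. It is purely illustrative: nothing in the report or this thread says how the real cache was marked. The assumption is that each genuine file carries a short keyed tag (a truncated HMAC) appended to its content. The key, the tag length, and the names `mark_genuine` and `is_genuine` are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret known only to the defenders (an assumption for illustration).
SECRET_KEY = b"defender-only-secret"
TAG_LEN = 8  # bytes of truncated HMAC appended to each genuine file


def mark_genuine(body: bytes) -> bytes:
    """Append a truncated keyed tag: the in-band 'distinguisher'."""
    tag = hmac.new(SECRET_KEY, body, hashlib.sha256).digest()[:TAG_LEN]
    return body + tag


def is_genuine(blob: bytes) -> bool:
    """Check whether the trailing bytes are a valid tag for the rest of the file."""
    if len(blob) < TAG_LEN:
        return False
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).digest()[:TAG_LEN]
    return hmac.compare_digest(tag, expected)


real = mark_genuine(b"Meeting moved to Thursday.")
# A decoy with a dummy trailer of the same length, so both files look alike in size.
fake = b"Transfer the funds tonight." + b"\x00" * TAG_LEN

print(is_genuine(real), is_genuine(fake))
```

Without the key the trailer looks like noise, which is the property the defenders would be relying on. It says nothing about whether the files can be told apart by other means.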

Now taken as just a cache of files on their own that distinguisher is probably not visible to an attacker with no further knowledge…

But is it true that the cache now being available is still safe?

Well no, probably not. Because there are many other caches of information that can be used to test each file in that Political Cache and give them a weighting or probability of being true or false.

Thus you now start to have an indicator that can be used to find that in-band signal. It can be a laborious job by hand, but one that various DSP algorithms with adjustable matched filters excel at (so ML will as well).

Thus it’s entirely possible with a little effort to find the distinguisher and then it’s game over for the cover of those “fake files”…

This sort of de-anonymisation is not exactly new; Ross J. Anderson and others over at the UK Cambridge lab have talked about and researched it for the past decade.

As another example, the paper’s authors talk about “asymmetric advantage” and say that defenders have an advantage because the defenders have all the data about their internal systems and attackers do not.

Well, first off, the proposition that the defenders have all the data is a logical and practical impossibility. But if you want to get cutesy and all Laws of Physics and Mathematics about it, the process of measuring turns a continuous process into a decimated process and thereby destroys information. This was known by Nyquist, Hartley and Shannon before the second world war, and the knowledge forms the very foundation stone of information theory and all the mathematics that came after it, such as “channel capacity” and how high frequency components get reflected or down-converted into the low frequency band below the sampling frequency (look up a comb generator to see how the process works in reverse). Thus, to not pollute your wanted baseband data you have to sample at many times the maximum frequency you want and put significant filtering, thus information-destroying filtering, in place.
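The folding of high frequency components into the low band can be shown with a few lines of Python. This is a toy sketch with made-up values (a 10 Hz sampling rate and a 9 Hz tone), not anything taken from the paper: once sampled, the 9 Hz tone produces exactly the same numbers as an inverted 1 Hz tone, so the samples alone cannot tell you which was present.

```python
import math

FS = 10.0               # sampling rate in Hz (illustrative value)
F_HIGH = 9.0            # a component above the Nyquist limit FS / 2
F_ALIAS = FS - F_HIGH   # where it folds down to: 1 Hz


def sample(freq_hz: float, n_samples: int, fs: float = FS) -> list[float]:
    """Sample a unit-amplitude sine of the given frequency at rate fs."""
    return [math.sin(2 * math.pi * freq_hz * n / fs) for n in range(n_samples)]


high = sample(F_HIGH, 50)
low = sample(F_ALIAS, 50)

# After sampling, the 9 Hz tone is indistinguishable from an inverted 1 Hz tone:
# sin(2*pi*9*n/10) == -sin(2*pi*1*n/10) for every integer n.
max_diff = max(abs(h + l) for h, l in zip(high, low))
print(f"largest difference between the two sample sets: {max_diff:.2e}")
```

This is why an anti-aliasing filter has to sit in front of the sampler, and that filter is itself discarding information about whatever was above the cut-off.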

Then there is the issue to do with spectrums and the discrete points along them. Information exists as points in a multi-dimensional space. When you measure, you “average” in one or more dimensions and thus you get a scalar that has a large number of signals in it, many of which (noise) cannot be predicted and thus cannot be removed from the average without losing other information. An attacker does not need to know all about the internals; they only need to know about the “noise floor” at some point of the spectrum, in as little as two dimensions, not the whole spectrum in all directions as the defender would have to. The attacker tailors their attack so that whilst it is “well below the noise floor” of the spectrum as a whole, it is above the noise floor in a very small part of the spectrum (the noise floor goes down as you narrow the width of the spectrum you look at). The attacker can then modulate their signal with synthetic noise that is known to them but not the defender. I won’t put up the maths, but if you look up “Low Probability of Intercept” (LPI) or “Direct Sequence Spread Spectrum” (DSSS) systems you will find it gets taught quite extensively to undergraduates and has been since the 1960s to my knowledge (books and papers on the subject in my dead tree cave go back that far). But the same ideas were being used a quarter of a century ago to implement “Digital Rights Management” (DRM) by “Digital Watermarking” (DW-DRM).
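As a rough illustration of the direct-sequence idea, the Python sketch below spreads each data bit over a pseudo-noise code, buries it in Gaussian noise, and then recovers it by correlating against the code. It is a toy model under stated assumptions: the spreading factor, amplitude, noise level and seed are made-up values, and a real LPI system involves carriers, synchronisation and filtering that are left out entirely.

```python
import random

rng = random.Random(1960)  # fixed seed so the run is repeatable

CHIPS_PER_BIT = 4096   # spreading factor (illustrative)
AMPLITUDE = 0.2        # signal amplitude per chip, well below the noise
NOISE_SIGMA = 1.0      # standard deviation of the channel noise

# Pseudo-noise spreading code: known to the sender, not to the observer.
pn_code = [rng.choice((-1, 1)) for _ in range(CHIPS_PER_BIT)]
# A different code, standing in for an observer who has to guess.
wrong_code = [rng.choice((-1, 1)) for _ in range(CHIPS_PER_BIT)]

bits = [1, -1, 1, 1, -1]


def transmit(bit: int) -> list[float]:
    """Spread one bit over the PN code and bury it in Gaussian noise."""
    return [AMPLITUDE * bit * c + rng.gauss(0.0, NOISE_SIGMA) for c in pn_code]


def correlate(received: list[float], code: list[int]) -> float:
    """Despread: average the chip-by-chip product with a candidate code."""
    return sum(r * c for r, c in zip(received, code)) / len(code)


received = [transmit(b) for b in bits]
with_key = [correlate(r, pn_code) for r in received]
without_key = [correlate(r, wrong_code) for r in received]
recovered = [1 if v > 0 else -1 for v in with_key]

print("recovered bits:      ", recovered)
print("correlation, right code:", [round(v, 3) for v in with_key])
print("correlation, wrong code:", [round(v, 3) for v in without_key])
```

With these values each chip sits about 14 dB below the noise, yet the correlation against the right code comes out close to ±0.2 while the wrong code yields values near zero. The party who knows the code gets the averaging gain for free; the party who does not has to search for it.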

For a defender to spot LPI signals requires a level of resources that Google and Amazon together would very probably balk at.

I could go on but this post is probably too long as it is.

SpaceLifeForm June 21, 2021 1:35 PM

@ Clive

You forgot to wait 5 minutes and then force a refresh. The batcache still had the initial page that you initially saw.

JPA June 21, 2021 11:16 PM

Interesting article on issues with “machine learning” that came in a news feed this morning.

htt ps://

Clive Robinson June 22, 2021 6:23 AM

@ JPA,

…issues with “machine learning”…

Such linguistic issues should not be an issue, but clearly are.

Back in the 1980s I was “playing” not just with toy robots driven by 8-bit microcontrollers as research demonstrators but with bl@@dy great Puma industrial robots used in the Auto Industry, driven by MicroVAXes and other high-end 32-bit computers running Real Time Operating Systems (RTOS). Such “robots” would smash your skull to pieces and not even be affected, due to their basic lack of resources.

We would not recognise them as robots these days, just more flexible tool extensions to other tools like CNC machines.

Back then everything was very very very mechanistically determined. For instance it would pick a car door up off of a stack, not because it could sense where the pickup point was, but because it was given the exact point, exact orientation and exact route to take to reach it.

Re-calibration was done by having a sensor in a safe place the arm would traverse through, and the external sensor would provide correction data to the control program.

That is what humans do for gross motor control, but human fine motor control is done by sensors in the fingers that can be a thousand or so times better than our eyes are at a meter distance.

Engineers know these things almost implicitly and are rarely confused. In part because the language used is very specific within the domain and engineers know from context when domain specific language is being used and when more general 20,000ft view sloppy language is being used to convey generalised meaning.

Having worked with many “programmers”, I find they tend to just be sloppy with language unless it is very specifically source code on the screen. Everything else, such as flow diagrams and entity descriptions, is all sloppy and arm-wavery, unless they are domain experts in other fields such as researchers, scientists, mathematicians and engineers who use software as a tool like they would a screwdriver, drill, pipette or equation.

Unfortunately there are new researchers that in effect have “soft systems” backgrounds or are even further away from what are traditionally called the “hard sciences”. They kind of fill the gap that architects and fashion designers do in their respective domains. That is, they take a conceptual approach where fluidity of language is high and constantly changing. You see this in some “web designers” and similar dealing with the look and feel concepts of the HCI.

Yes, it has its place, but there is too much “bleed through” and even researchers are now succumbing to conceptualising, and soft abstractions are seen by some as almost the “be all and end all”…

This leads to “soft explanations” that give rise to incorrect notions.

Engineers, who work with few soft abstractions, tend to have an implicit horror of them. Unsurprisingly, they know what 10,000 Newtons of force[1] actually means in very real terms. As many others do not, the result is that engineers can get written off as the equivalent of “Mansplainers” by those who fail to realise the very real danger they are failing to see.

[1] For those unfamiliar with Newtons of force, there is a very cute way to think about it. Sit there and imagine you are Sir Isaac with your hand outstretched, palm up, with a quarter-pound English eating apple resting on it: that is roughly one Newton of force. Now imagine trying to hold ten thousand apples all in your palm at the same time… What do you imagine would happen to your wrist, elbow etc if you tried? An engineer could Mansplain that as crudely[2] as you like, but could an architect or fashion designer do it conceptually?

[2] Think Michael Caine and his oft-quoted line about doors… The BMC Mini that featured prominently in that film would (if memory serves correctly) be, if you could pick it up and get it on one hand, about 5,000 Newtons (conceptual enough 😉)

Clive Robinson June 22, 2021 6:42 AM

@ Winter,

Or a “ton up” for those not using those strange French revolutionary measures 😉

OHGREAT June 22, 2021 7:01 AM

If it takes hold, and it most likely will, you will create the next great education and employment shift. Those the schools are teaching to be Programmers, Admin, Cyber or whatever else you want to call the IT space will lose their jobs to AI. It might take years, but the automation will happen, where the need for “experts” will be smaller and there will be more reliance on AI to handle what is done by humans now. Schools need to see this and adapt and start planning. I firmly believe we need to cull the number of programs and schools to regional centers of teaching to control the number of IT people. Look to what the law schools have done to themselves by putting out as many grads with law degrees as humanly possible. How many graduate with law degrees each year? How many law schools are there? Answer: as of 2020, 34,420 students graduated from law school in the United States, and there are 205 ABA-approved law schools and about 32 non-ABA-approved law schools. Read the boards and see how many are in debt and can’t find jobs.

How many IT, programming, security, IT certs, cloud certs, and in general degrees and “certifications” are done each year vs jobs? It’s a money-making pyramid scheme by the schools, which machines and AI will slap down in the near time frame.

Sure, you will always need humans to do various admin-type work, but match up through needs analysis what will be done by AI/ML and we’re set for the next great job market conundrum.

Security Sam June 22, 2021 9:53 AM

As machine learning morphs into main from peripheral
And the human element becomes mainly ephemeral
Cybersecurity will transform into a realm of ethereal
Where they will not be any need for tangible material.

Clive Robinson June 24, 2021 2:30 AM

@ Bruce, and all with a little “higher math”,

There is a new book coming out about Deep Neural Networks (DNN) and how we might actually get some real theory for them. Not, as is mainly the case currently, build a soft/hardware implementation and twiddle the weightings of the individual artificial neuron functions to get an approximate match.

You can download a free copy of the book, and it looks interesting so far (it’s ~500 pages and even I don’t read at flash bulb speed).

The authors compare AI/ML to the early days of Steam Engines and the artificers / artisans who fiddled with them quite productively until what we now call Scientists and Engineers put statistical mechanics on a firm footing (funny, I’ve used the same analogy for years). Thus we could start to describe theoretically what those doing the practice had found, and more importantly improve not just our knowledge but our command of the systems.

It is no secret that currently ML is seen as a black-box with “Ju-Ju magic” or worse in it, and overly susceptible to crafted “training data”.

For ML to have a significant future it needs real theoretical underpinnings.

Read a little more and get the book via,

And enjoy the read.

Harry Wilson July 2, 2021 2:03 AM

Woooow! This article is just super and I just admire you, your writing skills are just on the top level of the universe! And the way you use teel structure… Ahh, charming! Would love to read more writings from you!

A Nonny Bunny July 3, 2021 3:07 PM

“Man thinks, Man code, code runs faster, code runs finer, but code does not do anything not thought up by man.”

Actually, pretty much all code does things not thought up by man, because man didn’t really think through what man coded. Bugs aplenty.

You can also generate code using genetic algorithms and a true random number generator and get code not thought up by man at all.

Oh well. Man is nothing but a cloud of interacting elementary particles that thinks too much of itself, anyway.

Clive Robinson July 3, 2021 6:41 PM

@ A Nonny Bunny,

Actually, pretty much all code does things not thought up by man, because man didn’t really think through what man coded. Bugs aplenty.

That is usually down to the fact that the “Man thinks, Man code” is not done sufficiently iteratively.

Thus the “code runs faster, code runs finer” iterative process automatically becomes a failed process.

Broadly you could argue one or more of these are to blame,

1, Man is deficient
2, Process is deficient
3, Process is constrained / curtailed.

I’d argue all three are in part to blame in all “bugs” and most definitely in all failed projects.

But you raise,

You can also generate code using genetic algorithms and a true random number generator and get code not thought up by man at all.

I think either you are just restating the problem differently, trying to make it look like it’s a different process (which it is fairly easy to prove it is not), or you’ve left something out of your statement…

That which is missing being something Alan Turing was fairly insistent went into the earliest of computers, as part of the process. It’s something I use fairly frequently and it comes under the more general term of “Extended Monte Carlo Methods” (EMCM).

The problem with any EMCM is that those observing such a process almost always fall into one of Arthur C. Clarke’s eponymous cognitive-failing bear traps[1].

Whilst Clarke did not call them “cognitive failings”, they appeared in an essay titled “Hazards of Prophecy: The Failure of Imagination”, which is essentially saying the same.

The usual bear trap into which you have wandered is,

“Any sufficiently advanced technology is indistinguishable from magic.”

Which has become more pertinent in the past few years as,

“Any technology, no matter how primitive, is magic to those who don’t understand it.”

And from that, why we have the version for ML of,

“Any sufficiently advanced act of benevolence is indistinguishable from malevolence”

Which is the “utopian fear” of “The machines taking over” and “making us their pets” or similar. Great for “selling copy” to the “Woo Woo Crystal Healing Crowd” but seriously? Long answer short: NO. But it also gets the real “Chicken Little” veneer with the notion of “Sky Net” from the “Terminator” films… So is an almost guaranteed sell…

The notion of taking “random” and imbuing it with mystical power is a failing we more normally see in gamblers with their belief in “lucky streaks”. Or those who have “It’s my turn” syndrome because somebody came up with the phrase “You have to be in it to win it” to gull them out of money. When simple mathematics tells you that you will probably not win it in your lifetime.

Even the simple study of “Brownian motion” in high school should tell people a lot about “random” and how “perceived effects” are actually caused by less obvious “underlying causes” that can work either way[2] apparently unpredictably.

Most humans have a failing which is an artifact of a survival trait. We see meaning in cloud shapes, ink blots, shadows, and shading, that is really not there. It is the result of a trade-off made long ago between expending energy responding rapidly or being taken for lunch. Thus seeing something and scooting up a tree falsely, because it might be a predator stalking, was better than being certain but then not able to make it to the tree and becoming a “dog’s breakfast”.

The problem is it can become an irrational way of living, often called “follow your hunches”. People glorify the very low probability successes and ignore the high probability failures. We even have reality TV shows like “Dragons Den” and the “Apprentice” pushing such ideas to those who cannot reason that the show’s “hook” is in fact that it’s a “freak show” for those that get their rocks off on others’ failings.

You talk of “generate code using genetic algorithms” as some kind of “magic”, not a creation of “man’s thinking”, or you think, although you don’t say it, that “random” is magic when it’s not.

A thought for you, there are “lotto games” around with the worked out odds of winning being 1 in ~15million and around ~60million individual tickets purchased each week. So what are the odds of the top prize not being won in any given week? How about every four weeks or longer?
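The question can be worked through with a couple of lines of Python. The figures are the ones quoted above; the big assumption, which real lotteries do not satisfy exactly because people favour certain numbers, is that every ticket is an independent, uniformly random pick.

```python
import math

ODDS = 15_000_000      # 1 in ~15 million chance per ticket (figure from the comment)
TICKETS = 60_000_000   # ~60 million tickets sold per week (figure from the comment)

# Assumption: every ticket is an independent, uniformly random pick.
p_no_winner_week = (1 - 1 / ODDS) ** TICKETS
p_no_winner_4_weeks = p_no_winner_week ** 4

# Poisson approximation: the expected number of winners per week is TICKETS / ODDS = 4.
approx_week = math.exp(-TICKETS / ODDS)

print(f"no winner in one week:       {p_no_winner_week:.4f}")
print(f"Poisson approximation:       {approx_week:.4f}")
print(f"no winner four weeks running: {p_no_winner_4_weeks:.2e}")
```

Under that assumption the top prize goes unclaimed in under 2% of weeks, and a four-week run without a winner is roughly a one in nine million event. A rollover that lasts a month would therefore be strong evidence that the picks are not uniformly random.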

Random algorithms frequently trade the potential of being faster to a solution against the probability of never finding a solution. Whereas a deterministic search may on average be slower, it does get to finish in a known time period. Random algorithms can be even faster when you only need a solution within some percentage of the optimal solution.
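A small Python sketch of that trade-off, using a deliberately simple problem (finding one value in a list). The function names and the probe budget are made up for illustration. The deterministic scan has a hard upper bound on its work; the random probe can get lucky early, but within any fixed budget it may come back empty-handed.

```python
import random
from typing import Optional, Sequence


def linear_search(items: Sequence[int], target: int) -> tuple[Optional[int], int]:
    """Deterministic scan: at most len(items) probes, and it always finishes."""
    for probes, value in enumerate(items, start=1):
        if value == target:
            return probes - 1, probes
    return None, len(items)


def random_search(
    items: Sequence[int], target: int, budget: int, rng: random.Random
) -> tuple[Optional[int], int]:
    """Random probing with replacement: may hit early, may exhaust the budget."""
    for probes in range(1, budget + 1):
        i = rng.randrange(len(items))
        if items[i] == target:
            return i, probes
    return None, budget


items = list(range(1000))
target = 999  # worst case for the linear scan

print("deterministic:", linear_search(items, target))
print("random, budget 100:", random_search(items, target, 100, random.Random(0)))
```

Each random probe has a 1 in 1000 chance of success here, so with a budget of 100 probes it fails about nine times in ten, while the linear scan is guaranteed an answer in at most 1000 steps.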

Yes you can use “random” for other things but there is no intelligence or magic behind random, and its use from generation onwards is still deterministic and a product of “man’s thinking”, good or bad.
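The point can be seen in a toy mutate-and-select loop, a much simplified relative of a genetic algorithm, sketched in Python below. Everything in it is an illustrative assumption: the target string, the alphabet, the fitness function and the seed. The target and the fitness measure are chosen by a person, and given the same seed the “random” search retraces exactly the same steps.

```python
import random

TARGET = "print(42)"  # the human-chosen goal; fitness is defined against it
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789()"


def fitness(candidate: str) -> int:
    """Number of positions that match the target (a measure written by a person)."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def evolve(seed: int, generations: int = 5000) -> tuple[str, int]:
    """Mutate one character at a time, keeping changes that do not reduce fitness."""
    rng = random.Random(seed)
    current = "".join(rng.choice(ALPHABET) for _ in TARGET)
    for gen in range(generations):
        if current == TARGET:
            return current, gen
        pos = rng.randrange(len(TARGET))
        mutant = current[:pos] + rng.choice(ALPHABET) + current[pos + 1:]
        if fitness(mutant) >= fitness(current):
            current = mutant
    return current, generations


run_a = evolve(seed=7)
run_b = evolve(seed=7)
print(run_a, run_b, run_a == run_b)
```

Swapping the seeded generator for a hardware noise source would make the path unrepeatable, but the selection step that gives the search its direction would still be the fitness function somebody wrote.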

[1] Also called “Clarke’s Three Laws”. Whilst at first they appear general, the first and third are about cognitive failings in “outsider observers” whilst the second is more about the cognitive failings of “insider observers”.

Hence Isaac Asimov’s Corollary to Clarke’s First Law effectively invokes Clarke’s third law and humanity’s fondness for ascribing power to that which it does not comprehend. A fondness that unfortunately goes both ways and shows my usual points that,

1.1 Technology is agnostic to use.
1.2 It is the Directing mind that decides the use.
1.3 It is the Observing mind that decides if the use is good or bad.

Clarke’s second law is an imprecise way of saying another of my oft-repeated points about judging technology,

1.4 If the laws of physics allow it.
1.5 Then someone will eventually try to build it.

The fact that over the past century and a half people have made what are different facet arguments to essentially the same gem of an idea should give you an indication of just how fundamental to modern society the rationalism of science is, compared to the mysticism engendered for societal control in the past.

[2] I have had the misfortune to describe the issues around “pivot points” and “finite precision”, and why “true” inverse matrices are almost always “staged examples”, to students and other neophytes thinking to make use of them to solve real world problems with computers. They do not realise the reality of results being “not just off a little” but “chasing Voyager”, unless a lot of corrective techniques are used, which tend to make the “cure” often worse than the “disease”, especially in constrained environments.
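A standard textbook illustration of the pivot and finite precision problem, sketched in Python with a hypothetical helper `solve_2x2`. The system below has the solution x1 ≈ 1, x2 ≈ 1. Eliminating with the tiny 1e-20 entry as the pivot wipes out x1 entirely in double precision; swapping the rows first (partial pivoting) gives the right answer.

```python
def solve_2x2(a, b, pivot):
    """Gaussian elimination on a 2x2 system, with optional partial pivoting."""
    (a11, a12), (a21, a22) = a
    b1, b2 = b
    if pivot and abs(a21) > abs(a11):
        # Partial pivoting: swap rows so the larger entry is on the diagonal.
        a11, a12, a21, a22 = a21, a22, a11, a12
        b1, b2 = b2, b1
    m = a21 / a11            # elimination multiplier
    a22 -= m * a12
    b2 -= m * b1
    x2 = b2 / a22            # back substitution
    x1 = (b1 - a12 * x2) / a11
    return x1, x2


A = [[1e-20, 1.0], [1.0, 1.0]]
B = [1.0, 2.0]

naive = solve_2x2(A, B, pivot=False)
pivoted = solve_2x2(A, B, pivot=True)
print("without pivoting:", naive)
print("with pivoting:   ", pivoted)
```

Without pivoting the multiplier is 1e20, and subtracting 1e20 from 1.0 or 2.0 rounds away everything that distinguished the two equations. That is one small instance of a result that is not just off a little.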

Tom Foale August 6, 2021 3:14 AM

There’s a major difference between machine learning, which is only as good as the developer of the algorithm, and deep learning, which is only as good as its design and its training. I’ve only come across one technology in this whole space that really delivers, and that required developing a new deep learning system, essentially a new ‘brain’, specifically for detecting malware and, more importantly, trained on billions of samples and documents. It stopped everything we threw at it (zero days, the latest ransomware, fileless, scripts, bad files etc) without stopping me installing good software. I’m not going to promote it on here, but look for it. We’re running it.

