Research on AI in Adversarial Settings

New research: “Achilles Heels for AGI/ASI via Decision Theoretic Adversaries”:

As progress in AI continues to advance, it is important to know how advanced systems will make choices and in what ways they may fail. Machines can already outsmart humans in some domains, and understanding how to safely build ones which may have capabilities at or above the human level is of particular concern. One might suspect that artificially generally intelligent (AGI) and artificially superintelligent (ASI) will be systems that humans cannot reliably outsmart. As a challenge to this assumption, this paper presents the Achilles Heel hypothesis which states that even a potentially superintelligent system may nonetheless have stable decision-theoretic delusions which cause them to make irrational decisions in adversarial settings. In a survey of key dilemmas and paradoxes from the decision theory literature, a number of these potential Achilles Heels are discussed in context of this hypothesis. Several novel contributions are made toward understanding the ways in which these weaknesses might be implanted into a system.

Posted on April 6, 2023 at 6:59 AM • 34 Comments


Clive Robinson April 6, 2023 12:03 PM


“One might suspect that artificially generally intelligent (AGI) and artificially superintelligent (ASI) will be systems that humans cannot reliably outsmart. “

It’s about AI that’s not going to “drop off its perch” in the near future, as all parrots, stochastic or not, eventually do.

However the presumption of “outsmart” is something that is perhaps unwarranted.

It’s been in the news that a computer system that had beaten the world’s leading “Go Players” got soundly thrashed by a rank amateur.

The reason, it turns out, is that the amateur played a truly appalling game, one that even another human new to the game would have taken advantage of and won. However, as the rank amateur’s game, or anything like it, had not been in the training data, the computer system had no rules to follow and thus floundered or wallowed like a warthog in quicksand.

It turns out that these AI systems are only smarter at following the rules than humans are. If however a human makes even the dumbest of dumb moves outside of the rules then the current AI systems have no rules to follow thus can not make even simple intuitive moves.

So there is actually little to worry about currently, except for those that want to use AI Systems as an arm’s length avoidance of responsibility via the “Computer Says” excuse.

But… As the noise about Stochastic Parrot LLM and similar is starting to show signs of failing, the question that inevitably arises is,

“What next?”

Well one thing that is starting to “bubble up” again is “hive / Collective” minds via “Brain to Brain Interfacing”(BBI) and what form will it have, and what the effect “collective minds” will have,

However, in between I suspect we will first see an intermediate form of “hype” to keep money flowing into AI. So I expect to see it built around the notion of somehow combining stochastic parrots in a way to find “new rules” so they can actually outsmart even “dumb humans”…

JonKnowsNothing April 6, 2023 1:05 PM

@Clive, All

re: Truly Appalling game strategy

This is a good strategy in many games, where you are pretty certain that you will be over matched (even with ELO) and once you determine you are not able to win 1v1.

Computer chess (example) is very rigid. It plays a percentage game at every stage. The analysis is always Y or N. There is no invention in the system. Yes, they can hard code stuff like (If player does (X and Z) then Counter (A) ). In the absence of hard coded one-offs, edge cases, corner cases, the system flops.

So, players who train on computer systems, mimicking computer moves are vulnerable to Truly Appalling moves.

There are card collection games where players collect special cards that have special abilities and build custom planned “decks” for competitions (e-sports). They are very knowledgeable about a specific deck and what cards are in it. Middle-Lower ranked players will build Ditto-Decks but when faced with a player using the Truly Appalling Strategy their High Point Decks can flop. You may not win but the other player doesn’t win by mega score.

iirc (badly), in one of the Star Trek spin-offs, Data had to play a champion game player who was the best at that game. The first time Data lost and could not understand how his superior calculation ability failed to achieve a win. In a subsequent game, Data discovered that Playing Not To Win was the key to blocking the other player.

When using the Truly Appalling Strategy, you still have to be experienced enough to recognize what the other player is doing and counter with the best No-Move you have. Some games use a PRNG and some use Dice Rolls for outcomes.

SpaceLifeForm April 6, 2023 6:23 PM

You probably should not train the AI.

You may be the player in the game.

Interesting thread:


Clive Robinson April 6, 2023 8:11 PM

@ SpaceLifeForm,

The problem with Mastodon is it requires certain things that no sane person wanting privacy should tolerate…

So is there a “nitter” equivalent link you can use?

SpaceLifeForm April 6, 2023 10:42 PM

@ Clive

Good point.

I’m sure the story will be written up elsewhere, and when I see it, I will provide a link.

I will research further. There is zero reason to require Javascript in read-only mode.

SpaceLifeForm April 6, 2023 11:44 PM

@ Clive

Interesting behaviour differences.

MicahFlee is on the same instance as me.

If I pick someone else’s post from a different instance (even if I follow them), and open in a new tab, the page appears as I expect. It knows you are not logged in on that instance (server).

At that point, I am in read-only mode. It says I can login, but I can not because I do not have an account on that instance.

So, why would it need Javascript?

I think there is a way to do a non-Javascript read-only mode like nitter.


Anyway, the Cliff Notes of what Micahflee described was using ChatGPT-4 to create some code to create a Tor server with Python with a self-signed certificate.

There were bugs that Micahflee knew how to fix.

Micahflee was training the AI, which I recommend against.

Clive Robinson April 7, 2023 5:07 AM

@ SpaceLifeForm, ALL,

Re: ChatGPT creating Tor Server.

“[T]he Cliff Notes of what Micahflee described was using ChatGPT-4 to create some code to create a Tor server with Python with a self-signed certificate.”

The hard part would be getting the specification right, to feed into ChatGPT.

Look at it this way, in my dead tree cave I’ve the various multi-volume “Stevens” books which have C code for Unix platforms to build just about any kind of base level server.

Likewise I have other books that have the required Crypto-Code in C that will go through just about every C tool chain without issue.

I’ve also books going back to the 1970’s on programming and debugging C at all levels, I’ve even a couple of books on GCC with regards to using the C tool chain in Unix environments.

Similarly I’ve books and papers describing the various parts of Tor and similar as “rapid prototypes” which Python and its many libraries support reasonably well. And yes I’ve the books required to go from nix-expert in Python.

So I could turn my “dead tree cave” into a “Searle’s Chinese Room” for programming.

Basically turning ChatGPT into a “Code-Cutter” style programmer should not actually be either unexpected or difficult.

I suspect those “outsource abroad” Corps that have undercut Western programmers to less than the bone, are seriously looking at using ChatGPT to do exactly that.

Begun back in the late 70’s and marketed in the early 80’s there was a programming system for the Apple ][ called “The Last One” that described itself as “A Fourth Generation Language”(4GL). It allowed you to generate hundreds of flawless lines of BASIC to build “productivity tools” etc. It was fun to play with and many computer journalists of the time said it would kill programming as an occupation by making it easy for anyone to do…

Well programmers are still here, but you can read more from over 13 years ago on the fate of TLO at,

I would recommend having a read through it.

I’ve been accused by certain people of not knowing what I’m talking about when I differentiate between those programmers that follow “engineering practices” and those that “code-cut”.

Well ChatGPT is a Searle’s “Chinese Room” “Code cutter” which will make a great “productivity tool” for those that know how to “do engineering methodology” properly and turn them not into 10X programmers but 100X programmers.

Will it be capable of killing off programming as an occupation? Well, yes and no…

Look back at the Victorian period, where “Artisan craftsmen with their Guild patterns” gave way to the use of what would become “Science”, which created “Engineers” and an “industrial revolution” that has delivered in a little under a couple of centuries our modern world.

So yes the artisans died out in part because they did not have the patterns needed or the skills to develop them as needed. But also the world that gave them their living changed, and did not need the product of their secret Guild Patterns etc any longer.

But those who embraced science and mathematics became “engineers” and have turned the fantasies of “artists” who became “architects” into the reality we see all around us.

So yes I can see what is going to happen to “code cutters”. Call ChatGPT, if you like, the penultimate “code reuse”.

But remember the important word is “reuse”: ChatGPT is no more an innovator or inventor than a CNC machine or 3D printer. It will take the drudge out of producing code, thus freeing up capable minds to create. But just like using a 3D printer, you have to respect certain “rules of engineering” otherwise things will go astray. Few realise it but the journey from fantasy imagination dreams, through invention, then innovation, to a practical product is “an engineering process”. If you are a bad engineer then you will be a bad innovator, bad inventor, and bad dreamer and in effect a failure. We even have “Reality TV” programs like “Dragons’ Den” where you can see dreamers, inventors and innovators bid to get funding to take their ideas forward. Usually you can tell those who stand a chance not just of getting funding but taking it through to a successful product and even of creating a new market by their thinking processes that come through in their behaviour.

Even “weed heads” can have dreams that will create new products and markets, I’ve worked with more than a couple in my time. The thing is the ones I chose to work with were at heart “engineers” and all they needed was a little help learning how to turn it from innate behaviour to practiced skill.

ChatGPT will never dream, nor will it invent or innovate, because it lacks the skills to engineer.

We see what looks like “fantastic art” come out of other supposed AI systems, but again they are just “Chinese Rooms”. After looking at one or two images you quickly see that they are “bubble gum”, lacking any originality beyond what they’ve been told to do, and even then they get so much wrong, and that’s where the danger lies.

Ask for a “bikini model holding a spanner” and see what you get… Yes she will look stunning, but count the fingers and toes, you might find extras or too few because of the way the algorithms chop up other images to fulfil your request.

Now imagine what would happen if you asked it to design you a home to live in?

The result would be beautiful to look at, but think about the reality of “pipes, plumbing, wiring” used to bring in “services” and take out “waste” and would you really want the potable water tap in the kitchen cross coupled with the black water from the bathroom? After all the pipes are just pipes to an algorithm that can not learn the difference to the level required.

Door April 7, 2023 5:24 AM

You recall well, actually, that was The Next Generation S2E21, “Peak Performance”. Data first lost to his opponent, and then, in a rematch at the end of the episode, frustrated him into resigning by playing for a draw, assuming that he would be playing for a win and would think Data would be doing the same. The general lesson being that if you know what your opponent expects and is trying to do, you can exploit that, especially if he might be overly aggressive. I don’t know if any computer chess engines are vulnerable to this kind of strategy, but some human grandmasters have played that kind of game successfully. Tigran Petrosian comes to mind.

Clive Robinson April 7, 2023 7:09 AM

@ Door, JonKnowsNothing,

“You recall well, actually, that was The Next Generation”

I remember that episode for another reason. The writers and producers were obviously looking for a new “alien race” to be not “evil” but corporately capricious (possibly as a stab at the “bean counters” and “Execs” much like Futurama did with the Monkey and his hat).

The “opponent” in the episode was actually quite loathsome in a number of ways, and I can not help but feel he was a part prototype of what would eventually become the “Ferengi”.

Which in many ways would be a cross between “The Swiss Gnome Bankers” and “Davos frequenter” types cross bred with US Style “Corporate C Suite loungers” and “Tell the tale con artists”…

The sub plot being such types do get as Shakespeare pointed out “hoist with his own petard”.

Looking back it must be nearly a third of a century since I watched the episode… And yes I can also remember who I was “cuddled up with” on the sofa whilst watching it. And more importantly the less than successful attempt at a new snack popcorn of “chilli and Mature Cheddar Cheese” flavour I’d whipped up… The lime cheese cake using a gingernut biscuit base I tried the next week worked way way better and for some reason I don’t remember that episode other than it was the last of the season, not even sure we watched it all the way through.

Winter April 7, 2023 9:30 AM

As a challenge to this assumption, this paper presents the Achilles Heel hypothesis which states that even a potentially superintelligent system may nonetheless have stable decision-theoretic delusions which cause them to make irrational decisions in adversarial settings.

Think optical illusions, or any other sensory illusion. But I think that the authors refer more generally to “stupidity”, a neuronal universal.

Stupidity is the application of efficient heuristics, shortcuts, when they should not be applied.

As every “thinking” being has computational limitations, every such intelligence can, and will, improve performance by implementing efficient heuristics. That is like saying, it went well 100 times, so I can skip the security checks the 101st time.

You obviously all have experience in similar situations that taught you the limitations of such a strategy.

As AI is already extremely resource hungry, applied AI will almost certainly have to cut corners. And, therefore, will be prone to stupidity.

lauren April 7, 2023 1:50 PM

hmm. y’all wouldn’t happen to have any promising galaxy brain ideas about how to permanently upgrade the security of all computer systems (or ideally, all computational systems, including human brains and biology) using ai would you? or if not using ai, then something that could make a difference across a wide variety of systems despite not being based on ai such that it can withstand 2024’s ai attackers. the internet is catastrophically vulnerable right now and if we can’t get qualitative improvements in security fast we’re going to meet a whole new era of threat density and attack discovery rate and malware mutation rate… already starting with social engineering…

ResearcherZero April 8, 2023 12:22 AM

With repeated requests, though, it dutifully generated the exact same code it had just said was too irresponsible to build.

And a couple of interesting examples…

“So the moral of the story is, don’t use GPT to give you advice on anything security related.”

Clive Robinson April 8, 2023 1:46 AM

@ Winter, ALL,

Re : Do AI’s dream of Electric bananas when they see clouds?

“Think optical illusions, or any other sensory illusion.”

We know that around eight years ago Google’s Neural Network system (AI) that was trained in visual recognition, when put in a form of feedback loop effectively turned “noise” into “scream” worthy images,

And from my point of view that was what you would expect, from work that is a century or more old.

That is hit a matched filter with random noise and it will amplify noise that loosely matches the filter characteristics. Not so much the striking of a bell of an impulse response, but more the blowing across the top of a bottle response.

Feed that output back and as with all systems containing an amplifier and filter, if the overall gain is greater than 1 at some phase shift of 360 degrees or similar around the loop you will get a resonance build-up we generally call oscillation.

It’s one of the effects I alluded to earlier.

But what happens when there are effectively two or more filters in the feedback loop with very different characteristics? Well we know the answer to this from the very early days of radio when the amplifying device of thermionic valve/tube was not just low gain but very expensive. The “Reflex Receiver”[1] was one result,

However you get a very similar effect if you have very lossy filters and you use the feedback as a way to get gain (which is what I suspect Google was trying to do).

But what happens if the feedback is too high? Well we’ve known that for a long time: basically you get chaotic behaviour that becomes significant instability, until at some point the system breaks into oscillation, which initially forms a sinewave which will then increase in amplitude until it eventually becomes more like a squarewave as the level reaches the supply rails. In the process the nonlinear behaviour creates lots of “new signals” we call harmonics and distortion. However oscillation in feedback loops is still useful.

Whilst the early regenerative receiver[1] worked on the positive feedback principle with a valve/tube only capable of a voltage gain below ten, with small positive feedback added it achieved gains up in the tens of thousands. It was however limited at a point well before things become chaotically unstable due to dynamic changes in the system.
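The jump from a gain below ten to tens of thousands can be sketched with the textbook positive-feedback relation, overall gain = A / (1 - A*beta). The stage gain of 8 and the loop-gain values below are illustrative assumptions, not figures from any particular receiver, and the function name is hypothetical.

```python
def closed_loop_gain(stage_gain: float, feedback_fraction: float) -> float:
    """Overall gain of an amplifier with positive feedback: A / (1 - A*beta)."""
    loop_gain = stage_gain * feedback_fraction
    if loop_gain >= 1.0:
        # At or above unity loop gain the stage no longer amplifies, it oscillates.
        raise ValueError("loop gain >= 1: the stage oscillates")
    return stage_gain / (1.0 - loop_gain)


STAGE_GAIN = 8.0  # assumed value for "a voltage gain below ten"

# Push the loop gain closer and closer to 1 and watch the overall gain climb.
for loop_gain in (0.0, 0.9, 0.99, 0.999, 0.9995):
    beta = loop_gain / STAGE_GAIN
    gain = closed_loop_gain(STAGE_GAIN, beta)
    print(f"loop gain {loop_gain:<6} -> overall gain {gain:,.0f}")
```

The closer the loop gain sits to 1, the larger the overall gain and the smaller the drift needed to tip the stage into oscillation, which is one way to read the limit described above.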

However such regeneration takes time to build up to the point of chaotic behaviour, instability and eventual oscillation, so the question is what happens if you “quench it” before or even after it gets to the point of oscillation?

Well it’s known that providing the quench signal is above twice the Nyquist frequency of the desired signal bandwidth then the effective gain goes up into the million range. Called a super-regenerative[1] system, it gives extraordinary gain at very low power and minimal complexity. Due to the cost of LLM and similar AI systems the idea of quenched positive feedback will be attractive to those looking to get more bang for the buck.

However one result will be increased “new signals” from the chaotic behaviour and distortion, as well as the harmonics and sub-harmonics of all the signals creating intermodulation signals. Often called “spurious” signals, whilst not strictly “noise” the result will with complex filtering produce “ghost signals” and the like.

But why would AI researchers especially those developing deep neural networks want chaos and distortion giving ghost signals in their systems?

Well it turns out that AI has a quite common problem due to the limitation of the system. Due to its lack of “real world” ability, when you train an AI system it not surprisingly sees the training set as being the entirety of its world and unless adaptive can not change that. Thus it then sees anything new as having to be something in its training set even if it’s not. So effectively the AI can not work outside of the training set, hence the issue of a cat being seen as a dog, and a chair as a table.
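One way to see why a cat ends up as a dog is a toy closed-set classifier. The sketch below assumes a model whose output layer is a softmax over the classes it was trained on; the labels and raw scores are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that always sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


LABELS = ["dog", "table"]  # the only classes this toy model was ever shown


def classify(scores):
    """Return the most probable known label; there is no 'none of these' option."""
    probs = softmax(scores)
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return LABELS[best], probs[best]


# Hypothetical raw scores for an input that is neither a dog nor a table (say, a cat).
label, confidence = classify([1.2, 0.3])
print(label, round(confidence, 2))
```

Because the probabilities are forced to sum to 1 over the known labels, the toy model has to answer with one of them, however unfamiliar the input is.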

There are two basic starting solutions to this problem,

1, Increase the size of the training set.
2, Introduce a limited degree of chaos and distortion into the training.

The first runs into a “never ending” problem, in that the data set will never be complete, and worse as the set increases each doubling produces diminishing returns.

The second covers a range of solutions called “regularization methods”. One example of which is, “dropout” where a stochastic source is used to effectively randomly ignore some data… But it has its limitations as well…
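As a rough illustration of the “dropout” idea, here is a minimal sketch of the commonly described “inverted dropout” form, in which a random subset of a layer's values is zeroed during training and the survivors are rescaled. The function name, drop probability, and sample values are assumptions for illustration only.

```python
import random


def dropout(values, drop_prob, rng, training=True):
    """Inverted dropout: zero each value with probability drop_prob, rescale the rest."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        # At inference time nothing is dropped.
        return list(values)
    keep_prob = 1.0 - drop_prob
    # Dividing survivors by keep_prob keeps the expected value unchanged.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)  # seeded stochastic source, so runs are repeatable
activations = [0.5, 1.0, 1.5, 2.0]
print(dropout(activations, 0.5, rng))                  # some entries zeroed, others doubled
print(dropout(activations, 0.5, rng, training=False))  # unchanged
```

The random mask changes on every training pass, so the network cannot lean on any single value always being present.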

Which is why hybrid solutions tend to be used, the idea being that each method covers some other methods deficiencies. However as anyone who has seen Venn diagrams knows you always have either gaps or overlaps and often both in hybrid solutions.

Therefore systems designed for minimal gaps have large overlaps which is not just inefficient it causes other problems like race conditions that can give rise to meta-stability issues…

One fun side of this is the recent “overfitted brain hypothesis”[3] which might need a pinch of salt the size of “Lot’s wife” to make palatable to some.

Essentially the originator of the hypothesis Erik Hoel says,

“The original inspiration for deep neural networks was the brain,”

And using deep neural networks to describe the overfitted brain hypothesis was a natural connection because,

“If you look at the techniques that people use in regularization of deep learning, it’s often the case that those techniques bear some striking similarities to dreams”

So call it “dreams” or “hallucinations” AI systems certainly can have them, and they can arise from chaos and distortion …

[1] Strange as it might seem to most, the design of early radio systems and the design of modern AI / neural network systems have a lot in common as you are trying to drag signals from noise[2]. So a knowledge of the receiver types and their strengths and weaknesses will help understand the “oddities” that appear to happen,

[2] Likewise Digital Signal Processing, which also in its early days went through designs similar to early radio systems.

[3] The “overfitted brain hypothesis” is based on the notion that as neural networks are based on –our very limited understanding of– the human brain things discovered in neural networks reflect back onto humans… Whether this is even remotely true is open to conjecture, however to see how it came about,

Winter April 8, 2023 5:47 AM


There are two basic starting solutions to this problem,

3. Image relevant experiences and learn from them.

It is done in Autonomous Driving:

One critical bottleneck that impedes the development and deployment of autonomous vehicles is the prohibitively high economic and time costs required to validate their safety in a naturalistic driving environment, owing to the rarity of safety-critical events. Here we report the development of an intelligent testing environment, where artificial-intelligence-based background agents are trained to validate the safety performances of autonomous vehicles in an accelerated mode, without loss of unbiasedness. From naturalistic driving data, the background agents learn what adversarial manoeuvre to execute through a dense deep-reinforcement-learning (D2RL) approach, in which Markov decision processes are edited by removing non-safety-critical states and reconnecting critical ones so that the information in the training data is densified. D2RL enables neural networks to learn from densified information with safety-critical events and achieves tasks that are intractable for traditional deep-reinforcement-learning approaches. We demonstrate the effectiveness of our approach by testing a highly automated vehicle in both highway and urban test tracks with an augmented-reality environment, combining simulated background vehicles with physical road infrastructure and a real autonomous test vehicle. Our results show that the D2RL-trained agents can accelerate the evaluation process by multiple orders of magnitude (10^3 to 10^5 times faster). In addition, D2RL will enable accelerated testing and training with other safety-critical autonomous systems.

ResearcherZero April 8, 2023 7:39 AM

Interesting legal test…

“Rather than heralding Hood’s whistleblowing role, ChatGPT falsely states that Hood himself was convicted of paying bribes to foreign officials, had pleaded guilty to bribery and corruption, and been sentenced to prison.”

Clive Robinson April 8, 2023 9:51 AM

@ Winter, ALL,

Re : Training AI.

“3. Image relevant experiences and learn from them.”

Err no. As far as I can tell from just the introduction which is all that is available, it is just a hybrid variation of the first basic method.

In essence it is the equivalent of taking a very large data set and removing redundant data from it before pushing it into the training phase.

After hunting around to see if an author pre-print of the paper was around I found,

Which describes what I assume is the system in use. Which again points out,

“a dense deep-reinforcement-learning (D2RL) approach, in which Markov decision processes are edited by removing non-safety-critical states and reconnecting critical ones so that the information in the training data is densified.”

That is the “removal” of “redundant training data” to compact, or make more dense by a significant factor, the actual training data. Thus the equivalent of increasing the “actual data content” for any given size of training data set.
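The “removal of redundant data” reading can be pictured with a toy filter over one recorded drive. This is only an illustration of that reading, not the D2RL method itself, which edits the Markov decision process during training; the `Step` record, the `safety_critical` flag, and the sample episode are all hypothetical.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    state: str
    safety_critical: bool  # assumed to have been labelled by some earlier analysis


def densify(episode: List[Step]) -> List[Step]:
    """Keep only the safety-critical steps, in order, so they become adjacent."""
    return [step for step in episode if step.safety_critical]


episode = [
    Step("cruising", False),
    Step("cruising", False),
    Step("car cuts in", True),
    Step("hard braking", True),
    Step("cruising", False),
    Step("pedestrian steps out", True),
]
dense = densify(episode)
print(len(episode), "->", len(dense))
print([step.state for step in dense])
```

The same number of training steps now carries more of the rare events, which is the sense in which the “actual data content” per step goes up.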

However the rest of the article is full of mistakes, and appears to be a “rush edit” made by cutting and pasting from a larger document, without time for satisfactory proof reading, so makes judging the actual content further rather more than difficult. Hopefully the “Nature” article is not of a similar low standard.

[1] The Journal is one of Nature’s, an organisation that is becoming more and more disreputable for various reasons. Not least its rapacious behaviour with regards authors and their work and denying Open Access, and trying to charge ridiculous disproportionately high fees for blatant control and rent seeking.

Clive Robinson April 8, 2023 10:56 AM

@ ResearcherZero, ALL,

Re : Interesting legal case.

I doubt that it will get to court as it would in effect have to happen in the US. Where the old “insufficient standing” will be tried by any defendant.

So the first hurdle to cross would be whether GPT is “a tool or an agent”. If GPT is judged a tool, then it is no more than a hammer used by a person to cause damage/harm and as such is effectively blameless, and it is the hand holding it under a directing mind that is to blame (ie the user, not the developer). The user’s defence would be that the tool was negligently designed/developed, and thus they would have to show “negligence by the developers”, which would be a tar-pit of legal argument and appeals which under the US legal system favours the deepest pocket nearly every time.

If GPT is judged to be an agent sufficient to qualify under “any person legal or natural” you still have to show a “Directing Mind” to move forward.

But the problem of a “Directing Mind” will come up and that is going to be interesting with a “Stochastic Parrot” that GPT is “alleged to be”.

Because notionally even if GPT passes the “any person…” test as a defendant it would have to be “of sound mind” as well as demonstrating either “intent to harm” or “knowing negligence to prevent harm”.

GPT does not have a “mind” sound or otherwise, and the word “stochastic” means random so it is in no way “directing” just following the fates of chance.

Which brings us back to the developers, were they either “intending harm” or “knowingly negligent” to prevent it.

This will boil down to “attractive nuisance” type reasoning. That is could a prospective user be considered sufficiently competent to use GPT safely or should there be guards in place to prevent harm by a user.

In effect it’s like having a swimming pool or trampoline in your back garden. You are expected to know it will attract the attention of those not competent to make reasonable adult judgment (ie children). Thus you should know to put in place access control and importantly not just a warning notice.

As far as I can tell what the developers have done is not even really the equivalent of a warning notice, maybe just a cautionary note at best… But can a US Judge be convinced of that by a non US person? Analysis of past judgments suggests it is unlikely.

Then of course there is the issue of “demonstrating harm”. How would you go about this? It is not like Google and links, where Google kept records of links served up and to what IP addresses etc for the purposes of “generating income”.

Interesting though is that the latest version does not do the same thing as the earlier version. Which begs the “Why?” questions that could also be seen as admissions of guilt by the developers…

modem phonemes April 8, 2023 11:59 AM

@ Clive Robinson @ Winter

Re: cars that do lunch

This article seems to address similar problems using similar methods

“There have been numerous advances in reinforcement learning, but the typically unconstrained exploration of the learning process prevents the adoption of these methods in many safety critical applications. Recent work in safe reinforcement learning uses idealized models to achieve their guarantees, but these models do not easily accommodate the stochasticity or high-dimensionality of real world systems. We investigate how prediction provides a general and intuitive framework to constraint exploration, and show how it can be used to safely learn intersection handling behaviors on an autonomous vehicle.”

Winter April 8, 2023 2:00 PM


In essence it is the equivalent of taking a very large data set and removing redundant data from it before pushing it into the training phase.

You are right, I was extrapolating the paper based on the nice augmented reality pictures.

But I think I will not be the only one who will take 1+1 and get to 2. Everybody can generate Virtual Reality situations wrt the car by taking traffic data and shifting them in time/place to generate new situations where I am too close to other cars to learn evasive and predictive actions. Generating new data to improve AI training (data enhancement) is a well known and studied technology.

Getting the statistics right is the difficulty in data enhancement, which was the crux of the paper.

The Journal is one of Nature’s an organisation that is becoming more and more disreputable for various reasons.

It is a publication for the “general public”. They are sensationalist, indeed. As part of Springer, their publisher is almost as bad as Reed Elsevier. Their reach and impact allow them to ask high prices. But PLOS too is very expensive and that is fully open access.

modem phonemes April 8, 2023 4:57 PM

@ Clive Robinson @ Winter

Re: chat, cars and ai that

The algorithms imitate brains by use of deep learning [1] –

“Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer.”

Comments by Stephen Grossberg [2] on back propagation –

“Although back propagation was promptly used to classify many different kinds of data, it was also recognized that it has some serious computational limitations. Networks that are based upon back propagation typically require large amounts of data to learn; learn slowly using large numbers of trials; do not solve the stability-plasticity dilemma; and use some nonlocal mathematical operations that are not found in the brain. Huge online databases and ultrafast computers that subsequently came onto the scene helped to compensate for some of these limitations

“When using Deep Learning to categorize a huge database, its susceptibility to catastrophic forgetting is an acknowledged problem, since memories of what has already been learned can suddenly and unexpectedly collapse. Perhaps these problems are why Hinton said in an Axios interview on September 15, 2017 (LeVine, 2017) that he is “deeply suspicious of back propagation . . . I don’t think it’s how the brain works. We clearly don’t need all the labeled data . . . My view is, throw it all away and start over”

Grossberg’s solution and suggestions –

the biologically-inspired Adaptive Resonance Theory, or ART … introduced in 1976

“… ART exists in two forms: as algorithms that are designed for use in large-scale applications to engineering and technology, and as an incrementally developing biological theory … As a biological theory, ART is now the leading cognitive and neural theory about how our brains learn to attend, recognize, and predict objects and events in a changing world that is filled with unexpected events. … ART designs may, in some form, be embodied in all future autonomous adaptive intelligent devices, whether biological or artificial …

“ART has done well in benchmark studies where it has been compared with other algorithms, and has been used in many large-scale engineering and technological applications”

  1. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015)
  2. Grossberg, S. Conscious Mind, Resonant Brain. Oxford University Press (2021)

Clive Robinson April 9, 2023 6:10 AM

@ modem phonemes, ALL,

Re : Backpropagation and Brains,

“Although back propagation was promptly used to classify many different kinds of data, it was also recognized that it has some serious computational limitations.”

Yup in the way it is often used it is very limited…

But to understand why you first have to get a handle on what Backpropagation actually is…

A neural network is in reality multiple layers of nodes. Each node in reality is a “Multiply and Add” (MAD) instruction, thus if you look through the layers you see,

Zt = W0.Z0 + W1.Z1 + W2.Z2 + W3.Z3

Often called the “sigma function”,

Zt = sigma(Wn.Zn, n)

Each Zt from each part of the network is effectively summed to a single value. That represents a point on a graph curve.

Backpropagation is simply a process to get the network W’s to match a desired curve. It uses a form of Sir Isaac Newton’s “infinitesimals” to find “the global minima” by using a simplified “cost function” as the optimising function. As only one W should be changed on each iteration and only by a minimal step size, the iteration process can be long and resource intensive, thus expensive.

The problem is to do this iterative process, the “graph curve” normally needs to be,

1, Continuous
2, Convex
3, Have a well defined broad minimum

Or, to put it simply, it looks like a cut through of a smoothed valley side.
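
The point about convexity can be illustrated with a minimal sketch, assuming plain one-dimensional gradient descent. The two curves, the step size and the starting points are hypothetical values chosen only for illustration. On the convex curve every start ends in the same place; on the curve with two valleys the result depends on where the descent starts.

```python
def gradient_descent(grad, start, learning_rate=0.01, steps=2000):
    """Repeatedly step downhill along the gradient from a starting point."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

def convex_grad(x):
    """Gradient of the convex curve f(x) = (x - 2)**2 (one minimum, at x = 2)."""
    return 2.0 * (x - 2.0)

def bumpy_grad(x):
    """Gradient of the non-convex curve f(x) = x**4 - 4*x**2 + x (two valleys)."""
    return 4.0 * x**3 - 8.0 * x + 1.0

# Convex case: both starting points settle in the same minimum.
convex_from_left = gradient_descent(convex_grad, start=-5.0)
convex_from_right = gradient_descent(convex_grad, start=5.0)

# Non-convex case: each start settles in the valley on its own side.
bumpy_from_left = gradient_descent(bumpy_grad, start=-2.0)
bumpy_from_right = gradient_descent(bumpy_grad, start=2.0)

print(convex_from_left, convex_from_right)
print(bumpy_from_left, bumpy_from_right)
```

Neither end point on the second curve is "wrong" from the algorithm's point of view, since the gradient is close to zero at both; the descent simply has no way of knowing which valley is the deeper one.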

In the real world these sorts of curves are generally “generalisations” at best and don’t exist except in well constrained and limited ranges.

The human brain mathematically generally works with straight or linear lines on a graph which again generally don’t exist in nature (exponential / power / growth curve). That is we like to think in terms of “Y per X” or at a push “Y per X, plus Y or X”.

Thus the advice on cooking meat in an oven of,

“20mins per pound plus 20mins for the oven”[1]

But the human mind treats “calculation”, which is fairly modern for humans, rather differently to visualisation, which we know from “cave art” goes back maybe half a million years. Visualisation cares little about how complex surfaces are and deals easily with many local minima and maxima. We intuitively know how to find a high point to look from and a low point to find water in, and computers do not.

[1] This “rule” is actually a linear line approximation or “curve fit” to a power curve with just a “two line fit” over a limited part of the curve. The curve we actually “fit to” is just a part of a more complex surface, which we avoid by changing the rule by independently talking about the oven temperature, and varying the time to get rare, medium, well done for each different type of meat (effectively fat and moisture content). So “game”, which has next to no fat and when fresh next to no moisture, gets the least cooking, often at the lowest or highest temperatures, whilst the likes of belly of pork and lamb, being very high in fat, get high temperatures to “burn the fat and the blood”, with beef, especially shin beef, getting the longer times. The surface however is complex, and it’s why you should not use the rule above five to six pounds of solid meat as a “joint” or as a carcass, which is why larger fowl like geese and turkey have different rules, as do whole carcasses on spits.

critical April 9, 2023 5:17 PM

Zt = W0.Z0 + W1.Z1 + W2.Z2 + W3.Z3…
Often called the “sigma function”,

BS. This is called a ‘dot product’ or ‘inner product’ of two vectors.
There are several things called ‘sigma function’ in maths, but none of them has anything to do with the above. What may be used in neural networks (though it’s a bit old style) is the sigmoid function, a non-linear function applied to the Zt as above. You may confuse ‘sigma’ with ‘sigmoid’, but nobody with any real knowledge about neural networks would make that mistake.
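
As a minimal sketch of the distinction drawn here, assuming a single node with four inputs: the weighted sum is a dot product, and the sigmoid is a separate non-linear function applied to its result. The function names and the example values are hypothetical, chosen only for illustration.

```python
import math

def weighted_sum(weights, inputs):
    """Dot (inner) product of a weight vector and an input vector."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w * z for w, z in zip(weights, inputs))

def sigmoid(x):
    """Logistic sigmoid: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical example values.
weights = [0.5, -1.0, 2.0, 0.0]
inputs = [1.0, 2.0, 0.5, 3.0]

z_t = weighted_sum(weights, inputs)  # Zt = W0.Z0 + W1.Z1 + W2.Z2 + W3.Z3
activation = sigmoid(z_t)            # non-linearity applied to Zt

print(z_t, activation)
```

The first step is linear in the inputs; only the second step introduces the non-linearity.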

As only one W should be changed on each iteration

BS. Usually all weights are adjusted in each iteration of back propagation.
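
A minimal sketch of what "all weights are adjusted" looks like, assuming a single linear node trained with squared error. This is one gradient descent step rather than full backpropagation through several layers, and the names, values and learning rate are hypothetical.

```python
def predict(weights, inputs):
    """Output of a single linear node: the dot product of weights and inputs."""
    return sum(w * z for w, z in zip(weights, inputs))

def loss(weights, inputs, target):
    """Squared-error loss 0.5 * (prediction - target)**2."""
    return 0.5 * (predict(weights, inputs) - target) ** 2

def gradient_step(weights, inputs, target, learning_rate=0.1):
    """One gradient descent step that updates every weight at once.

    The gradient of the loss with respect to weight i is
    (prediction - target) * inputs[i], so each weight gets its own update.
    """
    error = predict(weights, inputs) - target
    return [w - learning_rate * error * z for w, z in zip(weights, inputs)]

# Hypothetical example values.
weights = [0.0, 0.0, 0.0]
inputs = [1.0, 2.0, 3.0]
target = 1.0

loss_before = loss(weights, inputs, target)
new_weights = gradient_step(weights, inputs, target)
loss_after = loss(new_weights, inputs, target)

print(new_weights, loss_before, loss_after)
```

Every entry of `new_weights` differs from the starting value after the single step, and the loss goes down.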

The human brain mathematically generally works with straight or linear lines on a graph which again generally don’t exist in nature

Pure BS, sounds like it actually means something but it doesn’t.

vas pup April 9, 2023 7:03 PM

Why humans will never understand AI

“The European Union is so concerned by the potentially “unacceptable risks” and even “dangerous” applications that it is currently advancing a new AI Act intended to set a global standard for “the development of secure, trustworthy and ethical artificial

Those new laws will be based on a need for explainability, demanding that “for high-risk AI systems, the requirements of high-quality data, documentation and traceability, transparency, human oversight, accuracy and robustness, are strictly necessary to mitigate the risks to fundamental rights and safety posed by AI”. This is not just about things like self-driving cars (although systems that ensure safety fall into the EU’s category of high-risk AI), it is also a worry that systems will emerge in the future that will have implications for human rights.

Scientists like Mead and Kohonen wanted to create a system that could genuinely adapt to the world in which it found itself. It would respond to its conditions. Mead was clear that the value in neural networks was that they could facilitate this type of adaptation. At the time, and reflecting on this ambition, Mead added that producing adaptation “is the whole game”. This adaptation is needed, he thought, “because of the nature of the real world”, which he concluded is “too variable to do anything

For more details go to the link.

Clive Robinson April 9, 2023 7:14 PM

@ critical

Ah you’ve pooped up again…

I suggest you take care with your assumptions about what has been written by others.

cmeier April 9, 2023 7:55 PM


Artisans didn’t die out. As one cartoon put it, artisans became a way to inefficiently make luxury goods for the wealthy.

Clive Robinson April 9, 2023 9:39 PM

@ cmeier,

Re : Artisans morphing

+1 😉

Oh and as someone I once worked with pointed out,

“The crude output of an artisan is referred to as “artisanal”… Though why they need to concatenate the three words is beyond me.”

But on a slightly different view…

As I pointed out here some years ago now, the word “manufacture” originally meant “hand made”[1]; now it means almost the opposite.

[1] It comes via French from the medieval Latin “manufactura” meaning “a making by hand”. From “manus” for “hand” and “factus” for “I do make”, so “by hand I do make”.

ResearcherZero April 10, 2023 2:29 AM

We do have courts in Australia, and though it is a backwater, if you want to do business, occasionally you may have to answer to the law. The law is both domestic and international, with precedent that may be set in each.

The legal space is set by precedent, and as the legal framework continues to evolve, so do other standards.

“NIST has developed a framework to better manage risks to individuals, organizations, and society associated with artificial intelligence (AI).”

critical April 10, 2023 4:16 AM

From “manus” for “hand” and “factus” for “I do make”, so “by hand I do make”.

‘Factus’ certainly does not mean ‘I do make’.
It is the past participle, nominative, male, of the Latin verb ‘facere’ which means ‘to make’. ‘I make’ would be ‘facio’.

The English ‘manufacture’ probably derives from the same word in French, where it is not a verb but a noun, and refers to ‘the act or process of making’, not necessarily by hand.

A factory or workshop in Latin would be ‘officina’ or ‘fabrica’, certainly not ‘manufactura’.

Clive Robinson April 10, 2023 8:17 AM

@ critical,

“The English ‘manufacture’ probably derives from the same word in French”

Which is a much wordier way of saying

“comes via French”

As I already said.

As for your “Oh so correctimism”, you are forgetting the process of “lost in translation” you get when words get absorbed from one language to another, especially when done for status reasons. The English language is full of them; have a look at the history of place names and how “Elephant and Castle” in London, to the south and east of Southwark Stn next to Waterloo, got its name, or how what is pronounced as “Beaver Castle” up in Leicestershire got that way. Then there is the meat on the table v the creature in the farm yard, and words like gauche and adroit and more recently sabotage, with other handed words like sinister and dexterous, then other routes like awkward. Speaking of which, your past appearances have usually been followed by a load of cack in what appears to be a narked person in a fit of pique. I guess others who have observed this will be again asking “Correlation or Causation”.

I guess we will have to wait and see just how far your personal pettiness and your desperate internet behaviours go.

ResearcherZero April 13, 2023 1:50 AM

@Clive Robinson

Yep, procedural context is often out of place in English. Time and place are not ordered. In some other languages you actually say what you mean, and in the correct order. English is crazy, but it is not certifiably crazy.

“A.I. mainly identified, sorted and classified words in documents. The technology’s tools served more as aides than as replacements — and the same could be true this time.”

There are links to studies in the article.

Winter April 13, 2023 3:33 AM


Time and place are not ordered. In some other languages you actually say what you mean, and in the correct order.

Every language is basically the same. Every language is equally complex, but the complexity is at different places. Every spoken or signed language can be mastered by a 6 year old (4 yo, basics). Spoken English is no different in expressive power than any other spoken language (writing systems are different).

AI LLMs have shown that beautifully. They can do word translation without parallel (=Rosetta Stone) texts [1]. That is, LLMs in different languages can deduce word translations from the structure of the word “connections” in the models. I heard the same works with LLMs and Speech (hearsay, no articles found).

LLMs can do the same with pictures [2].

[1] ‘

[2] ‘

Clive Robinson April 13, 2023 7:46 AM

@ ResearcherZero, Winter, ALL,

“English is crazy, but it is not certifiably crazy.”

You sure on that?

Auz has stories of the madness of “£5 Poms” and later arrivals, you might also want to look up a little ditty by Noel Coward,

“Mad dogs and Englishmen”

Arguably the English and their language are the cause of the worst of “WASP” culture, as we “exported greed and avarice” around the globe and that gave rise to the worst of libertarianism and neo-con behaviours.

As the old saying has it,

“If the cap fits, wear it”

But in my case I feel it is “too tight” and gives me headaches almost daily.
